
Ticket #125 (closed enhancement: fixed)

Opened 9 years ago

Last modified 9 years ago

Excluding one or more pages from being indexed

Reported by: steve@… Owned by: andyturl@…
Priority: minor Component: EasySearch
Keywords: Cc:


Is it possible to exclude a given page from being indexed? For example, by adding a bool property called "NoIndex" and setting it to true, so that the page won't be indexed by EasySearch?

If this could be a dynamic property, we could actually keep entire page branches from being indexed.

The thing is, sometimes, the customer has old or outdated content that they do not want in the search result.

Change History

comment:1 Changed 9 years ago by andyturl@…

  • Type changed from task to enhancement

Yes, it should be possible. I will look at adding it to the next release.

comment:2 Changed 9 years ago by andyturl@…

  • Status changed from new to closed
  • Resolution set to fixed

A page or a branch of pages can now be excluded from indexing using a dynamic selected/not selected property named 'easysearch_noindex'.

Entire language branches or a specific language branch can be excluded depending on whether the property is unique for each language or not.

Bear in mind that reindexing will need to occur manually if this property is modified on a page in order for EasySearch to pick it up, since there's no event fired to trigger page reindexing.

comment:3 Changed 9 years ago by mari@…

  • Status changed from closed to reopened
  • Resolution fixed deleted

The use of the selected/not selected property type as a dynamic property can be somewhat traumatic (ref blog posts on world).

Let's say you want to remove a page from the EasySearch index, but this page has more than 50 child pages which you do want indexed. See the problem?

I suggest the following:

Add a selected/not selected page property named 'easysearch_noindex' which will take precedence over the dynamic one.

comment:4 Changed 9 years ago by steve@…

As far as I can see, the code in changeset:1049 should work equally well on both a dynamic property and a page property so I don't see why the ticket has been reopened.

comment:5 Changed 9 years ago by andyturl@…

Actually it's not really working how I expected it to. I was under the impression that the dynamic property could be set per page or to apply to the children of that page as well, however it is always inherited further down and somehow I missed this when I tested it.

I am looking at how to accommodate both behaviours. Like Mari says, I would prefer to be able to exclude a particular page from indexing without also excluding all of the pages that happen to be beneath it.

I had wanted to avoid needing to add the NoIndex property to a page type just to exclude it from indexing, and I don't think this would override the dynamic property, as selected/not selected values are either true or null. Also, if the dynamic property is set to true further back up the branch, then I believe its value will be retrieved instead if the static property on the page is not set (as it's null).

I will look more into this, but at the moment it seems to be only useful for not indexing an entire page branch.

comment:6 Changed 9 years ago by steve@…

I haven't tested this, but as I read the code in changeset:1049, you get the property from the Property collection, and it should be there if it has been created as a property on the pagetype, even though the value is null ("not selected").

My theory is that, because of the way you check the value in code, it will actually index a page if the pagetype has the easysearch_noindex property but no value set. That renders the dynamic property useless for all pages of this pagetype, which of course defeats the whole point of overriding the dynamic property.

I agree with Mari on this, avoid using PropertyBoolean for this, due to the way EPiServer treats the false value, and to avoid confusion down the road (we're there already).

Would an Int do better?

  • 1 = noindex
  • 0 = index
  • null = inherit

It could be made into a more fancy looking custom property later.
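The three-value scheme above can be sketched in a language-neutral way. The Python below is only an illustration and is not EasySearch or EPiServer code: the function name `should_index` and the list-of-values representation are assumptions made for the example. It looks at the page first, then at each ancestor in turn, and uses the first value that is not null.

```python
from typing import Optional, Sequence

NOINDEX = 1  # 1 = noindex
INDEX = 0    # 0 = index


def should_index(values: Sequence[Optional[int]]) -> bool:
    """Decide whether a page should be indexed.

    `values` holds the property value of the page itself first,
    followed by its ancestors up to the root. None means "inherit".
    """
    for value in values:
        if value is not None:
            return value != NOINDEX
    # Nothing set anywhere on the branch: index by default (assumption).
    return True
```

Unlike a boolean that is either true or null, an explicit 0 on a page can here override a 1 set further up the branch.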

By the way, it seems you're duplicating code in changeset:1049. Why not call LocalisedPageShouldBeIndexed from PagesShouldBeIndexed?

comment:7 Changed 9 years ago by mari@…

I've discovered an issue:

  • If the page already is indexed, it will not be removed after setting property 'easysearch_noindex' and re-indexing the page tree.

I had to remove the complete index, then re-index in order for it to work.

If it's not possible to remove it when indexing, could we add a "Remove from index" button on the Edit plugin?
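One way an indexing pass could deal with this, purely as an illustration: when a page is found to be excluded, delete any existing entry for it instead of just skipping it. The sketch below is Python using a plain dictionary as a stand-in for the index; `reindex`, `is_excluded` and the dictionary layout are all hypothetical and say nothing about how EasySearch stores its index.

```python
from typing import Callable, Dict, Iterable, Tuple


def reindex(index: Dict[int, str],
            pages: Iterable[Tuple[int, str]],
            is_excluded: Callable[[int], bool]) -> None:
    """Refresh `index` in place from (page_id, text) pairs."""
    for page_id, text in pages:
        if is_excluded(page_id):
            # Remove a stale entry left over from an earlier run, if any.
            index.pop(page_id, None)
        else:
            index[page_id] = text
```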

comment:8 Changed 9 years ago by andyturl@…

Yes, this is a known issue with it. It's not hooked into the dynamic property events, so EasySearch doesn't know to reindex the page (or the children of that page, which I haven't coded yet). I couldn't actually find out what events get fired when a dynamic property is changed, and the page it applies to doesn't get a page published event either. Do you know what events, if any, get fired on a Dynamic Property change?

Also, I will be changing the way this property works in future. It will instead be two properties:

easysearch_indexbranch - dynamic, works the same way as the currently implemented property, except that it uses an int where null = index, 0 = exclude, 1 = index.

easysearch_indexpage - a static property on a page that overrides the dynamic indexbranch property for excluding or including individual pages.
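How the two properties might combine can be sketched as follows. This is an illustrative Python sketch, not the code that was checked in: the function name `page_is_indexed` is hypothetical, and it assumes that easysearch_indexpage uses the same 0/1/null values as easysearch_indexbranch, with null on the page meaning "fall back to the branch value".

```python
from typing import Optional

EXCLUDE = 0  # 0 = exclude; 1 and null both mean index


def page_is_indexed(index_page: Optional[int],
                    index_branch: Optional[int]) -> bool:
    """Combine the static page value with the inherited branch value."""
    # The static page-level value wins whenever it is set.
    if index_page is not None:
        return index_page != EXCLUDE
    # Otherwise use the branch value; null there means index.
    return index_branch != EXCLUDE
```

Under these assumptions, the case of a parent page that should be hidden while its child pages stay searchable is covered by setting the page property to 0 on the parent and leaving the branch property unset.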

comment:9 Changed 9 years ago by steve@…

There are no events for Dynamic Properties, unfortunately. There might be some log4net messages being sent, which one could have listened to (a quick check with Reflector on the EPiServer.DataAbstraction.DynamicProperty class would reveal that.)

Another option is to create a custom property and handle the reindexing when the dynamic property value is changed. Starting a re-index job in the postback is not a good option though. Perhaps some sort of indexing queue could be used (a lot of work though)? Documentation that tells the user to re-index would suffice for now (perhaps as part of the custom property UI).
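A minimal sketch of the indexing-queue idea, in Python and with entirely hypothetical names (`ReindexQueue`, `enqueue`, `drain`): the property change only records the page id, and a separate background job later drains the queue and does the actual reindexing, so nothing heavy runs in the postback.

```python
from collections import deque
from typing import Callable, Deque, Set


class ReindexQueue:
    """FIFO queue of page ids awaiting reindexing, without duplicates."""

    def __init__(self) -> None:
        self._pending: Deque[int] = deque()
        self._queued: Set[int] = set()

    def enqueue(self, page_id: int) -> None:
        # Cheap enough to call when the property value is saved.
        if page_id not in self._queued:
            self._queued.add(page_id)
            self._pending.append(page_id)

    def drain(self, reindex_page: Callable[[int], None]) -> int:
        """Reindex every queued page; return how many were processed."""
        count = 0
        while self._pending:
            page_id = self._pending.popleft()
            self._queued.discard(page_id)
            reindex_page(page_id)
            count += 1
        return count
```

This in-memory version would lose its contents on an application restart; a real queue would probably need to be persisted, which is part of why it is more work.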

comment:10 Changed 9 years ago by andyturl@…

  • Status changed from reopened to closed
  • Resolution set to fixed

I have checked the new version of this into SVN now that uses 'easysearch_indexpage' and 'easysearch_indexbranch' int properties.

There is still the issue of pages that are already in the index not being removed when the property is set to exclude the page, so reindexing the site is the only other option.

I will look into the custom property firing its own event when its value is changed. I think that is the only option for Dynamic Properties as the only event I found being fired was OnBeforeSavingProperty() deep down in DynamicPagesDB. For now I will just document the current behaviour for these properties.
