Everybody knows that MOSS search is security trimmed, but in certain circumstances that is not enough. For example:
Imagine a site that uses a custom master page with some links or text in the header, the footer and maybe the side menu. When you search for a word that appears in the master page, every page in the site (at least every page that uses that master page) is returned, and that can sometimes be a problem. For example, search for “reserved” on www.ferrari.com, or try some of the sites listed at http://www.wssdemo.com/Pages/topwebsites.aspx, and you will see what I mean ;)
The solution we found was to hide those parts of the page from the crawler.
How to do that? I suggest two options:
- If the content is rendered by a control, you can decide in the control's own logic whether to output it.
- If the content is static, you can place it inside a custom control that shows the content or not depending on whether the request comes from the search service. Inherit the control from the System.Web.UI.WebControls.Panel class and override its Render method.
How can you know if the visitor is the search service?
Some people might think that identifying the user account is the best way. I don’t like that method: on production servers the search service might run under its own account, but on most developer machines all services run under the same account, so we need to find another way.
We checked the User Agent header of the request. The MOSS search service’s crawler identifies itself with something like this:
“Mozilla/4.0 (compatible; MSIE 4.01; Windows NT; MS Search 5.0 Robot Crawler)”
So, by checking the user agent, you can tell whether the visitor is the search service and then trim the content shown. By the way, this usually does not cause problems with the cache, since the user agent is taken into account by default and the crawler is not an anonymous user :)
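The check itself is just a substring test on the user agent, so here is a minimal, language-neutral sketch in Python rather than a real control. The names is_search_crawler and render_region are made up for illustration, and matching on the "MS Search" token is an assumption based on the sample string above; your crawler may be configured with a different user agent. In an ASP.NET control the value would come from Request.UserAgent instead of a function argument.

```python
from typing import Optional

# Token taken from the sample user agent quoted above. This is an assumption:
# the crawler's user agent can be configured differently on your farm.
CRAWLER_TOKEN = "ms search"


def is_search_crawler(user_agent: Optional[str]) -> bool:
    """Return True when the User Agent header looks like the MOSS crawler."""
    if not user_agent:
        # Requests without a user agent are treated as normal visitors.
        return False
    return CRAWLER_TOKEN in user_agent.lower()


def render_region(html: str, user_agent: Optional[str]) -> str:
    """Return the markup for normal visitors and nothing for the crawler."""
    return "" if is_search_crawler(user_agent) else html
```

A region wrapped this way still reaches every normal visitor; only requests whose user agent contains the crawler token get an empty string, so the header or footer text never makes it into the index.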
After setting up your master page and controls with these changes, you must run a full crawl. From then on, only the intended portions of the pages will be indexed.
Here is more info on MOSS search user agents.