Crawler error 2436 on a host-header-based WebApp (with the loopback check disabled)

Tags: Projects, MOSS/WSS

This one took us a couple of hours (stressful hours) to spot. We rebooted a server and the crawler stopped crawling ;)

We have two webapps published on the server. The first one answers to the machine name, and the second (the faulty one) answers to a DNS alias, using a host header like “app.company.int”. Both applications are internal to the company (they make up the authoring farm for the two portals).

After looking into the problem, we saw that only the host-header webapp could not be crawled.

We verified that the loopback check was disabled, tried crawling through another WFE machine, and tried changing the AAM settings, but the error was still there…

Then we remembered that, the day before, someone had changed the proxy settings of all the machines (at the domain level) by running a proxycfg command at startup. Once the machine was rebooted, the change took effect.

So, what you need to do is:

  1. On the crawler machine, use “Run as” to start Internet Explorer as the content access account (the account the crawler uses to gather content) and check that it is not using a proxy server.
  2. If that does not fix it, run proxycfg with the -d parameter, which sets WinHTTP to direct access (no proxy).
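If clicking through the Internet Explorer dialogs under another account is awkward, the check in step 1 can be roughly scripted. This is only a sketch, and it assumes Python is available on the crawler machine and is started as the content access account. On Windows, `urllib.request.getproxies()` and `urllib.request.proxy_bypass()` read the per-user Internet settings (the ones Internet Explorer uses), not the machine-wide WinHTTP settings that proxycfg writes, so it only covers step 1. The function name `proxy_report` and the host name `servername` are made up for illustration.

```python
import urllib.request


def proxy_report(hosts,
                 getproxies=urllib.request.getproxies,
                 proxy_bypass=urllib.request.proxy_bypass):
    """Return {host: "direct" or "via <proxy>"} for the current user.

    getproxies / proxy_bypass default to the standard-library functions;
    they are parameters only so the logic can be exercised with fake settings.
    """
    proxies = getproxies()            # e.g. {"http": "http://proxy:8080"}
    http_proxy = proxies.get("http")  # None when no HTTP proxy is configured
    report = {}
    for host in hosts:
        if not http_proxy or proxy_bypass(host):
            report[host] = "direct"
        else:
            report[host] = "via " + http_proxy
    return report
```

Calling `proxy_report(["servername", "app.company.int"])` as the content access account should report "direct" for both names; if the host-header name shows up as going through a proxy, that account has the same problem we had.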

 

Apparently, requests to the host-header webapp were going through the proxy, while requests to the webapp addressed by its NetBIOS name were not. The host-header webapp was unreachable because that machine is not supposed to be accessed through the proxy.
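One plausible explanation for the difference is the proxy bypass list. We did not record which bypass list the proxycfg command had set, so take this as an assumption: if the list contained only the usual `<local>` entry, names without a dot (such as a NetBIOS machine name) skip the proxy, while a dotted alias like “app.company.int” does not. The sketch below models that rule as I understand it; `goes_direct`, `servername` and the `*.company.int` pattern are illustrative names, not something taken from our configuration.

```python
from fnmatch import fnmatch


def goes_direct(host, bypass_list):
    """Return True if `host` would skip the proxy under `bypass_list`.

    `bypass_list` is a semicolon-separated string such as
    "<local>;*.company.int" (an assumed, simplified model of the format).
    """
    host = host.lower()
    for entry in bypass_list.split(";"):
        entry = entry.strip().lower()
        if not entry:
            continue
        if entry == "<local>":
            # "<local>" covers short names only, i.e. names without a dot.
            if "." not in host:
                return True
        elif fnmatch(host, entry):
            # Any other entry is treated as a wildcard pattern.
            return True
    return False


print(goes_direct("servername", "<local>"))                       # True
print(goes_direct("app.company.int", "<local>"))                  # False
print(goes_direct("app.company.int", "<local>;*.company.int"))    # True
```

Under this model, adding the alias (or its domain) to the bypass list would be an alternative to switching the whole machine to direct access.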

I think this error (a wrong proxy configuration) is why so many people say that you can’t crawl a host-header-based web application. That is wrong: you can, but the crawler needs to reach the server the proper way.

Hope it helps you!

 
Published by Enrique Blanco  30-Jan-10