Search your internet MOSS without a root Site Collection

Since Service Pack 2, it is no longer possible to crawl a MOSS 2007 Web Application that has no root Site Collection. And if you use IIS 7.0 HTTP Redirection to redirect the root to a different URL, MOSS 2007 can't see that there actually is a root Site Collection.

If that sentence didn't make any sense at all (which it probably didn't), the following scenario should make it all clear.

Okay, here is the situation:

So a customer of mine has set up a brand new MOSS 2007 (or at least: I did the setup) that will be used as a public internet site. Basically it is set up with an authoring farm and a publishing farm (separated by a firewall). Content deployment is used to send changed content from the authoring farm to the publishing farm. But that's not the issue here.

The URLs will look like this:

A Publishing Portal on http://www.trycatch.be/trycatch
A Publishing Portal on http://www.trycatch.be/catchthis

If you surf to http://www.trycatch.be, you should be redirected to http://www.trycatch.be/trycatch.

So the root http://www.trycatch.be will not be used.

So this is the problem:

If you use IIS 7.0 to redirect http://www.trycatch.be to http://www.trycatch.be/trycatch, MOSS 2007 will not notice the root Site Collection.
So even if you create a root Site Collection (let's say a "Blank Site") just because crawling requires one, crawling will still not work...

It's the IIS 7.0 HTTP Redirection that messes things up.
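
If you want to confirm that the root URL really answers with a redirect, a quick check from any machine with Python can help. This is only a rough sketch: it shows what a plain HTTP client sees at the root, not how the MOSS crawler works internally, and the function names redirect_target and probe_root are made up for this example.

```python
import http.client
from urllib.parse import urlsplit

# Status codes that mean "go look somewhere else".
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def redirect_target(status, location):
    """Return the redirect destination for a redirect response, else None."""
    if status in REDIRECT_STATUSES and location:
        return location
    return None


def probe_root(url, timeout=10):
    """Request `url` once, without following redirects, and report the target.

    http.client never follows redirects on its own, so the first
    response is exactly what the server sends for the root.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("GET", parts.path or "/")
        response = conn.getresponse()
        return redirect_target(response.status, response.getheader("Location"))
    finally:
        conn.close()
```

Calling probe_root("http://www.trycatch.be") should give back the redirect destination when HTTP Redirection is active on that Web Application, and None when the root is served directly.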

So, here is your solution:


1. Enable HTTP Redirection on your MOSS 2007 Web Application
2. Extend your Web Application to a second Web Application (so you can access the same site through a different URL)
3. Change the Content Source in your Shared Services Provider to point to the extended Web Application (instead of the original one)
4. Crawl your MOSS 2007

This time crawling will work (there is no HTTP Redirection on your extended Web Application).
Thanks to Alternate Access Mappings, when you search in your original (published) website, you will get the correct URLs back in your search results.
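
To picture what Alternate Access Mappings do for the search results, here is a small illustration. It is not SharePoint code, just a sketch of the idea: the crawler indexes pages under the URL of the extended Web Application, and results are handed back under the public URL. The host http://crawl.trycatch.be is an assumption for the extended Web Application (the real one depends on how you extend it), and to_public_url is a made-up name.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping: crawl URL (extended Web Application) -> public URL.
MAPPINGS = {"http://crawl.trycatch.be": "http://www.trycatch.be"}


def to_public_url(crawled_url, mappings=MAPPINGS):
    """Swap the scheme and host of a crawled URL for its public counterpart.

    URLs whose host has no mapping are returned unchanged.
    """
    parts = urlsplit(crawled_url)
    base = f"{parts.scheme}://{parts.netloc}".lower()
    public = mappings.get(base)
    if public is None:
        return crawled_url
    pub = urlsplit(public)
    # Keep path, query and fragment; only scheme and host change.
    return urlunsplit((pub.scheme, pub.netloc, parts.path, parts.query, parts.fragment))


print(to_public_url("http://crawl.trycatch.be/trycatch/Pages/default.aspx"))
```

Only the scheme and host are swapped, so a page crawled under the extended Web Application shows up in the results with the same path on the public site.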

But keep in mind: to keep things simple, try to use a root Site Collection ;-)

Party On
Tom

Published Monday, August 10, 2009 6:00 PM by Tom Vandaele