
postscript - follow, noindex

As an experiment, I did something different when I launched another site (Philippines Living). In this case, I set the default robots meta tag to "follow,noindex". I expected that nothing would show up in the search results: every page would be either robots.txt'd out, or left out of the index because of noindex.

I was wrong. Here is what the search results looked like a few weeks after launch, while the site was still set to "follow,noindex" (right before I switched it to the normal setting of "follow,index"):

[Screenshot: Google search results for the site while set to "noindex,follow"]

So, what happened?

Google crawled all the pages not disallowed by robots.txt. It saw the links to the robots.txt-disallowed URLs, but didn't officially crawl them. Google then did not index any of the crawled pages, because their meta tags said "noindex". But it did index the links to the robots.txt-disallowed URLs! Because Googlebot never fetches those pages, it never sees a "noindex" tag on them, and of course they have no titles or descriptions... but, yes, the only pages Google indexed were the robots.txt-barred pages. Pretty amusing.

I'll make another entry when I see that Google has figured out that the pages are now indexable and added them to its index. We'll see how long past January 4th that takes.
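To make the behavior described above concrete, here is a minimal sketch of a toy crawler in Python. It is not how Google works internally; it only mimics the two rules at play: robots.txt decides whether a page may be fetched, and the robots meta tag can only be read on pages that were fetched. The site layout, the paths, the `examplebot` user agent and the `simulate_index` function are all hypothetical names made up for the illustration. Only `urllib.robotparser` is a real standard-library module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical site: each page has a robots meta value and outgoing links.
PAGES = {
    "/": {"meta": "follow,noindex", "links": ["/about", "/private/report"]},
    "/about": {"meta": "follow,noindex", "links": ["/"]},
    "/private/report": {"meta": "follow,noindex", "links": []},
}


def simulate_index(pages, robots_txt, start="/", agent="examplebot"):
    """Return {url: kind} for everything the toy crawler would index."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    indexed = {}
    seen = set()
    queue = [start]
    while queue:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)

        if not parser.can_fetch(agent, url):
            # The page is never fetched, so its meta tag is never read.
            # The URL is still known from the link, so it is listed bare,
            # without a title or description.
            indexed[url] = "url-only"
            continue

        directives = {d.strip() for d in pages[url]["meta"].split(",")}
        if "noindex" not in directives:
            indexed[url] = "full"
        if "nofollow" not in directives:
            queue.extend(pages[url]["links"])
    return indexed


print(simulate_index(PAGES, ROBOTS_TXT))
```

With every page set to "follow,noindex", the only entry left in the toy index is the robots.txt-disallowed URL, which matches what showed up in the search results. Changing the meta values to "follow,index" would add the crawlable pages as full entries while the disallowed URL stays a bare link.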
