Google Indexes Without Links or Sitemaps
I have long been convinced that Google uses its many tools to find and queue URLs for indexing, and I believe I have now found indisputable proof. After beginning many of our experiments using our sitemaps technique (using Google Sitemaps to get pages indexed in order to control for the unpredictable weight of individual links), we noticed a trend.
When using Webmaster Tools to get subdomains indexed, the domains themselves kept getting indexed as well. We painstakingly made sure that no URLs were ever linked to, much less the domains themselves, since stray links can and do interfere with the quality of our results.
Nevertheless, Google still finds a way. While I think there is no real problem with this practice, it does seem clear that Google will attempt to spider all parts of a URL. That is, if you were to get http://subdomain.domain.com indexed, Google, without any other prompting or links, would attempt to index http://domain.com as well.
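To make the pattern concrete, here is a minimal sketch of how a parent host could be derived from an indexed subdomain URL. It is only an illustration of the behavior we observed, not a description of how Google actually does it. The function name `candidate_urls` is hypothetical, and the rule of stopping at two host labels is a naive assumption that would be wrong for suffixes such as .co.uk.

```python
from urllib.parse import urlsplit


def candidate_urls(url):
    """Return parent-host URLs derived from one URL (hypothetical helper).

    Assumption: a registrable domain has exactly two labels, which is a
    naive stand-in for a real public-suffix lookup.
    """
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    candidates = []
    # Drop the leftmost label one step at a time:
    # a.b.example.com -> b.example.com -> example.com
    for i in range(1, len(labels) - 1):
        host = ".".join(labels[i:])
        candidates.append(f"{parts.scheme}://{host}")
    return candidates
```

Calling `candidate_urls("http://subdomain.domain.com")` yields a list containing only `http://domain.com`, which matches the host that ended up indexed in our experiments.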
Strangely enough, Google found both the www and non-www versions of the 2008exp1.net domain. From my server logs, I can say unequivocally that the only visitors to this version of the domain were myself and Googlebot. I do, however, surf with Google Toolbar open and running at most times. I do not have clear evidence at this point to determine whether Google queues both the www and non-www versions on its own, or whether the toolbar prompted it to do so after I had visited the site.
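For anyone who wants to run the same kind of log check, the following is a rough sketch of grouping visitors by requested host. It assumes each log line starts with the requested host and the client IP and ends with the quoted user agent (similar to a combined format with the virtual host prepended); your server's layout may differ. The names `LINE_RE`, `visitors_by_host`, and `SAMPLE` are hypothetical, and the sample lines are made up for illustration, not taken from my logs.

```python
import re
from collections import defaultdict

# Assumed line layout (adjust to your own server's log format):
#   host client_ip - - [timestamp] "request" status bytes "referer" "user agent"
LINE_RE = re.compile(
    r'^(?P<host>\S+) (?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"$'
)


def visitors_by_host(lines):
    """Map each requested host to the set of (client IP, user agent) pairs."""
    seen = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # lines that do not fit the assumed layout are skipped
            seen[match["host"]].add((match["ip"], match["agent"]))
    return dict(seen)


# Made-up sample lines for illustration only; these are not real log data.
SAMPLE = [
    'example.com 192.0.2.10 - - [01/Jan/2008:10:00:00 +0000] "GET / HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    'example.com 198.51.100.7 - - [01/Jan/2008:10:05:00 +0000] "GET / HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/2.0"',
    'www.example.com 192.0.2.10 - - [01/Jan/2008:10:09:00 +0000] "GET / HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
]
```

Running `visitors_by_host(SAMPLE)` gives one entry per host, so it is easy to see at a glance whether anything other than your own browser and Googlebot requested the www or non-www version. Note that a user agent string alone does not prove a request came from Google, since it can be spoofed.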