Google Responsible for Own Server Clog

For over a year now, Google has had a severe problem with duplicate caching. It still does not fully recognize that www.domainname.com and domainname.com are the same site, and it will readily index both.

Google will also index both the /index.html and the / version of the same page. And if a web.domainname.com subdomain is available, it will cache that as well.
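To make the duplicate forms concrete, here is a minimal Python sketch of how those URL variants could be collapsed to a single key. It is only an illustration, not how Google handles it: the names `canonicalize`, `group_duplicates`, and `DEFAULT_DOCS` are made up for this example, and it assumes the www and bare hosts really do serve the same content and that index.html is the server's default document.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of default document names; depends on server config.
DEFAULT_DOCS = ("index.html", "index.htm")


def canonicalize(url):
    """Collapse common duplicate URL forms into one canonical URL."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Assumption: the www host and the bare host serve identical content.
    if host.startswith("www."):
        host = host[4:]
    # Keep an explicit port if one was given.
    netloc = host if parts.port is None else "%s:%d" % (host, parts.port)
    # An empty path and "/" name the same page.
    path = parts.path or "/"
    # Treat /index.html as the same page as its directory.
    for doc in DEFAULT_DOCS:
        if path.endswith("/" + doc):
            path = path[: -len(doc)]
            break
    return urlunsplit((parts.scheme.lower(), netloc, path, parts.query, ""))


def group_duplicates(urls):
    """Map each canonical URL to the list of raw URLs that collapse to it."""
    groups = {}
    for url in urls:
        groups.setdefault(canonicalize(url), []).append(url)
    return groups
```

Run over a list of indexed URLs, `group_duplicates` puts every variant of a page under one key, so any key with more than one entry is a duplicate set. The sketch leaves other subdomains and the URL scheme alone, since whether those are true duplicates depends on the site.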

I have personally cleaned up sites with three to six cached copies of each page in Google, adding up to 100,000+ duplicates. If Google fixed this one problem, it would free up a huge amount of space.

3 Comments

  1. sam
    May 11, 2006

    I am new to SEO, and I really liked the post where you discussed “a modest proposal”. In fact, I based the first post on my WordPress blog on that info. Short but straight, dude! All your comments are very informative to me as a neophyte in SEO. Thanks, I’ll surely continue to read all of your posts.

    sam

  2. Tony Spencer
    May 13, 2006

    Not to mention duplicates from crawled https. I think there needs to be an addition to the robots.txt protocol to help with that one.

  3. Sparks Flying
    May 19, 2006

    Like Sam, I'm new to SEO, but I can't understand how Google can get away with making it so difficult for the normal Joe to get in while all these “spam” sites get indexed without problems…

    I suppose... if you know how!

    Great site, thanks!

    Sparks Flying