W3C HTML Validation and Search Engine Optimization

It has been a while since I have posted some of Virante’s research to the blog, and a good friend and former COO, Bob Misita, called me out on it. I figured I would release some of the data from a recent study we did on the relationship between W3C HTML validation and web page rankings. Because validation is quite complex, we chose to take a macro-look rather than our traditional methodology of getting individual sites into the SERPs via sitemaps and then tweaking individual independent variables. In particular, we looked at the W3C validation of the top results for approximately 100 separate keywords in Google, Yahoo, MSN Live and Ask. For each keyword, we extracted the top 10 ranking sites, measured the number of errors via a W3C validation check, and used multiple statistical...
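For readers who want to try a similar macro-look on their own data, here is a minimal sketch of one way to test whether ranking position and validation error counts move together. It uses a Spearman rank correlation, which is my choice for illustration and not necessarily one of the statistics used in the study. The function names and the sample numbers are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # one series is constant, so no correlation is defined
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Made-up illustration only: SERP positions 1-10 for one keyword and
    # the number of validation errors found on each page.
    positions = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    errors = [12, 0, 45, 7, 7, 130, 3, 22, 0, 61]
    print(spearman(positions, errors))
```

A value near 0 would suggest no monotonic relationship between position and error count for that keyword; repeating this per keyword and per engine gives a rough macro view.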

Powerset Will Never Be a Google Killer

Many of you have seen the recent stories about Powerset releasing its public beta. First off, I must say that they have done a great job on an interesting product. However, I think we need to quickly – and I mean very quickly – put to rest this conversation that Powerset could EVER be a Google Killer. ** Please read the comment below from Mark at Powerset for their side of the story ** It takes Powerset a month to index and analyze just Wikipedia – 1/8000 of the web. (Mark from Powerset disputes this claim below and, unfortunately, I have no way to verify it one way or another. That being said, even a couple of days to handle a site like Wikipedia, which has similar formatting across the entire domain, is slow in comparison to a giant like Google)...

Really Solved: Another Common Site Review Problem

Matt Cutts wrote recently of a common site-review problem. Many sites prefer to store links within drop-down menus (the “select” element). Unfortunately, this non-standard way of using JavaScript to link to pages within your site is quite difficult for search engines to spider (i.e., search spiders like GoogleBot have difficulty determining that your JavaScript code is meant to be interpreted as links). Here are a few examples: http://www.pickwicktea.com/, http://www.evinrude.com/ and http://www.yoofi.com. Luckily, there is a pretty easy solution to the issue. Start by creating a DIV tag with traditional text links inside representing each of the items you would like to appear in your menu. And then just run the JavaScript I have included below to...

Exclude-by-Keyword: Thoughts on Spam and Robots.txt

Note: This solution is for spam that cannot be filtered. There are already wonderful tools to help with comment / forum / wikispam such as LinkSleeve and Akismet. However, this proposed method would prevent the more nefarious methods such as HTML Injection, XSS, and Parasitic Hosting techniques. Truth be told, I rarely use the Robots.txt file. Its functionality can be largely replicated on a page-by-page basis via the robots META tag and, frankly, we spend a lot more time on getting pages into the SERPs than excluding them. However, after running / creating several large communities with tons of user-generated content, I realized that the Robots.txt file could offer a lot more powerful tools for exclusion. Essentially, exclude-by-keyword. The truth is,...
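To make the exclude-by-keyword idea concrete, here is a rough sketch of how a crawler might honor it. Note that the `Exclude-keyword:` directive is purely hypothetical: it is not part of the robots.txt standard and no search engine supports it. The function names are also my own, chosen for illustration.

```python
import re


def parse_excluded_keywords(robots_txt):
    """Collect keywords from hypothetical 'Exclude-keyword:' lines."""
    keywords = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "exclude-keyword" and value.strip():
            keywords.append(value.strip().lower())
    return keywords


def is_excluded(page_text, keywords):
    """Return True if any excluded keyword appears as a whole word or phrase."""
    text = page_text.lower()
    return any(
        re.search(r"\b" + re.escape(keyword) + r"\b", text)
        for keyword in keywords
    )


if __name__ == "__main__":
    robots = (
        "User-agent: *\n"
        "Exclude-keyword: casino\n"
        "Exclude-keyword: cheap pills  # typical injected spam\n"
    )
    keywords = parse_excluded_keywords(robots)
    print(is_excluded("Win big at our online CASINO today", keywords))
    print(is_excluded("A forum thread about loose-leaf tea", keywords))
```

The point of the design is that the site owner declares up front which terms should never legitimately appear on the site, so a page that suddenly contains them (via injection or parasitic hosting) would simply be skipped.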

Simplest Trick to Optimize Body Content

So, the general rules of thumb for body content are these: keywords in important tags (h1, h2, h3, b or strong, em, maybe alt tags), and unique content as close to the top of the page as possible. The first issue is quite easy to handle, and has been spammed to death across the internet since the inception of search engines. However, moving unique content to the top of the code while maintaining an attractive, Google-guidelines-compliant page has proven more difficult. Let’s take a look. Headers, advertisements, navigation and more normally precede the really unique content on the page. But how much code and duplicate content does that create? In the case of FindArticles.com, we are looking at nearly 370 lines of code between the body tag and the unique...
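If you want to measure this on your own pages, the following is a minimal sketch that counts the line breaks between the opening body tag and the first piece of unique content. The function name is hypothetical, and it assumes you can supply a short snippet of markup or text (`content_marker`) that identifies where your unique content begins.

```python
import re


def lines_before_content(html, content_marker):
    """Count line breaks between the opening <body> tag and content_marker.

    Returns None if either the body tag or the marker cannot be found.
    """
    body = re.search(r"<body\b", html, re.IGNORECASE)
    if body is None:
        return None
    marker_at = html.find(content_marker, body.start())
    if marker_at == -1:
        return None
    return html.count("\n", body.start(), marker_at)


if __name__ == "__main__":
    # Tiny made-up page: navigation and ads precede the unique content.
    page = (
        "<html>\n"
        "<body>\n"
        "<div id='nav'>links</div>\n"
        "<div id='ads'>ad</div>\n"
        "<h1>Unique article</h1>\n"
        "</body>"
    )
    print(lines_before_content(page, "<h1>Unique"))
```

Line counts depend on how the source is formatted, so treat the number as a rough indicator rather than an exact measure of how much boilerplate sits above your content.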

Carbon Neutral Hosting

I know this is something that I and a lot of folks in the web community care about. There are still only a handful of hosting companies that have gone carbon neutral. It is something that we are working towards and, as large-scale buyers of web hosting, it is an easy way to make sure that your site is green… List of Carbon Neutral Hosts