How to Build a Big-Data Cluster for memSQL 4.0 on the Cheap

Now that memSQL has rolled out a community edition that is completely free to use, I thought I would show you an easy way to build a huge cluster on the cheap that you can run memSQL on. I know it works because we have been running this in production for quite some time at GREPWords and SERPScape. There isn’t really any magic to it, just finding the best hosting company for the job. What you want is a hosting company that offers a lot of exactly what you need – multi-core, high-RAM servers – with no extra perks. I think I found the perfect host for this – CloudSouth. Yes, that is an affiliate link; judge for yourself whether the prices are amazing. They keep their prices low by offering only a couple of types of...

2015: The Year of Broken Link Building

Every year begins with a flood of questions to SEO experts about their predictions or recommendations for the upcoming year. I am normally careful not to get involved in too many of these forays because they seem speculative or risky given Google’s penchant for changing course so quickly. When your target is moving so fast, so frequently, and seemingly erratically, it is hard to make predictions and recommendations. So, how should we proceed? One of the common responses is to give very generic, holistic tips like “build your brand” or “users first”. While I think these are absolutely true, they don’t offer much in the way of actionable advice. Some of the lengthier responses by good SEOs do go into detail on these issues, and...

A Plea for Data

Hey folks, many of you who follow me on Twitter (@rjonesx) may have noticed some discussions between me and @authoritylabs a week or so ago. I have embarked on a new study that looks at the relationship between SERP features (places, ads, carousel, images, video, etc.) and organic click-through rates. Using GWT CTR data, Google Keyword Traffic data, and Authority Labs SERP data, we can determine the actual organic traffic a SERP will return. Take for example two keywords – one gets 1000 visits a month, the other gets 400 visits a month. The second has no advertisers and no SERP features; the first has places and 10 ads. It is possible that the second one, with less traffic volume, actually delivers MORE organic traffic because of these SERP...
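To make the two-keyword comparison concrete, here is a minimal sketch of the arithmetic. The click-through rates are hypothetical numbers picked only for illustration; they are not results from the study, and `organic_visits` is a made-up helper name.

```python
def organic_visits(monthly_volume, organic_ctr):
    """Estimate monthly organic visits as volume times organic click-through rate."""
    return monthly_volume * organic_ctr


# Hypothetical CTRs, assumed for illustration only (not measured values).
crowded = organic_visits(1000, 0.30)  # SERP with places and 10 ads
clean = organic_visits(400, 0.80)     # SERP with no ads and no SERP features

# With these assumed rates, the lower-volume keyword delivers more organic traffic.
print(crowded, clean, clean > crowded)
```

Whether the lower-volume keyword really wins depends entirely on the real CTR for each SERP layout, which is what the data is needed for.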

More on Google’s Javascript Handling

Many of you probably noticed my recent post without any substantive content. I was seeking to answer the following questions… Does Google wait for timeouts and display that content in the index? Is that content searchable? How does Google handle content generated at intervals in JavaScript? Will Google index content that is only displayed after an action like a button click occurs? We now have some pretty solid answers to each question… Does Google wait for timeouts and display that content in the index? Is that content searchable? Yes. Google does wait for timeouts and display that content in the index. That is to say, the content that was displayed after the timeout is included in the search index such that you can find it by searching Google for...

Keyword Research on Regular Expressions Steroids in Grepwords

There really hasn’t been much innovation in the keyword research space for a while, and for good reason – the largest problem, getting good data, has long been solved by top providers like SEMRush, Trellian KeywordDiscovery, WordStream and others like KeywordSpy. The data they provide is wonderfully useful, but the one thing that always felt limiting was the way we could get at their data. While they might provide accurate estimates for Google traffic, or useful data on large numbers of keywords, getting at the data required clumsy querying techniques no better than exact, phrase and broad match. As a developer, I found this cumbersome. Recently, though, I have found a better solution – Regular Expressions. At Virante we have long had access...
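As a rough illustration of why regular expressions beat exact, phrase and broad match, here is a small sketch using Python's standard `re` module. The keyword list and the `match_keywords` helper are hypothetical; this is not the Grepwords interface, just the general idea.

```python
import re

# Hypothetical keyword list; real data would come from a keyword provider.
keywords = [
    "cheap car insurance",
    "car insurance quotes",
    "best cheap laptops",
    "insurance for used cars",
    "how to insure a car",
]


def match_keywords(pattern, candidates):
    """Return the candidates in which the regular expression matches anywhere."""
    compiled = re.compile(pattern)
    return [kw for kw in candidates if compiled.search(kw)]


# Starts with "cheap" or "best" and later contains the word "insurance".
print(match_keywords(r"^(cheap|best)\b.*\binsurance\b", keywords))

# Contains "car" or "cars" as a whole word, in any position.
print(match_keywords(r"\bcars?\b", keywords))
```

Neither query can be expressed with a single exact, phrase or broad match filter, which is the limitation described above.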

Open Penguin Data Project – Calling for Submissions

Many of you may have seen the launch of my new project, Open Penguin Data. The description of the project isn’t quite clear, so I thought I would explain a little further. What is the Open Penguin Data Project? I want to crowdsource potential variables that might be used by Google to determine which pages are caught by Penguin. I have created a CSV of URLs that are marked as either (1) hit by Penguin or (2) not hit by Penguin for a series of keywords. I need the SEO community to provide variables and their values for each of the URLs in the dataset. For example: Let’s say you believe that having links from blog comments might be a variable Google uses as part of Penguin. You would download the CSV of URLs and mark each one as either having or not...
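A submission along those lines could be prepared with a few lines of Python. This is only a sketch: the column names (`url`, `hit_by_penguin`, `blog_comment_links`), the example URLs and the `add_variable` helper are all assumptions, since the real CSV layout is not shown here.

```python
import csv
import io

# Hypothetical stand-in for the project's CSV; the real column names may differ.
source = io.StringIO(
    "url,hit_by_penguin\n"
    "http://example.com/a,1\n"
    "http://example.org/b,0\n"
)

# Hypothetical contributor findings: URL -> whether blog-comment links were found.
has_comment_links = {
    "http://example.com/a": True,
    "http://example.org/b": False,
}


def add_variable(rows, name, values):
    """Return copies of the rows with one extra column holding 1 or 0."""
    return [{**row, name: int(values[row["url"]])} for row in rows]


rows = add_variable(list(csv.DictReader(source)), "blog_comment_links", has_comment_links)

# Write the augmented rows back out as CSV text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```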

Latest Research Project