Know Your Risk: Penguin Analysis | Panda Risk

The Advantages of Speed: Finding Exact Match Keyword Domains in Drop Lists

There are a lot of problems out there for which elegant solutions are difficult, cumbersome, or outright impossible. One that I have dealt with for years is combing through large lists of dropped domains to find the exact match keyword domains, meaning domains whose entire name is made up of one exact keyword. To a human this seems like a very easy task, but to a machine, not so much. The brute force method would be to take each domain, break it down into every potential 1, 2, and 3 word phrase, and then look up every combination. For example, ipadminicases.org would be… "i padminicases", "ip adminicases", "ipa dminicases" … "i p adminicases", "i pa dminicases" … This would be nearly impossible to do on a...
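To make the brute force idea concrete, here is a minimal Python sketch of it, not the faster approach the post goes on to describe. The function names `candidate_splits` and `exact_matches` are hypothetical, the keyword set stands in for whatever keyword database you would actually look up against, and the TLD handling assumes a simple single-label extension such as .org.

```python
from itertools import combinations


def candidate_splits(name, max_words=3):
    """Yield every way to cut `name` into 1..max_words contiguous pieces."""
    n = len(name)
    for words in range(1, max_words + 1):
        # Choosing (words - 1) cut positions between characters gives one split.
        for cuts in combinations(range(1, n), words - 1):
            bounds = (0, *cuts, n)
            yield [name[a:b] for a, b in zip(bounds, bounds[1:])]


def exact_matches(domain, keywords, max_words=3):
    """Return the candidate phrases for `domain` that appear in `keywords`."""
    # Assumption: the TLD is everything after the last dot.
    name = domain.rsplit(".", 1)[0]
    phrases = (" ".join(parts) for parts in candidate_splits(name, max_words))
    return [phrase for phrase in phrases if phrase in keywords]


if __name__ == "__main__":
    known_keywords = {"ipad mini cases"}  # stand-in for a real keyword list
    print(exact_matches("ipadminicases.org", known_keywords))
```

Even for this 13 character name the generator produces 1 + 12 + 66 = 79 candidate phrases, and every one of them needs a lookup, which is why the approach gets expensive across a large drop list.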

Google Analytics Style Keyword Suggestion Query Builder

Hey folks, just wanted to show off the new advanced keyword suggestion query builder for GrepWords. We decided to build a keyword suggestion tool modeled directly after Google Analytics Advanced Filters. Most of you are probably aware of what the GA Advanced Filters look like… In Analytics you can drill down through dimensions using containing, begins with, ends with, and RegExp. You can create multiple layers of filters to get down to that perfect result. Well, now you can do the exact same thing in GrepWords. Here is a quick video showing it in action… The tool is only available to paid subscribers because it really does traverse our entire 80-million-entry US-language keyword database on the fly, with no caching involved. Click below to see a full screenshot...
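For readers who have not used layered filters, the idea can be sketched in a few lines of Python. This is only an illustration of stacking containing, begins with, ends with, and RegExp filters over an in-memory list; it is not how GrepWords is implemented, and the names `MATCHERS` and `apply_filters` are made up for the example.

```python
import re

# One predicate per filter type, mirroring the four options named above.
MATCHERS = {
    "contains": lambda keyword, value: value in keyword,
    "begins_with": lambda keyword, value: keyword.startswith(value),
    "ends_with": lambda keyword, value: keyword.endswith(value),
    "regexp": lambda keyword, value: re.search(value, keyword) is not None,
}


def apply_filters(keywords, filters):
    """Apply each (mode, value) filter in order, narrowing the list each time."""
    result = list(keywords)
    for mode, value in filters:
        match = MATCHERS[mode]
        result = [keyword for keyword in result if match(keyword, value)]
    return result


if __name__ == "__main__":
    sample = ["ipad mini cases", "ipad cases", "iphone cases", "mini cooper"]
    layers = [
        ("begins_with", "ipad"),
        ("ends_with", "cases"),
        ("regexp", r"\bmini\b"),
    ]
    print(apply_filters(sample, layers))
```

Each layer only sees what survived the previous one, which is what lets you drill down to a small result set from a very large starting list.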

Why Compromise? MemSQL Outperforms NoSQL Solutions Again and Again

So, it probably isn’t much of a surprise to those of you who follow me on Twitter that I am a huge fan of the in-memory database MemSQL. There are a lot of awesome reasons why MemSQL is crazy fast and why I have grown to love this database, which I’ll get to later, but let’s get started with my latest job… The scenario: I have 30 million results pages from Google searches, meaning 300 million entries, each with a URL, Domain, Subdomain, Keyword and Ranking. You can easily imagine a giant spreadsheet with this data in it. A row might look something like this in the spreadsheet… 1 | http://www.thegooglecache.com/ | www.thegooglecache.com | thegooglecache.com | google cache | 1 My first job is simple – given any URL, Subdomain or Domain,...