User Contributed – Captcha Breaking W/ PHPBB2 Example

This is a fantastic guest post by Harry over at DarkSEO Programming. His blog has some AWESOME code examples and tutorials along with an even deeper explanation of this post so definitely check it out and subscribe so he’ll continue blogging.

This post is a practical explanation of how to crack the phpBB2 captcha easily. You need to know some basic programming, but 90% of the code is written for you in free software.

Programs you Need

C++/Visual C++ Express Edition – On Linux everything should compile simply. On Windows everything should compile simply too, but it often doesn’t. Anyway, the best tool I found for compiling on Windows is Visual C++ Express Edition. Download

GOCR – this program takes care of the character recognition. It also splits the characters up for us. It’s pretty easy to do that manually, but hey. Download

ImageMagick – this usually comes with Linux. ImageMagick lets us edit images very easily from C++, PHP, etc. Install it with the development headers and libraries. Download from here

A (modified) phpBB2 install – phpBB2 will lock you out after a number of registration attempts, so we need to change a line in it for testing purposes. Once you have it all working you should have a good success rate and it will be unlikely to lock you out. Find this section of code (it’s in includes/usercp_register.php):

if ($row = $db->sql_fetchrow($result))
{
    if ($row['attempts'] > 3)
    {
        message_die(GENERAL_MESSAGE, $lang['Too_many_registers']);
    }
}
$db->sql_freeresult($result);

Make it this:

if ($row = $db->sql_fetchrow($result))
{
    //if ($row['attempts'] > 3)
    //{
    //    message_die(GENERAL_MESSAGE, $lang['Too_many_registers']);
    //}
}
$db->sql_freeresult($result);

Possibly a version of PHP, and maybe the Apache web server, on your desktop PC. I used PHP to automate downloading the captcha because it’s very good at interpreting strings and downloading static web pages.

Getting C++ Working First

The problem on Windows is that there is a vast number of C++ compilers, and they all need setting up differently. However, I wrote the programs in C++ because it seemed the easiest language for quickly editing images with ImageMagick. I wanted to use ImageMagick because it lets us apply a lot of effects to the image if we need to remove different types of backgrounds from the captcha.

Once you’ve installed Visual C++ 2008 Express (not C#; I honestly don’t know if C# will work) you need to create a Win32 Application. In the project properties set the include path to something like (depending on your ImageMagick installation) C:\Program Files\ImageMagick-6.3.7-Q16\include and the library path to C:\Program Files\ImageMagick-6.3.7-Q16\lib. Then add these to your additional library dependencies: CORE_RL_magick_.lib, CORE_RL_Magick++_.lib, CORE_RL_wand_.lib. You can now begin typing in the programs below.

If that all sounds complicated don’t worry about it. This post covers the theory of cracking phpBB2 as well. I just try to include as much code as possible so that you can see it in action. As long as you understand the theory you can code this in php, perl, C or any other language. I’ve compiled a working program at the bottom of this post so you don’t need to get it all working straight away to play with things.

Getting started

Ok this is a phpBB2 captcha:

It won’t immediately be interpreted by GOCR because GOCR can’t work out where the letters start and end. Here’s the weakness though: the background is lighter than the text, so we can exclude it by getting rid of the lighter colors. With ImageMagick we can do this in a few lines of C++. Type in the program below and compile/run it and it will remove the background. I’ll explain it below.

#include <Magick++.h>

using namespace Magick;

int main( int /*argc*/, char ** argv)
{
    // initialize ImageMagick (needed for the install location on Windows)
    InitializeMagick(*argv);

    // load in the unedited image
    Image phpBB("test.png");

    // remove noise: anything lighter than the cutoff becomes white
    // (34000 out of 65535 in a Q16 build)
    phpBB.threshold(34000);

    // save image
    phpBB.write("convert.pnm");

    return 0;
}

All this does is load in the image and then call the threshold function on it. Threshold filters out any pixels below a certain darkness. On Windows GOCR will only read .pnm files, but on Linux you have to save the image as a .png, so on Linux we use this line instead:

// save image
phpBB.write("convert.png");

The background removed.

Ok, that’s one part sorted. Problem 2: we now have another image where GOCR won’t be able to tell where letters start and end. It’s too grainy. What we notice though is that each unjoined dot in a letter that has other dots within 3 pixels should probably be connected to them. So I add a piece of code to the above program that looks 3 pixels to the right and 3 pixels below; if it finds any black dots it fills in the gaps. We now have chunky letters, and GOCR can identify where each letter starts and ends. We’re nearly done.

#include <Magick++.h>

using namespace Magick;

void fill_holes(PixelPacket * pixels, int cur_pixel, int size_x, int size_y)
{
    int max_pixel, found;

    ///////////// pixels to the right /////////////
    found = 0;
    max_pixel = cur_pixel + 3; // the furthest we want to search
    // set a limit so that we can't go over the end of the picture and crash
    if (max_pixel >= size_x * size_y)
        max_pixel = size_x * size_y - 1;

    // first of all, are we a black pixel? no point if we are not
    if (*(pixels + cur_pixel) == Color("black"))
    {
        // start searching from the right backwards
        for (int index = max_pixel; index > cur_pixel; index--)
        {
            // should we be coloring?
            if (found)
                *(pixels + index) = Color("black");

            if (*(pixels + index) == Color("black"))
                found = 1;
        }
    }

    ///////////// pixels below /////////////
    found = 0;
    max_pixel = cur_pixel + (size_x * 3);
    if (max_pixel >= size_x * size_y)
        max_pixel = size_x * size_y - 1;

    if (*(pixels + cur_pixel) == Color("black"))
    {
        for (int index = max_pixel; index > cur_pixel; index -= size_x)
        {
            // should we be coloring?
            if (found)
                *(pixels + index) = Color("black");

            if (*(pixels + index) == Color("black"))
                found = 1;
        }
    }
}

int main( int /*argc*/, char ** argv)
{
    // initialize ImageMagick (needed for the install location on Windows)
    InitializeMagick(*argv);

    // load in the unedited image
    Image phpBB("test.png");

    // remove noise
    phpBB.threshold(34000);

    // Beef up "holey" parts.
    phpBB.modifyImage(); // ensure there is only one reference to the underlying
                         // image; if this is not done, the pixels *may* remain
                         // unmodified
    Pixels my_pixel_cache(phpBB); // allocate a pixel cache associated with phpBB
    PixelPacket* pixels;          // 'pixels' is a pointer to a PixelPacket array

    // define the view area that will be accessed via the image pixel cache
    // (literally, below we are selecting the entire picture)
    int start_x = 0;
    int start_y = 0;
    int size_x = phpBB.columns();
    int size_y = phpBB.rows();

    // return a pointer to the pixels of the defined pixel cache
    pixels = my_pixel_cache.get(start_x, start_y, size_x, size_y);

    // go through each pixel; if it is black and has black neighbors, fill in
    // the gaps (this calls the function fill_holes from above)
    for (int index = 0; index < size_x * size_y; index++)
        fill_holes(pixels, index, size_x, size_y);

    // now that the operations on my_pixel_cache have been finalized,
    // ensure that the pixel cache is transferred back to the image
    my_pixel_cache.sync();

    // save image
    phpBB.write("convert.pnm");

    return 0;
}

I admit this looks complicated at first view. However, you definitely don’t have to do this in C++ if you can find an easier way to perform the same task. All it does is remove the background and join close dots together.

I’ve given the C++ source code because that’s what was easiest for me; however, the syntax can be quite confusing if you’re new to C++, especially the code that accesses blocks of memory to edit the pixels. This is more a study of how to crack the captcha, but in case you want to code it in another language, here’s the general idea of the algorithm that fills in the holes in the letters:

1. Go through each pixel in the picture. Remember where we are in a variable called cur_pixel.
2. Start three pixels to the right of cur_pixel. If it’s black, color the pixels between this position and cur_pixel black.
3. Work backwards one by one until we reach cur_pixel again. If any pixels we land on are black, then color the space in between them and cur_pixel black.
4. Go back to step 1 until we’ve been through every pixel in the picture.

NOTE: Just make sure you don’t let any variables go over the edge of the image, otherwise you might crash your program.

I used the same algorithm but modified it slightly so that it also looked 3 pixels below, however the steps were exactly the same.
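The steps above can be sketched as standalone code, operating on a flat 0/1 array (1 = black) instead of ImageMagick’s pixel cache. This is just the rightward pass; the downward pass is the same idea with a stride of size_x. Like the program above, it only guards against running off the end of the buffer, not against a gap wrapping around a row edge.

```cpp
#include <vector>

// Fill gaps of up to three pixels to the right of every black pixel.
// px is a row-major image flattened into one array: 1 = black, 0 = white.
void fill_right(std::vector<int> &px, int size_x, int size_y)
{
    int total = size_x * size_y;
    for (int cur_pixel = 0; cur_pixel < total; ++cur_pixel)
    {
        if (!px[cur_pixel]) // no point unless we start on a black pixel
            continue;

        int max_pixel = cur_pixel + 3; // the furthest we want to search
        if (max_pixel >= total)        // don't run off the end of the image
            max_pixel = total - 1;

        // work backwards toward cur_pixel; once we've seen a black pixel,
        // everything between it and cur_pixel gets colored black
        int found = 0;
        for (int index = max_pixel; index > cur_pixel; --index)
        {
            if (found)
                px[index] = 1;
            if (px[index])
                found = 1;
        }
    }
}
```

For example, on a single row {1, 0, 0, 1, 0, 0} the two black pixels are within three of each other, so the gap between them gets filled; a gap of three or more blanks is left alone.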

Training GOCR

The font we’re left with is not recognized natively by GOCR so we have to train it. It’s not recognized partly because it’s a bit jagged.

Assuming our cleaned-up picture is called convert.pnm and our training data is going to be stored in a directory called data/, we’d type this:

gocr -p ./data/ -m 256 -m 130 convert.pnm

Just make sure the directory data/ exists (and is empty). I should point out that you need to run this from a command prompt; it doesn’t have nice windows. Which is good, because it makes it easier to integrate into PHP at a later date.

Any letters it doesn’t recognize, it will ask you about. Just make sure you type the right answer. -m 256 means use a user-defined database for character recognition; -m 130 means learn new letters.

You can find my data/ directory in the zip at the end of this post. It just saves you the time of going through checking each letter and makes it all work instantly.

Speeding it up

Downloading, converting, and training for each phpBB2 captcha takes a little while. It can be sped up with a simple bit of PHP code, but I don’t want to make this post much longer. You’ll find my script in the code package at the end. The PHP code runs from the command prompt by typing “php filename.php”. It’s sort of conceptual, in the sense that it works but it’s not perfect.

Done

Ok once GOCR starts getting 90% of the letters right we can reduce the required accuracy so that it guesses the letters it doesn’t know.

Below I’ve reduced the accuracy requirement to 25% using -a 25. Otherwise GOCR prints the default underscore character even for slightly different-looking characters that have already been entered. -m 2 means don’t use the default letter database. I probably could have used this earlier but didn’t; ah well, it doesn’t do a whole lot.

gocr -p ./data/ -m 256 -m 2 -a 25 convert.pnm

We can get the output of gocr in php using:

echo exec("/full/path/gocr -p ./data/ -m 256 -m 2 -a 25 convert.pnm");
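If you’d rather stay in C++ for the whole pipeline, the same capture-the-output trick works there too. Here’s a generic sketch using popen (POSIX; on Windows with Visual C++ the equivalents are _popen/_pclose). The gocr invocation itself is just the command line shown above passed in as a string.

```cpp
#include <cstdio>
#include <string>

// Run a shell command and return everything it printed to stdout --
// the same thing PHP's exec() hands back.
std::string run_command(const std::string &cmd)
{
    std::string output;
    FILE *pipe = popen(cmd.c_str(), "r");
    if (!pipe)
        return output; // command could not be started

    char buffer[256];
    while (fgets(buffer, sizeof(buffer), pipe))
        output += buffer;

    pclose(pipe);
    return output;
}
```

So run_command("gocr -p ./data/ -m 256 -m 2 -a 25 convert.pnm") would hand you GOCR’s guess at the captcha text as a string, ready to paste into the registration form.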

Alternatives

In some instances you may not have access to GOCR, or you may not want to use it (although it should be usable if you have access to a dedicated server). In that case I would separate the letters out manually and resize them all to the same size. I would then put them through a PHP neural network, which can be downloaded from here: FANN download

It would take a bit of work, but it should hopefully be as good as using GOCR. I don’t know how well each one reacts to rotated letters though; neural networks simply memorize patterns. I haven’t checked the inner workings of GOCR. It looks complicated.

My code

All the code can be found here to crack phpBB2 captcha.

Zip Download

In conclusion to this tutorial: it’s a nightmare trying to port all my code over from Linux to Windows unless it’s written in Java. If only Java were small and quick as well.

It’s worth stating that the phpBB2 captcha was easy to crack because the letters didn’t touch or overlap. If they had touched or overlapped, it would probably have been very hard to crack.

I plan to look at that line and square captcha that comes with phpBB3 over on my site and document how secure it is.

Thanks for the awesome guest post Harry.


Quick Answers #3 – Old/Expired Domains

This question just in from Till

Hi Eli,

Is an old domain only worth something if it has quite a lot of backlinks, or is it also worth something if it’s just in the index of search engines but has almost no backlinks (0-20)?

Regards

Till

Great question. It captured my attention because there’s always a lot of talk in the SQUIRT forum about expired domains. Several members of the community are talking about how they’re building their SEO Empires with snagged expired domains. I kind of cringe when I hear that, not because expired domains are bad, but because I personally have no idea about the history of the domain. Frankly it could sway either way; the practice of using expired domains could be good or bad. The problem I have with it is the unpredictability, which I’ll get to in a moment. For now I assume people know what they’re doing when they buy the domain and are making wise decisions. Much like buying a used car, always do your research and find out the background of what you’re buying. The inherent problem is that the odds are stacked against you: if it was a good domain with value, someone would have kept it. Yet mistakes are made and there are some definite gems out there, and if you aren’t on the field you can’t score. So while I think buying up expired domains for SEO reasons is a good thing if you know what you’re doing, I am hypocritical in the fact that I don’t do it myself. The main reason is due to a question I have myself.

This question just in from Eli

Hi handsome!

About 8 months ago I had several domains expire on me and never managed to pull them out. They were good domains with links, never banned or penalized, and were part of several different projects. I re-registered them quickly and managed to get them back. I had no real purpose for them, so I added them to a common-platform site network I was working on with several other new domains. All the sites had the same structure and went through the same promotion, but for some reason the expired domains took nearly 3 weeks longer to get indexed than the brand new domains. 8 months later they still seem to perform about the same as the other sites, but I’m curious: with all their previous backlinks and such, why did those exact domains take longer than the others to get reindexed? Any ideas why that was?

I still don’t know. I don’t have the attention span long enough to buy some control domains and wait a year to expire them out and hope I manage to get them back in order to do any tests and figure it out. Anyone else experienced this by chance?

Either way I see buying expired domains for SEO reasons as having the following benefits:
1. Established inbound links
2. Aged inbound links

Other than that you’re still starting from scratch. So my philosophy is: unless the domain is a gem, such as it either having a good name or phenomenal unique backlinks (ie lots of links or saturation like you mentioned), it’s easier and more predictable to just work with new domains. Not to mention it saves a bit of headache and time, and sometimes even money. Which brings me back to the predictability thing. I sometimes get questions from people about a particular basement or foundation site that was an expired domain, like it suddenly dropped in ranking, or it got banned, or it lost a bunch of pages in the index. Anything out of the ordinary.

BTW, I’d like to take this moment to remind everyone, in case you never noticed, that every year right before Christmas sites tend to drop in saturation levels in Google. It’s probably due to the upcoming updates that usually happen in January, I don’t know. Either way it seems to happen every year near the beginning of December.

So in cases like this you can look at stuff and maybe find a problem, or you can just write it off as the search engines being weird, but when you’re dealing with a new site on a previously registered domain you get that extra variable. Is the problem caused by a problem with the site, the search engines being weird, or the history of the domain causing problems? It makes the job of diagnosing problems and learning from mistakes that much harder. For me personally, I’m still going to be doing this in 5 years, so there’s no point in forcing unneeded shortcuts on myself. All my domains will eventually become old; all my domains will eventually get link age. I just let time do its thing and in the meantime work on new exciting projects. <- it’s a good life

Which nearly answers the question about old domains. Old domains aren’t something I think people should stress about. Every single site I build, while I’m building it, I’m wishing the domain was old. Hell, when I’m buying the domains I wish they were old. Yet in a year none of it matters and nothing has changed. I’ll still be wishing the domains I am buying now were older, like the domains I bought last year and the year before that. It’s like playing Sim City: it doesn’t matter if you have it on fast mode or slow, the strategy is still the same. Because the beautiful thing about age factors is, they are done for you.

PS. Please read my Follow up post to SEO Empire if you haven’t already. It talks a lot about shortcuts and how to speed up the process of rankings, which I think is where time is best spent. The more experience you have with that the less you have to worry about domain age.


Quick Answers #2 – The Word of The Day Is Class-Cunt IP

Now that I’ve put the dreaded C-word in the title, mine won’t be the only office in the nation calling it Class-Cunt IPs. Watch, you’ll catch yourself doing it, and frankly you deserve it. To make the transition into a technopotty mouth easier, here’s a handy mnemonic: A Big Cunt Drowns Easier (the E is in case we ever make that switch the government keeps rambling on about).

I probably get more questions about my distribution of IPs than any other type. Frankly I can answer it in one word: evenly. But once again hitting up our Open Questions post, here’s a question that I think best illustrates the topic.

This one is from Quinton Figueroa

1. For each domain do you split your subdomains up in multiple C Class IPs or do they all stay on 1? Does it depend?

2. For each domain do you link from your subdomains to other subdomains or do you keep each one as its own stand alone “site”?

3. Do you set up in the 100’s of subdomains or in the 1,000’s of subdomains (or maybe more) per domain?

Appreciate the help man, you kick ass!

Google doesn’t penalize a site because of the other sites on the same IP or class. I say this with confidence because even though Matt Cutts publicly said it in one of his video dialogs, I still researched it myself to make damn sure (you can thank me later, ionhosting). I also haven’t seen any evidence that the other search engines are any different. So I give the same answer whether I’m talking about one site having a different IP than another or a subdomain having a different IP than the main domain. It’s all under the same point of reference, but to address the question directly: what’s the one primary reason why a subdomain has a different IP than a main domain? That’s right, it’s on a different server.

Side Track
BTW, when people say a statement like “I haven’t seen any evidence,” it usually means they haven’t LOOKED at any evidence. For future reference, give statements like that about as much authority as a one-legged security officer. Do your own research.

Back On Track
If there is no penalty for sites being on the same IP and there is no explicit reward for being on separate IPs, then all that’s left is two small benefits: 1. If your sites are black hat it makes it harder to track them all down. 2. The links between two sites appear more natural if they are on separate IPs (whether or not this is an actual benefit remains to be seen). So the whole IP diversification business boils down to costs vs. financial reward. So while in the past I’ve been very cautious about my own IP dispersement, which was only in part because during that period I was able to acquire IPs very cost-efficiently, I have since lessened my efforts. The rewards vs. the costs just aren’t there enough to invest any worry into the matter. So my answer is simply “evenly.” Use what you’ve got. If you get a server and it gives you 10 free IPs, use them all and just distribute your sites amongst them. You won’t regret it, and at the same time you wouldn’t see any explicit benefit from dumping a bunch of extra money every month into more IPs. The money is obviously better spent on things that make more revenue, such as domains and servers. Even if you had unlimited IPs, how would you end up distributing them? Evenly…

To be perfectly clear, even though I take IP distribution with a grain of salt, it doesn’t mean I take nameserver distribution lightly, and the same applies to domain registration info. In fact I’d say the one exception to the IP carefree rule is if you happen to write a blog teaching people how to bend over Google like a Japanese whore. I mention it because I know some of you do. In which case be very careful about which sites you allow others to see. Throwing a few decoys out also doesn’t hurt, because “do no evil” policies don’t apply to profit risks. Paranoia? For a year and a half, yes; after Oct 21st of this year, no. You may not get it, but someone somewhere just shit their pants. So feel free to giggle anyways.

As for questions 2 and 3, if you had asked me a year ago I would have had a completely different response, yet the basic principle still remains. I talked about this topic in great depth in my SEO Empire Part 1 post; reread the section where I talk about the One Way Street Theory. The decision on how many subdomains, as well as whether or not they should be orphan subdomains or interlinked, is a decision I make by asking whether or not those subdomains would be of benefit to the main domain. If they are of benefit to it, then I establish a relationship between the two (ie a link, either one way or exchanged). If they aren’t, then I keep the subdomains orphan. BTW, the term Orphan Subdomain or Orphan Subpage was coined by an obnoxious troll here. I kinda liked it so I kept it. It means the subdomain has no relationship with the main domain or any other pages or subdomains of the site. Watch out for interlinking between subdomains though. Think in terms of sites that do it effectively and sites that don’t. If you’re interlinking in a way that mimics About.com or similar, then great. If you’re interlinking in a way that, say, Blog Solution or something similar would, for the sake of link building to each subdomain, I’d advise against it for footprint reasons, and for god’s sake, if you’re hosting a blackhat generated site on a white hat domain, don’t even consider it!

Do’s and Don’ts of Subdomains
Do create subdomains for the purpose of exploiting an established domain’s authority. – I’ve talked a lot about software-related sites. I think they’re a great and easy way to build domain authority. Anything related can be thrown into a subdomain. I’ve got a couple of general sites that have great domain authority, and anything I throw up on them does well in the SERPs almost instantly. I make sure not to overdo it, and it works out very well for me.

Don’t create subdomains to save on domain costs. – It’s less than ten dollars a year, for fuck’s sake. Don’t risk trashing a $20/day site and the authority that took you a year or two to establish just to save $10/year.


Quick Answers #1 – Easy Link Building

I’ll be dedicating a few posts to grabbing questions in the Open Questions post.

There were several questions like it, but I think this one represents it the best.

From Matthew

Hi Eli,

I’ve read through your blog and it’s amazing info. Thanks. I have a question: I find the link building stuff (Black Hole SEO) a bit too complex. Could you suggest any other effective link building techniques? I’ve heard great stuff about TNX.net.

Thanks!

I haven’t heard of TNX, but here’s a really simple technique most haven’t thought of. Much like ugly girls tend to have better personalities, pretty and clean sites tend to be harder to link build, or at least take more effort initially. So a little technique I’ve been using a lot lately is to build sites that gather links really easily. A quick, easy way to do this is to build a site that distributes something people want to put on their websites or social site profiles (ie. MySpace, Facebook, YouTube and such).

I’ll give you the most basic of examples. In a really old post on this blog I put up a picture of a flaming middle finger. Since then everyone and their dog has been hotlinking to it, especially since it’ll often show up in Google Images for the term Middle Finger. Not that I care, but it illustrates the stupid shit people spread, and its effectiveness can be used for links, as minuscule as it is. So let’s say all I had in my arsenal was this stupid picture of a flaming middle finger. I post it up on some site and get it into Google Images and such, just like I did. Then under it I put a textbox that says something like “Put this on your ____”, copy-paste, etc. The code has a link to a random domain of mine; the domain doesn’t even have to be active, or it can be a money site. Who really cares? After a while the flaming finger image gathers enough links to that domain that I just 301 redirect the domain over (it doesn’t even have to be an entire domain, as I’ll mention below). All the links go to my new pretty site.
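For anyone who hasn’t done the 301 part before, the redirect itself is trivial. If the bait domain runs on Apache, a one-line .htaccess on it (the domain name here is made up for illustration) permanently forwards every accumulated link to the money site:

```apache
# .htaccess on the bait domain: permanently redirect everything to the money site
Redirect 301 / http://www.my-money-site.example/
```

A 301 is the “permanent” flavor of redirect, which is the one search engines treat as a signal to pass the old URL’s links along to the new one.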

This is a really weak example, I realize, but it can be done with just about any media (video, image, flash, etc.) you’d like, and with any type of site you’d like, as long as it’s the type that tends to gather links faster than your other type of site. If I’ve got big pretty sites coming out at a future date, I can build several of these sites, and by the time the main site goes live I can have plenty of link volume to rank properly. The only reason this extra work is necessary is because viral link building tends to be exponential. In other words, you’ve got to have links to get rankings and got to have rankings to get links. More links means more links. So it creates a nice little shove for a boat too big to leave the dock on its own. Best of all, it’s really easy and rarely costs anything at all to do.

If you’re wondering why this all kind of sounds familiar, or like you should already know it, it’s because it’s a creative spin on two of the more popular techniques on this blog: Link Laundering Sites and Cycle Sites. On a side note, I’ve found that this also works on subdirectories. So you can create a site that distributes media, or allows people to upload their own, and it has ’em all on separate subdirectories or pages; the links can even go to the same page the item is on. Once each subdirectory or subpage gets enough links to it, cycle it out and let the links go to a site that needs ’em. You can then put the media up on another subdirectory to be used again in gathering more links. The best advice I can give you on this is to look for types of sites that gain links quickly right out of the gate. They may not make a lot of money, they may be high bandwidth, and it doesn’t matter if they’re all temporary. You can still steal the idea to do some easy link building for the harder sites.

More answers coming soon


Open Questions

Well, I’ve been working extremely hard this Christmas season. I’ve got several personal big launches coming out, as well as loads of other fun business. I’ve been writing nearly every day on SEO Empire Part 2, and I’ve also got several large follow-up posts in draft. It occurred to me though, as if I couldn’t tell from the flood of emails, that I haven’t updated in a month. So I thought it would be fun to throw out a few short and to-the-point posts before I dig down into my presents and finish up this bitch of a season.

So let’s take on a few open questions: a public dialog between you, me, and the other readers. I know there’s lots of great SEO and marketing questions out there, and frankly I can talk about this all day. Fire off any questions you’ve got in the comments, and for the next week I’ll be answering as many as I have time for.

Have fun and happy holidays!


Guest Post: Make $50 dollars a day with Google Custom Search

Here’s a nice little guest post contributed by SEOcracy. I love guest posts that involve some form of creative money making idea.

——————————

I hope you have all been paying attention over the past week, because today I am going to build on last week’s database revelations and tell you all how to use database content to make serious money (and build serious traffic) through Google Custom Search.

Now, I have been making decent bank with Google Custom Search for a while now, and having recently amp’d up my efforts in a big way, I feel confident enough to make the claim that you should all be able to make at LEAST Fifty Dollars a day using the techniques I am about to outline.

Google Custom Search came on the scene back in late 2006, and it really didn’t make the splash in the SEO scene that I expected it would. Google Custom Search, in good Web 2.0 mash-up style, gave me a brand new way to interlink and cross-promote my diverse network of niche sites, and I thought that was pretty cool. In fact, most people thought that was pretty cool, and that was about all we really heard about the launch of GCS.

But let’s stop and examine three things that make GCS especially cool for us SEOs and affiliate marketers, shall we?

1) GCS engines can be highly targeted, returning extremely relevant results. This means that we can create a GCS that will satisfy the search needs of our site’s users based on their specific interests; and a satisfied user is a loyal user.

2) GCS engines allow us to return results only from the websites we choose. This means that we can set up a GCS to promote only those sites within our network. So if we have a mini-network on home development, our GCS can be set to only return those relevant results from sub-sites within our network. For our mini-network on home development our GCS might return results from sub-sites that provide mortgage offers, information on concrete polishing, or how to select granite countertops. This allows us to cross promote the sites within our network instead of having visitors turn to our competitor for more in-depth information.

3) GCS is made to be monetized. You can display your Adsense ads in your search results and thus can profit from the increased impressions when people use your GCS.

From an SEO point of view, GCS is solid gold because it lets you bypass the usual google.com search completely and in so doing, you bypass your competition! Of course, as I’m sure many of you who have played with Google Custom Search have already realized: your search engine is only as good as the number of users you can funnel into it. Meaning, if you have a GCS that only does 10 searches a day then you are really not going to see any tangible benefit to having it setup on your website.

The hardest part about making a profitable custom search is getting traffic to it. Often people add GCS to their website as an afterthought. Maybe they just feel like it is a cheap and easy way to provide search functionality and increased accessibility to their users. That is all well and good, but we aren’t just trying to make our site more accessible, we are trying to make some extra money, and this implementation of a Google search is rarely profitable because only a small percentage of your website’s visitors will ever use it.

As of this post, I am running a HUGE number of different GCS engines. I have a GCS engine for every single niche I am involved in. Every time I start a new niche, one of the first things I do after going online is to create a GCS for it. I constantly maintain and update each GCS engine’s settings and configuration, adding and removing websites from each niche network. This is a lot of work, but depending on the subject matter for each niche, the profit returned from the GCS alone can rival the profit I make on the niche’s individual websites themselves.

But still, we face the same problem. My many GCS engines are useless and won’t make a cent unless I am providing them with a sufficient volume of visitors performing searches. I can’t depend on people to use my search engines just by visiting the website and then typing a query into a box. There are too many distractions on a website, too many places to click and things to see. Because of that, you will find that only a very small percentage of your website’s visitors ever actually use your GCS.

Rather, what we need is a way to take every one of our users interested in a given niche and funnel their attention directly towards using nothing else other than our niche’s GCS to find the answers they need. This is where things start to get interesting.

The secret here is thinking outside of the box. Sure lots of websites offer inline search functionality, but how about offering a search-based service outside of a proper website? How about getting people to search for answers in your niche without ever even visiting your website, and without ever even knowing your website exists in the first place?

What I am referring to here are desktop widget platforms like Google Desktop or (my favorite) Windows Vista Sidebar. Even web-top widget platforms like Netvibes, Facebook and more. For the sake of this post, I am going to focus solely on Windows Vista Sidebar.

Out of curiosity, I installed Vista on a partition just to take a peek at it, and one of the first things I started messing around with was the sidebar widgets. They have made it incredibly simple for people to create and publish their own widgets to the Windows Live Gallery website for other people to download and use. Also, Windows Vista Sidebar widgets can be very lucrative since not many people are creating them yet! Now is a great time for you all to get a slice of this action before the whole world eventually ports to the Vista OS.

A Vista sidebar widget comprises simple HTML and an XML file that acts much like a PAD file, in that it identifies your widget and its provenance. If you know how to copy-and-paste and some basic HTML, then you can easily create a desktop widget.
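For a sense of how little that XML file involves, here is a minimal manifest sketch. It follows the shape of Microsoft’s documented gadget manifest, but every name, version number, and filename here is an illustrative placeholder, not taken from any actual widget described in this post:

```xml
<?xml version="1.0" encoding="utf-8" ?>
<gadget>
  <name>Dream Dictionary Search</name>
  <version>1.0.0.0</version>
  <author name="Your Name" />
  <copyright>2007</copyright>
  <description>Search the dream dictionary from your desktop.</description>
  <hosts>
    <host name="sidebar">
      <!-- widget.html is the HTML page that actually renders in the sidebar -->
      <base type="HTML" apiVersion="1.0.0.0" src="widget.html" />
      <permissions>Full</permissions>
      <platform minPlatformVersion="1.0" />
    </host>
  </hosts>
</gadget>
```

Zip this manifest together with your HTML file, rename the archive to a .gadget extension, and it is installable.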

So here is what I do:

Finding the proper niche is key to doing large search volume. Be SPECIFIC. For example, one of my highest-traffic search engines was one that enabled people to interpret their dreams.

Realizing that people love to know what their dreams mean, I did some brief research and compiled a database of dreams and their meanings. Then I created a website around that database and did all the usual SEO stuff on it, got it indexed and laid in some Adsense ads just for good measure.

Next, I went to my Google account and set up a custom search engine. The setup interface allows you to target your search engine to a pre-defined list of websites or to have it search the entire web. For my purpose, I made the custom search engine return results from ONLY the website I had made around the dreams database I compiled. That way, not only do I stand to profit from the ads displayed in the SERPs but, since all the SERP results are for pages on my site, I am funneling traffic into my site, which in turn shows more of my ads.

After going through all the setup steps, Google spat out the code for me to copy-and-paste into the HTML for my Sidebar Search Widget. I am not going to go into detail on how to create your widgets as it is extremely easy to do and you will be able to figure it out with just a little research on your part.
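The exact snippet Google generates varies, but at its core it is just a form that posts the query to the hosted CSE results page. A hand-rolled sketch of the same idea (the cx value below is a placeholder for the engine ID Google assigns you):

```html
<html>
<head><title>Dream Search</title></head>
<body style="width:130px">
  <!-- The cx parameter identifies your custom search engine;
       q carries the user's query to the hosted results page. -->
  <form action="http://www.google.com/cse" target="_blank">
    <input type="hidden" name="cx" value="partner-pub-XXXX:YYYY" />
    <input type="text" name="q" size="15" />
    <input type="submit" value="Search" />
  </form>
</body>
</html>
```

Drop that into the widget’s HTML file and every search the user runs lands on your monetized results page.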

So I designed my little search widget with a nice clean interface and a snappy title, and then I published it to the Windows Live Gallery website. Within one week, over 1000 people had downloaded my widget to their desktop sidebar. Getting that kind of desktop real estate on people’s computer screens is something that most internet marketers would KILL for. My search box is on their screen every day and it continually sends their requests to my website.

The important part is repeating this process on many different subjects. The more subjects you cover, the more search volume you will pull. The beautiful thing here is the sky is really the limit. If you want to pull in that $50/day I’ve been talking about, then you better be prepared to create a GCS for every niche you are in. And you better be prepared to present each of your GCS engines in a different widget for each platform, i.e. create one widget for Netvibes, one for Vista, one for Google Desktop, etc. Once you get the hang of it, you will find that you can create a simple template for each platform and then just plug the different GCS code into each one. After a while, you’ll be creating new search widgets at an amazing pace.
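The template-then-plug-in workflow above can be sketched as a tiny generator script. This is a sketch only; the niche names and CSE IDs are invented, and a real version would have one template string per widget platform:

```python
# One shared platform template; each niche's CSE ID gets plugged in.
WIDGET_TEMPLATE = """<form action="http://www.google.com/cse">
  <input type="hidden" name="cx" value="{cse_id}" />
  <input type="text" name="q" />
  <input type="submit" value="Search {niche}" />
</form>"""

# Hypothetical mapping of niches to the engine IDs Google assigned.
niches = {
    "dreams": "partner-pub-1111:aaaa",
    "recipes": "partner-pub-2222:bbbb",
}

def build_widgets(niches, template=WIDGET_TEMPLATE):
    """Return a dict of output filename -> widget HTML, one per niche."""
    return {
        f"{name}-widget.html": template.format(cse_id=cse, niche=name)
        for name, cse in niches.items()
    }

# In practice you would write each value to disk and package it
# for its platform (Vista .gadget, Netvibes module, etc.).
for filename in build_widgets(niches):
    print(filename)
```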

Now, remember how I said that this post was going to build on the previous posts about Google Hacking? I wasn’t kidding.

Take a sec to peek through the Free Downloads section of the website and think about how you can build websites around the databases provided there. How about creating a database of Bible verses, and then creating a Bible Study search widget that funnels all searches into that website? How about creating a website of food and drink recipes and making it searchable via a desktop widget?

The databases provided give you an excellent leg-up in creating websites with large amounts of information that are perfectly appropriate for Google Custom Search widgets.

Here are some other ideas to get you started:

* Video Game Cheats Search Widget
* Myspace Layouts Search Widget
* Baby Names Search Widget
* Restaurants Search Widget
* Celebrity Gossip Search Widget
* Product Recalls Search Widget
* Lyrics Search Widget

In parting, here is some more food for thought for all you Affiliate Marketers:

Instead of just relying on Adwords for profit, how about a desktop search widget that returns Amazon Books with your Affiliate Link? This is where the real money is.

Before you comment below, I want to make a few things clear: I know that “Make XXX dollars a day” posts tend to be incendiary and/or contentious topics for people. The success of this technique, like every money making technique, depends on how much creativity you employ and how much volume you push. Further, the number “50” is completely arbitrary; it makes for a nice round number and a good title for a post. Truth is, if you are smart and play it right, there’s no reason you can’t pull in a lot more than $50.00/day (a lot more). Conversely, if you are half-assed about it, like with anything, you probably won’t do nearly as well. So if you are about to bitch and whine about not being able to pull $50/day (which is pretty easy to do), take a step back and ask yourself if you really are doing everything right and pushing enough volume.

–Rob

---

Thanks for the great post, Rob! He has some great articles and free stuff at his blog SEOcracy.com so be sure to check it out.


Followup: SEO Empire Part 1

Alrighty! Let’s discuss the SEO Empire Part 1 post.

The post covered a ton of information very quickly and talked about a lot of different example sites for each level. But the million dollar question remains: why the structure? Why not just stick with the proven practice of throwing enough mud at the wall until something sticks? Why not just build site after site until something makes you money? After all, that is how most Internet marketers have made it. Well, the answer comes from history.

I made a post about this time last year called Float Like A Porn Site Sting Like A Sales Page, which was basically about learning from history and from the types of sites that are ahead of us in the game. I’ve always been on the fringe of the webmaster community, and I’m a firm believer that fringe websites such as Adult, Warez, Music, Poker, and Pharma are years ahead of the mainstream as far as attracting traffic and conversions. So what works for them now will eventually make its way into mainstream marketing.

Back in ‘97-’00 there was a trend amongst Adult, Warez, and MP3 sites to use Top Sites Lists as a method of sustaining traffic growth. At the time, to make any headway into those niches you needed at least 3k unique visitors/day; that was about the standard for a site in those groups to be considered semi-successful. That differs quite a bit today, but back then it made them fairly competitive. The owners of the top sites eventually figured out that the best way to propel their growth was to create a ring of promotion. Remember those things called popups? Haha. So instead of creating one top sites list and heavily promoting it until it became big, they would create three. Each one would target a different but similar topic. They would manually edit the stats so the other two lists showed up at the top of each list, thus getting the most traffic from incoming votes from the other sites on the lists. Then they would manually go through each of their top sites lists and sign it up with dozens of other top sites lists that most closely matched its topic. After a few manual click-throughs and votes every day, their ring of sites would start pulling in bottom-level traffic, and that traffic would more than likely go to another one of their top sites lists via the list until it eventually voted out.

Since most top sites lists have a higher OUT traffic ratio than IN (due to other traffic sources), their traffic would climb exponentially across the small top sites network. Eventually, once each top sites list started becoming super successful and getting lots of legitimate webmasters signing up, they would slowly phase out the lists they had manually signed up for and end up with nothing but genuine traffic. Traffic that had a tendency to get passed around their mini network, mind you. Once the traffic on the promotional ring started reaching critical mass, the webmasters started looking for ways to better monetize it. When DoubleClick (MS CPC), BabylonX (adult affiliate), and Casino Gold (gambling) banners at the top wouldn’t suffice to keep up with the then-expensive bandwidth for 10-20k visitors/day, they started looking into what would now be considered modern affiliate sites, which of course spawned a boom in TGPs, webrings, and even affiliate landing pages. This, from my own experience, was the defining proof that network building works. Mostly because the webmasters who built networks back then in at least a similar fashion are still around today, whereas many, sad to say, poop flingers never made it past the early-2000 Internet bubble. The idea of a network works; we just have to apply it to our mainstream system of making money.

So what exactly are the direct benefits of the foundation and basement levels of our SEO Empire? Well, there are two primary benefits. The first is cost efficiency versus return on investment. Like most starting Internet Marketers, it’s good to start out smart. While there’s the need for immediate income for personal reasons, there’s also the need to make the investments count for as much as possible. Growth is very important in your first year in the industry. With foundation and basement sites you get the most growth and immediate income possible while making less risky investments. In other words, it’s immediate and sustainable money. The money, over time, also isn’t dependent on how much immediate work you put into it. The work you put in this month will give recurring income next month and a year down the road, so you’re never treading water as far as growth is concerned. Forward momentum is very important no matter what level of business you’re at. So at the very least, do it because it will make you money.

The indirect benefits aren’t as apparent without experience but are by far the most important. Every foundation and basement site you build increases the leverage you have for the next site. As you build, the total leverage you have for your money sites increases exponentially. So breaking the cardinal rule of “every site must pay its own rent” doesn’t do justice to the fact that these sites are worth a fortune down the road. There’s no bullshit when I say that, and I’ll explain how it works. It begins with the fundamental concepts of Entry Points (future post) and Link Building Leverage.

Link Building Leverage

I talked a little about Link Building Leverage in my SERP Domination post, but SEO Empire takes it one step further: it gives you an immediate competitive standing within the niche. There are four basic stages that make up a good link campaign, and skipping any one of them results in no rankings, or dismal ones.

Indexing – Covers all aspects of getting into the search engines, getting deep indexed, and minimizing supplementals.

Link Volume – The sheer quantity of links. Your site’s link count must match or exceed those of the rest of the sites in your competition.

Link Quality – Encompasses the relevancy and authority of the inbound links.

Link Saturation – The ratio of the volume of inbound links your site has versus the quantity actually indexed and counting in the indexes. This was thoroughly covered in my Real Life SEO Example post and the Log Link Matching technique.
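The saturation ratio is simple arithmetic; as a sketch (the link counts below are invented for illustration):

```python
def link_saturation(indexed_links, total_links):
    """Fraction of your inbound links the engine actually indexed and counts."""
    if total_links == 0:
        return 0.0
    return indexed_links / total_links

# e.g. 5,000 links built, but only 1,500 of the linking pages show in the index
print(link_saturation(1500, 5000))  # → 0.3
```

A site at 0.3 is wasting 70% of its link-building effort, which is exactly the inefficiency the Log Link Matching technique attacks.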

[Graph: Link Building As It Pertains To Difficulty. Click to view full size.]

With the Y axis representing difficulty and the X axis representing rankings, the graph can be stretched to fit any ratio of keyword competitiveness in regards to link building. Notice that Link Volume and Link Quality have essentially the same worth; in fact there is a blurred area between the two that allows one site with more link volume and slightly less link quality to outrank or underrank a site of inverse values. They are essentially equal; they just come in different stages of the rankings. The same could nearly be said for indexing and link saturation. When going for a number 1 ranking you cannot discount indexing as being of more value than link saturation. Many sites never consider link saturation, and thus many sites don’t achieve top rankings or have a hard time maintaining them. In other words, I kick their hippie rainbow chasing little asses.

So What About Difficulty/Availability?

Much like the traffic curves mentioned in the SEO Empire post, link building curves the same way. Indexing starts off very easy: it’s just getting your site listed with the site: command and an accurate title and description. It then works into deep indexing where, as the saturation levels rise, the difficulty of getting more pages in increases. It then leads into removal of supplementals and peaks in technical difficulty. This is where most webmasters start their adventures. From this point on you are an SEO Pro. YOU WILL SPEND AS LITTLE TIME ON THE INDEXING PHASE AS POSSIBLE! Amateurs spend three weeks to three months working on this. You need to spend as little as 10 minutes and let the rest happen. I’ve given you the tools to do it, so there should be no excuses.

Link Volume is the next phase. This phase starts off easy: there are millions of sites and just as many link opportunities. At first it’s very fast moving and it’s easy to find a good quantity of links. After a while, though, you run into what I affectionately call the Link Wall, where link volume meets the beginning of Link Quality and easy-to-grab links start to become a little more scarce, so the difficulty begins to climb a bit.

Link Wall: Quote From SERP Domination

“You got to love the supposedly nonexistent brick wall of relevant links. Dipshits on newbie forums love telling people, “don’t worry about the rankings, just build some relevant links and they’ll come.” So you do just that, after all they have over 4,000 posts on that forum, it can’t all be complete garbage advice. At first it’s totally working, you’re gaining a good 50-200 very relevant links a day. You submit to directories and score a bunch of links from your competitors. After a couple weeks you even manage to score some big authority links within your niche. Suddenly it all starts to slow down. The sites that are willing to link already have, and the rest are holding firm. You’ve just hit the Relevant Link Wall. Don’t bother going back for further advice. They don’t have any, and if they did they are too busy trying to rank for Kitty Litter Paw Prints to help.”

That essentially, in more colorful wording, describes the Link Wall.

So when you’re entering the Link Quality phase, at first it’s difficult. You’re new, so very few well established and ranked sites want to link to you. You have to scrounge around and dig for relevant links. Eventually, though, your site becomes established, and more and more webmasters in the niche start to recognize it and link to it. Very much the same thing happens with blogging. Eventually you break through and you’re on a downhill coast to rankings.

Suddenly, though, you reach around the top 10 and things start to get competitive. You can’t move up without shoving someone else down. They all have the same essential links as you, plus quite a few older ones that you don’t. You can make up for it in link volume, but you need one last push to drive you to the top and keep you there. That’s where the last phase, Link Saturation, enters the field. If you can become more efficient in your link building, you stand a much better chance of defeating them.

This process is essentially what common Internet Marketers have to go through every single time they release a new project. They are in a constant battle to get indexed, acquire link volume, scramble over the link wall to get some link quality, and then make those links count just to rank. This is why new sites take nearly a year to show their true potential; it’s a long, hard battle with each and every one to get the resources necessary to compete. When you’ve been in the business as long as or longer than I have, you’ll be the first to testify that this…shit…gets…old. Throwing mud at a wall hoping something will stick is fine, but when it takes that much time-consuming effort on every throw, it’s easy to see why people drop out of the game so fast. Welcome to McDonalds, may I take your order?

There is an easier way, and that’s what I’m trying to get at with the first part of the SEO Empire post. Not only is it easier and better, but every step is also 100% profitable while you’re doing it. It may not be the gigantic shortcut every aspiring Internet Marketer dreams of, but it is in all essentials the primary plan of most SEO Pros for a very good reason: the resources are recyclable. Let’s talk about resources and how they play into the ranking phases. Here are a few I’ve given you….

[Chart: The Applications Of Resources. Click to view full size.]

So in my SEO Empire post you can see what I was talking about when I said all my previous posts lead into that strategy. Throughout this blog I’ve given several methods of automatically skating through each and every stage of SEO that determines top rankings. So when you create a new site you no longer have to start from the beginning. In a sense you can nearly skip the entire process that takes most webmasters months to years to accomplish. Instead, when you enter a new niche, you start at the top of the Link Wall and only have to do the absolute minimal work to achieve the rankings. So with the basement and foundation levels of my SEO Empire, where do I start each site in respect to my competition? Where does my personal involvement end?

[Graph: Link Building Process After SEO Empire. Click to view full size.]

That’s quite the workload difference, ain’t it? I’d say so. Calling it an easier, shorter way is the understatement of the fucking century. I can build a site, launch it, spend a couple hours getting some relevant links from the competition or parallel competition (I’ll explain in a future post), and within the same day as the launch my brand new site instantly has everything it requires to achieve a top ranking. I’ve literally done six months to a year’s worth of work in a single day. It may take a couple months for the actual rankings to happen, but it’s all automated; I don’t have to worry about it or check up on it. I’m off to my next project. Meanwhile my competition is working every day, struggling to keep up with me. To them I end up looking like some kind of sleep deprived SEO juggernaut, when the reality is I spent so little time on the site I didn’t even have a chance to memorize the domain. Let’s jump to some questions….

Questions

Why Build Up? Why Not Down?

I normally don’t check my Technorati very often, but the other day I caught a very interesting one. Someone plans on writing an article similar to my SEO Empire Part 1, except teaching people how to build a money site first, then build the foundation and basement below it. It’s essentially the same concept except you can jump straight to money sites. Here’s the problem with that. Look at the chart above. That setup allows me to accomplish the requirements for any new site in a single day. If you build money sites and then build downward, it would take several days to a couple weeks worth of work to get what you need with every site launch. It kind of defeats the purpose and isn’t practical in the long run. The second point is the speed at which you’ll be adding new money sites. There’s no point in building a foundation or basement site for the sole reason of promoting a single money site, so for every one you build you have to look back through your other money sites and add them into your new foundation and basement site as well. Management of foundation and basement sites is easy; management of money sites becomes increasingly difficult the larger your empire grows. I’m still looking forward to reading the post though.

How Do You Manage Your Foundation and Basement Sites?

I was fortunate enough to preplan my empire through lots of premeditation and previous experience, so I developed a centralized system that allows for easy management. Every page I create with my database sites, hosted blackhat, spammed blackhat, etc. gets a database call to a single table that holds all the pages of every site I make, along with a list of links for each one. When I create a new database site I loop through all the pages, give each one a number (primary key id), and make a small note of its keywords. For instance, in my example I said a certain page would have the keywords Remax Real Estate In Portland Oregon; the keywords inserted into the link inventory database would be remax,real,estate,portland,oregon. The same goes for my cycle sites: the .htaccess points to a script that gives the error message -Keywords- Sorry this site has been shut down due to terms of service violation -sincerely DoucheHosting Inc. -ads- (hehe, still gotta make the money). The script pulls from the cycle site database, which also has its keywords, and if the cycle site gets assigned a money site it does a 301 permanent redirect to that site instead of displaying the error page. So whenever I create a new money site, all I have to do is enter my dashboard and put in its keywords, URL, anchor text, and how many links I want. It comes up with a list of all the pages/cycle sites that match the keywords, along with a count of the total links available. I select as many or as few as I want; then, if I need more, I can go through and pick less relevant ones, or specify alternate keywords that will work as well, such as land, homes, garden, pets or some shit along those lines. Building thousands of links becomes a 5 minute job.
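The keyword-matching core of a link inventory like that can be sketched in a few lines. This is an illustrative sketch, not Eli’s actual schema: the table is faked as an in-memory list, and the page ids, keywords, and link counts are made up:

```python
# Stand-in for the single pages table: every foundation/basement page
# gets an id, its keyword set, and the number of link slots it offers.
link_inventory = [
    {"id": 1, "keywords": {"remax", "real", "estate", "portland", "oregon"}, "links": 3},
    {"id": 2, "keywords": {"homes", "garden", "portland"}, "links": 5},
    {"id": 3, "keywords": {"poker", "casino"}, "links": 2},
]

def find_link_pages(query_keywords, inventory=link_inventory):
    """Return pages sharing at least one keyword with the new money site,
    best matches first, plus the total number of links available."""
    query = set(query_keywords)
    matches = [p for p in inventory if p["keywords"] & query]
    matches.sort(key=lambda p: len(p["keywords"] & query), reverse=True)
    total = sum(p["links"] for p in matches)
    return matches, total

# New money site targeting "real estate portland"
pages, total = find_link_pages(["portland", "real", "estate"])
print([p["id"] for p in pages], total)  # → [1, 2] 8
```

In the real system the inventory would live in a SQL table and the dashboard would turn the selected rows into link placements (or 301s for cycle sites), but the relevance lookup is this simple.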

How Long Did It Take You To Build Your SEO Empire?

It took me about a year to get it up to the level I’m happy with. It wasn’t a big deal, because during that entire year and process I was making money. The foundation and basements have been very successful for me and continually brought in bigger and bigger checks every month (rule #2). So it was not some rough time I had to bear through. I’m also always building on my SEO Empire. I treat it, much like it is, as a 9-5 job. If you already have a 9-5, make the SEO Empire your 7pm-9pm job. Hell, your 1am-5am job. It doesn’t matter. The reality is, you can build a respectable and workable foundation and basement within only one month. Then as you get more time and more money, just keep building it outward (rule #3). It’s an ongoing process. If you’re new to it, try building one of each type of site mentioned. You can have that done by the end of the week. Once you’ve got the initial ones built, by recycling much of the code you can have 9 more of each type built by the end of next week. It’s all very easy and quick once you sit down and actually do it. I’m not writing this shit out of vanity, ya know; I want you to actually accomplish it.

How Long Should I Spend On My Foundation Sites?

The best advice I can give you: make your first one absolutely beautiful. Be proud of it, get a few opinions from your IM friends (rule #4). As you build more you’ll quit caring and you’ll get sloppier and sloppier. You’ll see the same results, but that template pack you started using will start wearing thin and you’ll start going for the easier-to-change templates. You’ll still always have that prideful one to show your affiliate managers, and the volume to match. Most importantly, I say this because the more time and thought you invest in your first site, the fewer problems you’ll run into on your second.

Help! I Don’t Have Access To SQUIRT! And You Just Mentioned It In That Little Chart Thingy.

I did a post on How To Build Your Own SQUIRT. I didn’t post that immediately after the launch to brag or to deter people from signing up. I want people to build it; if you read Blue Hat regularly it shouldn’t be some big secret how it works. I laid it out for you in plain English, there it is. A lot of people have the skills but not the financial resources. It’s not their fault, and we were all there at one point. If that’s you, read the post and go for it, buddy. If you run into something you’re unsure of, email me and I’ll see if I can help you through it (rule #4). Just don’t let a little setback deter you from your goals.

Adsense Is A Possible Footprint. Everything Is Ruined Before I Even Made My First Site! My Empire Is Crumbling!

*stares blankly at Kenneth*………*hands him a Problem Solving For Dummies book and a red pill*

You’ll be alright, buddy.


SEO Empire – Part 1

Podcast Versions:

Printer Friendly: Part 1

This is exactly how I make money online…

This blog has a lot of great tips and techniques to help the average webmaster break beyond their barriers. However, they are nothing more than skillsets, and skillsets are worthless without direction. For that reason, before I’m done with the missions I have for this hobby (blog) I want to lay down 4 cornerstone strategy posts. This is the second, behind my SERP Domination post, which taught the power behind numbers. As mentioned in my Log Link Matching article, every technique on this blog interconnects like a puzzle and fits together perfectly to form an ultimate SEO strategy. This is that strategy. In that spirit, every post before this one builds up to this post and every post after is a follow-up to it. By now you hopefully have had time to browse through the archives and digest all the past posts. That will give you the necessary skillset and, more importantly, the mindset to put all this into practice. I’ve always preached that there are no rules in SEO, only loosely enforced guidelines. So it’s time to take the Jalape

Updates Coming

More updates are coming soon. I’ve got some great guest posts coming up and I’m currently finishing a HUGE post that I’ve been working on for the last month. You bet your ass it’s big, but I selfishly think it’s the most important must-read post in the entire industry, so I want to do a good job on it. In the meantime, read this great post over on Jon Waraas’ blog written by our very own guest poster Mark.

By the by! If anyone has a good radio voice and would like to donate it to making my upcoming big post an mp3 podcast, please email me: Eli at BlueHatSEO dot Com. I’ll send you the post a couple days early. It’s very long and should be covered at least twice to digest all the information, and I’d hate to force everyone into only having the option of reading it.

Miss Ya’ll!
