Dynamic bot identification with PHP

I've previously mentioned that I plan on posting some of my SEO software here, and one of those is a cloaker that I'm still putting the finishing touches on. One thing I'll be adding to it is this snippet I came across on the syndk8 forums. It does a reverse DNS lookup on the visitor's IP and checks the resulting hostname against a list of search engine name fragments, so you can identify bots or search engine staff dynamically and keep your own IP lists up to date.

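function isBotDNS($ip)
{
    // Read one hostname fragment per line, skipping empty lines.
    $data = file("./bad_hosts.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($data === false) {
        return false;
    }

    // gethostbyaddr() returns the IP unchanged if the lookup fails, false on malformed input.
    $revDNS = gethostbyaddr($ip);
    if ($revDNS === false || $revDNS === $ip) {
        return false;
    }

    foreach ($data as $fragment) {
        $fragment = trim($fragment);
        // Case-insensitive substring match; ignore blank entries.
        if ($fragment !== '' && stristr($revDNS, $fragment) !== false) {
            return true;
        }
    }
    return false;
}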

bad_hosts.txt

google
yahoo
altavista
akamai
inktomi
.bot
bot.
crawl.
.crawl
.live.
microsoft
msn.
.ask.
ask.com
yahoo.net
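To show what "keep your own IP lists up to date" could look like in practice, here's a rough sketch of one way to do it. The names recordBotIp(), hostMatchesList(), and the bot_ips.txt file are my own placeholders, not part of the original snippet, and the optional $resolver argument is only there so the lookup can be swapped out for testing. Treat it as a starting point rather than a finished implementation.

```php
<?php
// Hypothetical helper: same case-insensitive substring check as isBotDNS(),
// but against a list that is passed in instead of read from disk.
function hostMatchesList(string $host, array $fragments): bool
{
    foreach ($fragments as $fragment) {
        $fragment = trim($fragment);
        if ($fragment !== '' && stripos($host, $fragment) !== false) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: if the reverse DNS of $ip matches the list, make sure
// the IP is in $ipFile (appending it if it is missing) and return true.
// $resolver defaults to gethostbyaddr(); pass a closure to fake lookups.
function recordBotIp(string $ip, array $fragments, string $ipFile, ?callable $resolver = null): bool
{
    $resolver = $resolver ?? 'gethostbyaddr';
    $host = $resolver($ip);

    // gethostbyaddr() returns the IP unchanged on failure, false on malformed input.
    if ($host === false || $host === $ip || !hostMatchesList($host, $fragments)) {
        return false;
    }

    // Load the IPs already recorded, treating a missing or unreadable file as empty.
    $known = is_file($ipFile)
        ? (file($ipFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) ?: [])
        : [];

    if (!in_array($ip, $known, true)) {
        file_put_contents($ipFile, $ip . PHP_EOL, FILE_APPEND | LOCK_EX);
    }
    return true;
}
```

On a live page you might call something like recordBotIp($_SERVER['REMOTE_ADDR'], file('./bad_hosts.txt'), './bot_ips.txt') and branch on the result. Reverse lookups can be slow, so checking the saved IP list first and only falling back to DNS for unknown visitors is probably worth it.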

As far as the cloaker goes, I'm not promoting it or making a big deal of it because I'm sure it will have a bug or two in the beginning, and since it will be free I don't have a whole lot to gain by promoting it. It may not be the best out there, but it will be free and open source, so hopefully others will help improve it since school is keeping me busy. Some of the features include templates that are 100% customizable (including the use of static keywords), custom keyword and link densities, several caching methods (cache by individual SE, show all SEs the same content, or cache for all visitors including SEs), customizable links (you have complete control over how your links will look), and htaccess support. When you cache for all visitors, the cloaker really acts as a content generator. With the templating system I made, you can put in a little extra effort and the cloaker will generate a virtually unlimited number of readable, unique pages, so there really isn't a need for IP delivery if you go this route.

These are just the things I can remember off the top of my head; I haven't worked on the cloaker in a couple of weeks. BTW, the cloaker was actually completed about 2 years ago and I've used it with great results on several sites. What I've been doing is going through it line by line, rewriting everything using OOP so it's easier to modify and extend. The original version was condensed into a couple of files back when I thought I might sell it. With the rewrite, the code is much cleaner and will be easier for people to modify.

I'll go into more detail about the cloaker in the future. :)

Hold the ads when launching a new site

Some time ago someone mentioned that when you have a brand new website to promote, you should leave all advertising off of it for a while until you begin to establish solid traffic. I think it was Shoemoney who said this, but I'm not 100% sure. Anyway, when I read that, it gave me a little reassurance that my instincts were right, because I've been doing that all along. When some people have a new website, the first thing they do is get all their ads, like AdSense, to look just right. The only problem is, when you start marketing it, if the first thing people see is a bunch of ads, they'll figure it's another MFA (made-for-AdSense) site and not pay much attention to it.

Here's how I deal with this using PHP. First, when I'm designing the site, I go ahead and add the space for advertising or throw in the AdSense ads so everything looks how I'll want it to when the site is running full-bore. Then I use a simple PHP snippet to hide the ads until I'm ready to show them.

Most sites I have use a config file for database connection stuff. If I don't have a config file, I'll use any file that gets included with every page or make one just for this. Inside that file, I'll put a simple variable like:

$show_ads = 0;

Then, wherever I have ads, I'll use a simple check to see whether they should be displayed or not, like this:

if($show_ads){
//show ads here
}

Obviously you would replace "//show ads here" with your ad code or an include or however you show ads. By doing it this way, I can do site design all at once, getting it to look just right, then hide the ads while I market the site. A few weeks or so down the road, I just switch the $show_ads variable to 1 and voila, advertising suddenly appears throughout the entire site.
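If you have ads in a lot of templates, one variation is to wrap the check in a small helper so each placement is a single line. This is just a sketch of that idea; the ad_slot() function name and the sample markup are made up for illustration, and $show_ads is assumed to come from the config file described above.

```php
<?php
// Normally set in the config file that every page includes.
$show_ads = 0;

// Hypothetical helper: returns the ad markup when ads are switched on,
// otherwise an empty string so the page renders without them.
function ad_slot(string $adCode, $showAds): string
{
    return $showAds ? $adCode : '';
}

// In a template: prints nothing while $show_ads is 0.
echo ad_slot('<div class="ad">your ad code here</div>', $show_ads);
```

Passing $show_ads in as an argument avoids a global inside the function, and flipping the one variable to 1 still turns on every slot at once.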