Dynamic bot identification with PHP

I've previously mentioned that I plan on posting some of my SEO software here, and one of those programs is a cloaker that I'm still putting the finishing touches on. One thing I'll be adding to it is this snippet I came across on the syndk8 forums. It does a reverse DNS lookup on the visitor's IP and checks the resulting hostname against a list of patterns, so you can identify bots or search engine (SE) people dynamically and keep your own IP lists up to date.

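function isBotDNS($ip)
{
    // One substring pattern per line; blank lines are skipped.
    $data = file("./bad_hosts.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($data === false) {
        return false; // pattern file missing or unreadable
    }

    // gethostbyaddr() returns the IP unchanged when the lookup fails
    // and false on malformed input.
    $revDNS = gethostbyaddr($ip);
    if ($revDNS === false) {
        return false;
    }

    foreach ($data as $pattern) {
        $pattern = trim($pattern);
        // Case-insensitive substring match against the reverse DNS name.
        if ($pattern !== '' && stristr($revDNS, $pattern) !== false) {
            return true;
        }
    }

    return false;
}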
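For the "keep your own IP lists up to date" part, here is a minimal sketch, assuming the list is a plain text file with one IP per line. The function name rememberBotIp and the file name bot_ips.txt are made up for illustration; they are not part of the original snippet.

```php
<?php
// Hypothetical helper: append an IP to a plain-text list (one IP per line)
// unless it is already there. Returns true only if the IP was newly added.
function rememberBotIp(string $ip, string $listFile): bool
{
    $known = is_file($listFile)
        ? file($listFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES)
        : [];
    if ($known === false) {
        $known = []; // unreadable list: treat it as empty
    }

    if (in_array($ip, $known, true)) {
        return false; // already on the list
    }

    // LOCK_EX keeps concurrent requests from interleaving their writes.
    return file_put_contents($listFile, $ip . PHP_EOL, FILE_APPEND | LOCK_EX) !== false;
}

// Usage, assuming isBotDNS() from above is loaded:
// if (isBotDNS($_SERVER['REMOTE_ADDR'])) {
//     rememberBotIp($_SERVER['REMOTE_ADDR'], './bot_ips.txt');
// }
```

Since reverse DNS lookups can be slow, it probably makes sense to check the saved IP list first and only fall back to isBotDNS() for IPs you haven't seen yet.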

bad_hosts.txt

google
yahoo
altavista
akamai
inktomi
.bot
bot.
crawl.
.crawl
.live.
microsoft
msn.
.ask.
ask.com
yahoo.net
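To see how these patterns behave, here is a small sketch that runs the same case-insensitive substring check against a few hostnames without doing any DNS lookups. The helper name hostMatchesAny and the sample hostnames are illustrative assumptions (the first is written in the style of a Googlebot reverse DNS name), and only a subset of the list is used.

```php
<?php
// Hypothetical helper: true if $host contains any pattern, case-insensitively.
function hostMatchesAny(string $host, array $patterns): bool
{
    foreach ($patterns as $pattern) {
        // Skip empty patterns so a blank line can never match everything.
        if ($pattern !== '' && stripos($host, $pattern) !== false) {
            return true;
        }
    }
    return false;
}

// A subset of bad_hosts.txt.
$patterns = ['google', 'yahoo', '.bot', 'bot.', 'crawl.', '.crawl', 'msn.'];

// Contains "google" and "bot.", so it is flagged.
var_dump(hostMatchesAny('crawl-66-249-66-1.googlebot.com', $patterns));

// An ordinary ISP-style hostname: no pattern matches.
var_dump(hostMatchesAny('dsl-203-0-113-7.example-isp.net', $patterns));

// "abbot." contains "bot.", so this is flagged too: a false positive.
var_dump(hostMatchesAny('mail.abbot.example', $patterns));
```

The last case shows the trade-off with short substring patterns like "bot.": they catch a lot of crawlers, but they can also flag unrelated hosts, so it is worth reviewing what ends up matching.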

As far as the cloaker goes, I'm not promoting it or making a big deal out of it because I'm sure it will have a bug or two in the beginning, plus it will be free so I don't have a whole lot to gain by promoting it. It may not be the best out there, but it will be free and open source, so hopefully others will help improve it since school is keeping me busy. Some of the features include templates that are 100% customizable (including the use of static keywords), custom keyword and link densities, several caching methods (cache by individual SE, show all SEs the same content, or cache for all visitors including SEs), customizable links (you have complete control over how your links will look), and htaccess support. When I say cache for all visitors, I mean the cloaker really acts as a content generator. With the templating system I made, you can put in a little extra effort and the cloaker will make a virtually unlimited number of readable, unique pages, so there really isn't a need for IP delivery if you go this route.

These are just the things I can remember off the top of my head; I haven't worked on the cloaker in a couple of weeks. BTW, the cloaker was actually completed about 2 years ago and I've used it with great results on several sites. What I've been doing is going through it line by line, rewriting everything using OOP so it's easier to modify and extend. The original version was condensed into a couple of files back when I thought I might sell it. Now that I've been rewriting it, the code is much cleaner and will be easier for people to modify.

I'll go into more detail about the cloaker in the future. :)


Comments

One Response to “Dynamic bot identification with PHP”

  1. SlightlyShadySEO on November 22nd, 2007 1:06 am

    You’re welcome ;-)

    heh. I ran into this site randomly. It was a bit odd to see my code staring me in the face.
