Bots: How to block bots that don’t respect your robots.txt file
In this article, we will show you how to block bots that don’t respect your robots.txt file.
For this article, you will need the PHP script mentioned in our other article here.
Once you have the block.php file in place, edit or create a robots.txt file in the public_html directory.
Enter the following (if your file already has a User-agent: * group, add just the Disallow lines to it)…
User-agent: *
Disallow: block.php
Disallow: /block.php
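To see what the rule means for a well-behaved crawler, here is a minimal sketch using Python's standard urllib.robotparser. It assumes the Disallow line sits under a User-agent: * group, and "ExampleBot" and the index.html page are made-up names for illustration.

```python
from urllib.robotparser import RobotFileParser

# Rules as a compliant crawler would read them from robots.txt.
# Assumption: the Disallow line belongs to a "User-agent: *" group.
rules = [
    "User-agent: *",
    "Disallow: /block.php",
]

parser = RobotFileParser()
parser.parse(rules)

# "ExampleBot" is a hypothetical crawler name.
trap_allowed = parser.can_fetch("ExampleBot", "http://yourdomain.com/block.php")
page_allowed = parser.can_fetch("ExampleBot", "http://yourdomain.com/index.html")

print(trap_allowed)  # False: a compliant bot skips the trap
print(page_allowed)  # True: the rest of the site stays crawlable
```

A bot that honors robots.txt never requests block.php, so only bots that ignore the file can end up there.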
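Now we need to set the trap. Enter the following code at the top of each page's HTML (just after the opening body tag). The link wraps a 1×1 pixel image, so visitors are unlikely to notice it, but a crawler that ignores robots.txt will follow it…
<a title="" href="http://yourdomain.com/block.php"><img alt="" src="pixel.gif" width="1" height="1" /></a>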
Make sure you replace yourdomain.com with your actual domain.
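The trap idea can be sketched in a few lines of Python. This is only a hypothetical stand-in: the real block.php comes from the other article and may work differently. The function name, the status codes, and the sample IP addresses are assumptions made for illustration.

```python
# Hypothetical stand-in for the trap logic; not the actual block.php.
TRAP_PATH = "/block.php"

def handle_request(ip, path, blocked):
    """Return an HTTP status for a request and update the blocklist."""
    if ip in blocked:
        return 403  # this client was caught earlier
    if path == TRAP_PATH:
        # Only a client that ignored robots.txt and followed the
        # hidden link reaches this path.
        blocked.add(ip)
        return 403
    return 200

blocked_ips = set()
# Sample request log using documentation-range IP addresses.
requests_log = [
    ("203.0.113.5", "/index.html"),
    ("203.0.113.5", "/block.php"),
    ("203.0.113.5", "/index.html"),
    ("198.51.100.7", "/index.html"),
]
statuses = [handle_request(ip, path, blocked_ips) for ip, path in requests_log]
print(statuses)  # [200, 403, 403, 200]
```

The first client is served normally until it requests the trap URL, after which every request from it is refused. The second client never touches the trap and is unaffected.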
Russ March 4, 2014 at 11:23 am
Hi, I understand adding this disallow option to the robots.txt, but wouldn’t everyone be blocked by adding this title at the top of a website?
James Davey March 4, 2014 at 12:13 pm
Hi Russ,
It looks that way, but no. This code actually does nothing on the site – it is designed to hang the link out there, something that crawler bots will be unable to avoid or resist.
Russ March 4, 2014 at 12:36 pm
Gotcha. Thanks for the info!
P September 24, 2015 at 1:45 am
Nice Script :) Thanks
Stephen February 18, 2016 at 7:28 pm
wordpress uses php, where is the body tag found?
James Davey February 19, 2016 at 5:22 am
Hello Stephen,
There is no body tag in that case. But you can create a flat html file with this, and upload it to public_html, and have the same results.
Monat March 9, 2016 at 3:54 pm
If I’m adding this into Magento, which html file do I amend? When you say put a flat html file into the public_html, I know where it goes, but what does that html file look like? Does it include only the above code?
Monat March 9, 2016 at 3:58 pm
Also, the .htaccess file is invisible. Even when I upload something named different, and try to change its name to .htaccess, the server tells me I’m replacing a file that is already there, even though I can’t see it.
James Davey March 10, 2016 at 7:19 am
Hello Monat,
The preceding . in the filename indicates that this is a hidden file. You will need to specifically allow your FTP client to show hidden files (found in the settings, though the specific location varies) to see it.
James Davey March 10, 2016 at 7:18 am
Yes, this would include only this code.