How do I block a bot from spidering my website?

 
  • Issue:
    You would like to block a particular bot from crawling/indexing your site
  • Solution:
    Edit your store's Robots.txt file.

To begin:

  1. Log in to your Shift4Shop Online Store Manager
  2. Using the left-hand navigation menu, go to Marketing > SEO Tools
  3. Locate and click on the link labeled "Edit Robots.txt File"

This page will have two distinct areas. Within the top half of the page, you will see the Robots.txt section containing your store's regular robots.txt file. It should look like this:

Sitemap: http://[store-url]/sitemap.xml

# Disallow all crawlers access to certain pages.
User-agent: *
Disallow: /checkout.asp
Disallow: /add_cart.asp
Disallow: /view_cart.asp
Disallow: /error.asp
Disallow: /shipquote.asp
Disallow: /rssfeed.asp
Disallow: /mobile/

Note
If the robots.txt file does not look like the above, you may click on the "Restore Default Robots.txt" link along the bottom of the window to revert it to default.

Most bots identify themselves with a user-agent name, which can be used in the robots.txt file to block them from crawling the site. If you know the name of the bot in question, add the following to the robots.txt file:

# Block [name] from crawling site

User-agent: [name]
Disallow: /

  1. Replace [name] above with the user-agent name of the offending bot.
  2. Click "Save" at the top right to commit your changes.
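
After saving, it can be useful to confirm that the new rule does what you expect. The following is a minimal sketch that uses Python's standard urllib.robotparser module to test a robots.txt file against a user-agent name. The bot name "ExampleBot", the domain example.com, and the helper is_blocked are placeholders chosen for illustration and do not come from Shift4Shop; the rules shown are a shortened version of the default file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a shortened default file plus a block rule
# for a made-up bot named "ExampleBot".
ROBOTS_TXT = """\
Sitemap: http://example.com/sitemap.xml

# Disallow all crawlers access to certain pages.
User-agent: *
Disallow: /checkout.asp
Disallow: /view_cart.asp

# Block ExampleBot from crawling site
User-agent: ExampleBot
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt disallow user_agent from url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    base = "http://example.com"
    # The named bot is blocked from every page.
    print(is_blocked(ROBOTS_TXT, "ExampleBot", base + "/product.asp"))
    # Other bots may still crawl ordinary pages...
    print(is_blocked(ROBOTS_TXT, "OtherBot", base + "/product.asp"))
    # ...but not the pages disallowed for all crawlers.
    print(is_blocked(ROBOTS_TXT, "OtherBot", base + "/checkout.asp"))
```

This only shows how a well-behaved crawler would interpret the rules. It says nothing about whether a particular bot actually obeys them.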

Additional Information
Note that robots.txt rules are only a guideline: compliance is voluntary, and bots are not required to follow them. If you are blocking a bot because it is abusive or because it crawls the site too heavily (and uses up bandwidth), it is unlikely to honor robots.txt rules in the first place. In that case, a better solution is to use IP blocking to prevent the bot's originating IP address from accessing the site at all.
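
To illustrate the idea behind IP blocking, here is a rough sketch of how a block list of addresses and ranges can be matched against a visitor's IP, using Python's standard ipaddress module. The addresses below come from ranges reserved for documentation, and the names BLOCKED_NETWORKS and is_ip_blocked are hypothetical. The actual blocking would be configured in your store or hosting settings, not in a script like this.

```python
import ipaddress

# Hypothetical block list: one whole range and one single address.
# Both come from IP ranges reserved for documentation examples.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def is_ip_blocked(ip: str, networks=BLOCKED_NETWORKS) -> bool:
    """Return True if ip falls inside any network on the block list."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


if __name__ == "__main__":
    print(is_ip_blocked("203.0.113.45"))  # inside the blocked /24 range
    print(is_ip_blocked("198.51.100.8"))  # next to, but not, the blocked address
```

Blocking a range instead of a single address helps when a bot rotates through several nearby IPs, at the cost of possibly blocking legitimate visitors in the same range.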


