Bot Control

Manage and block non-AI scrapers and SEO crawlers.

Last updated Feb 3, 2025

Overview

Bot Control helps you manage non-AI bots that visit your site, such as SEO crawlers, content scrapers, archivers, and other automated tools. The Analytics tab focuses on AI bots; Bot Control handles everything else.

Bot Categories

CitedPro includes a database of 280+ known bots, categorized as:

Search Engine Bots

Legitimate search engine crawlers that index your content:

  • Googlebot, Bingbot, DuckDuckBot
  • Yandex, Baidu, Sogou

SEO Crawlers

Tools used for SEO analysis and competitive research:

  • Ahrefs, SEMrush, Moz
  • Majestic, Screaming Frog

Content Scrapers

Bots that copy or aggregate content:

  • Feed fetchers, content aggregators
  • Price comparison bots

Archivers

Services that preserve web content:

  • Internet Archive (Wayback Machine)
  • Archive.today
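
As a rough illustration of how a categorized bot list can be applied, the sketch below maps a User-Agent string to one of the categories above. The token table and the `categorize` function are hypothetical and cover only a handful of well-known crawler tokens; they are not CitedPro's actual database or matching logic.

```python
from typing import Optional

# Hypothetical sample of User-Agent tokens per category
# (an illustration only, not CitedPro's real database).
BOT_CATEGORIES = {
    "search_engine": ["Googlebot", "bingbot", "DuckDuckBot", "YandexBot", "Baiduspider"],
    "seo_crawler": ["AhrefsBot", "SemrushBot", "MJ12bot", "Screaming Frog SEO Spider"],
    "archiver": ["archive.org_bot", "ia_archiver"],
}


def categorize(user_agent: str) -> Optional[str]:
    """Return the first category whose token appears in the User-Agent, else None."""
    ua = user_agent.lower()
    for category, tokens in BOT_CATEGORIES.items():
        for token in tokens:
            if token.lower() in ua:
                return category
    return None


print(categorize("Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)"))
print(categorize("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/124.0"))
```

The first call reports an SEO crawler; the second is an ordinary browser and matches no category.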

Viewing Bot Activity

  1. Go to CitedPro → Bot Control
  2. See a list of non-AI bots that have visited your site
  3. View visit counts and last seen timestamps
  4. Filter by category

Blocking Bots

To block a bot from crawling your site:

  1. Find the bot in the Bot Control list
  2. Click the Block button
  3. The bot is added to your blocked list

Blocking works at two levels:

  • robots.txt: Adds a Disallow directive (polite bots will respect this)
  • PHP-level: Returns a 403 Forbidden response (enforced on the server, so it also stops bots that ignore robots.txt, as long as their User-Agent matches)
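
The robots.txt level can be pictured as one rule group per blocked bot. The sketch below renders such groups in Python; the `render_robots_rules` helper and the choice of `Disallow: /` for the whole site are assumptions made for illustration, and the exact lines CitedPro writes may differ.

```python
def render_robots_rules(blocked_bots: list[str]) -> str:
    """Build one 'User-agent / Disallow' group per blocked bot token."""
    groups = []
    for bot in blocked_bots:
        # "Disallow: /" asks the named crawler to stay away from every path.
        groups.append(f"User-agent: {bot}\nDisallow: /")
    # Blank lines separate groups, as is conventional in robots.txt.
    return "\n\n".join(groups) + "\n" if groups else ""


print(render_robots_rules(["AhrefsBot", "MJ12bot"]))
```

Because robots.txt is advisory, a crawler is free to ignore these rules, which is why the server-side 403 response exists as the second level.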

Caution

Blocking search engine bots (Googlebot, Bingbot) will prevent your site from appearing in search results. Only block bots you're certain you want to exclude.

Custom Bot Blocking

To block a bot not in the database:

  1. Go to CitedPro → Bot Control
  2. Scroll to Custom Blocked Bots
  3. Enter the user agent string (or partial match)
  4. Click Add

Custom blocks match against the User-Agent header. For example:

  • BadBot blocks any user agent containing "BadBot", regardless of version
  • SomeCompany/1.0 blocks only user agents containing that exact version string
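
A minimal sketch of the partial-match behavior described above, written in Python for illustration rather than in the plugin's own PHP. The function names are hypothetical, and case-insensitive comparison is an assumption; the documentation only states that a pattern matches when the User-Agent contains it.

```python
def is_blocked(user_agent: str, patterns: list[str]) -> bool:
    """True if any custom pattern occurs anywhere in the User-Agent header."""
    ua = user_agent.lower()  # assumption: matching ignores case
    # Empty patterns are skipped so a blank entry cannot block everyone.
    return any(p.lower() in ua for p in patterns if p)


def status_for(user_agent: str, patterns: list[str]) -> int:
    """Return 403 for blocked user agents, 200 otherwise."""
    return 403 if is_blocked(user_agent, patterns) else 200


patterns = ["BadBot", "SomeCompany/1.0"]
print(status_for("Mozilla/5.0 (compatible; BadBot/2.3)", patterns))    # 403
print(status_for("SomeCompany/1.0 (+https://example.com)", patterns))  # 403
print(status_for("SomeCompany/2.0", patterns))                         # 200
```

Under plain substring matching, a pattern such as SomeCompany/1.0 would also match SomeCompany/1.05, so a longer or more specific pattern is safer when only one version should be blocked.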

Unblocking Bots

  1. Find the bot in your blocked list
  2. Click Unblock
  3. The bot is removed from the block list
  4. robots.txt is automatically updated to remove the bot's Disallow directive

Best Practices

  • Don't block search engines: Googlebot, Bingbot, etc. are essential for SEO
  • Monitor before blocking: Watch bot activity to understand impact
  • Block aggressive scrapers: Bots making too many requests or copying content
  • Allow archivers: Internet Archive preserves your content for posterity
  • Review periodically: New bots appear regularly

Bot Database Updates

The bot database is bundled with the plugin and updated with each plugin release. The database is sourced from Arcjet's open-source bot list (MIT licensed).