
AI Bot Tracking

Detect and monitor 60+ AI bots across 5 categories with an auto-updating bot database.

Last updated Feb 21, 2026

Overview

CitedPro detects and logs visits from 60+ AI bots across five categories: AI Agent, AI Assistant, AI Data Scraper, AI Search Crawler, and Undocumented AI Agent. Every time an AI system crawls one of your pages, CitedPro records the visit so you can understand which AI platforms are consuming your content and how often.

All tracking data is stored locally in your WordPress database. No information is sent to external analytics services.

How detection works

CitedPro identifies AI bots using server-side user-agent string matching. When a request comes in, the plugin checks the HTTP User-Agent header against its database of known AI bot signatures. It uses case-insensitive substring matching (PHP's stripos, not regex) to keep the performance impact minimal.
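The matching step can be sketched as follows. The plugin itself runs in PHP; this Python sketch only illustrates the same case-insensitive substring technique. The signature table, its layout, and the function name detect_bot are assumptions made for illustration, not CitedPro's actual data or API.

```python
from typing import Optional

# Hypothetical signature table: (substring to look for, bot name, category).
# The real database is loaded by the plugin; these entries are illustrative.
BOT_SIGNATURES = [
    ("GPTBot", "GPTBot", "AI Search Crawler"),
    ("ClaudeBot", "ClaudeBot", "AI Search Crawler"),
    ("CCBot", "CCBot", "AI Data Scraper"),
]


def detect_bot(user_agent: str) -> Optional[tuple[str, str]]:
    """Return (bot_name, category) for the first signature found, else None."""
    haystack = user_agent.lower()
    for needle, name, category in BOT_SIGNATURES:
        # Case-insensitive substring test, analogous to PHP's stripos() !== false.
        if needle.lower() in haystack:
            return name, category
    return None
```

A plain substring scan like this avoids compiling and running regular expressions on every request, which is the trade-off the paragraph above describes.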

To report unique sessions accurately, CitedPro generates a session hash: an MD5 digest of the visitor's IP address, user agent, and the current hour. If the same bot visits multiple pages within the same hour, each page visit is still recorded, but all of those visits share one session hash, so they count as a single session.
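A minimal sketch of such a session hash, in Python for illustration. The separator character and the exact hour format are assumptions; the documentation only states that IP, user agent, and hour are combined and hashed with MD5.

```python
import hashlib
from datetime import datetime, timezone


def session_hash(ip: str, user_agent: str, when: datetime) -> str:
    """MD5 over IP + user agent + hour bucket.

    Requests from the same IP and user agent within the same clock hour
    produce the same digest, which is what groups them into one session.
    """
    hour_bucket = when.strftime("%Y-%m-%d %H")  # assumed hour format
    raw = f"{ip}|{user_agent}|{hour_bucket}"    # assumed separator
    return hashlib.md5(raw.encode("utf-8")).hexdigest()
```

Because the hour is part of the input, a bot that keeps crawling past the top of the hour starts a new session even if its IP and user agent are unchanged.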

Auto-updating bot database

The bot database is automatically updated from api.srworks.co and cached locally for 24 hours. If the API is temporarily unavailable, CitedPro falls back to a bundled bot list that ships with the plugin. This ensures detection continues working even if the API is unreachable.

New bots are added to the API database as they are discovered, so your site stays current without needing a plugin update.
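The cache-then-fallback behavior can be sketched like this. The class name, the injected fetch callable, and the contents of the bundled list are hypothetical; only the 24-hour lifetime and the fallback to a bundled list come from the description above.

```python
import time
from typing import Callable, Optional

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24 hours, as described above

# Hypothetical stand-in for the list that ships with the plugin.
BUNDLED_BOTS = ["GPTBot", "ClaudeBot", "CCBot"]


class BotListCache:
    def __init__(self, fetch: Callable[[], list[str]],
                 clock: Callable[[], float] = time.time):
        self._fetch = fetch    # hypothetical callable that queries the remote API
        self._clock = clock    # injectable so the expiry logic can be exercised
        self._bots: Optional[list[str]] = None
        self._fetched_at = 0.0

    def get(self) -> list[str]:
        now = self._clock()
        if self._bots is not None and now - self._fetched_at < CACHE_TTL_SECONDS:
            return self._bots  # cached copy is still fresh
        try:
            self._bots = self._fetch()
            self._fetched_at = now
            return self._bots
        except Exception:
            # API unreachable: fall back to the bundled list.
            return list(BUNDLED_BOTS)
```

Passing the fetch function in, rather than hard-coding an HTTP call, keeps the sketch independent of any particular endpoint or response format.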

Tracked AI bots

CitedPro recognizes bots from every major AI company. Here are some of the most common ones across each category:

AI Search Crawlers

  • OpenAI: GPTBot, ChatGPT-User, OAI-SearchBot
  • Anthropic: ClaudeBot, Claude-Web, Claude-SearchBot
  • Perplexity: PerplexityBot
  • Google: Google-Extended, Gemini-Deep-Research
  • xAI: GrokBot, Grok-DeepSearch
  • DuckDuckGo: DuckAssistBot

AI Agents

  • Amazon: Amazonbot
  • Apple: Applebot-Extended
  • Meta: Meta-ExternalAgent
  • Microsoft: Copilot

AI Data Scrapers

  • ByteDance (TikTok): Bytespider
  • Common Crawl: CCBot
  • Cohere: cohere-ai
  • Diffbot: Diffbot

AI Assistants and undocumented agents

CitedPro also tracks AI assistant bots and undocumented agents that have been observed crawling the web but are not yet officially documented by their operators.

What gets recorded

For each AI bot visit, CitedPro stores a row in the wp_cited_bot_visits database table with the following fields:

Field         Description
bot_name      The identified bot name (e.g., GPTBot, ClaudeBot)
bot_category  AI Agent, AI Assistant, AI Data Scraper, AI Search Crawler, or Undocumented AI Agent
bot_id        Unique identifier for the bot in the database
user_agent    The full HTTP User-Agent string
page_url      The page that was visited
ip_address    The bot's IP address
visit_time    Timestamp of the visit
session_id    MD5 hash of IP + UA + hour for deduplication
referrer      HTTP referrer if present
is_human      0 for bots, 1 for LLM referral visits
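To make the roles of session_id and is_human concrete, here is a small sketch that models a subset of these fields and derives page visits and unique sessions per bot. The BotVisit class and the summarize function are hypothetical illustrations, not part of the plugin.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BotVisit:
    bot_name: str
    page_url: str
    session_id: str
    is_human: int  # 0 for bots, 1 for LLM referral visits


def summarize(visits: list[BotVisit]) -> dict[str, tuple[int, int]]:
    """Map bot_name -> (page visits, unique sessions), skipping LLM referral rows."""
    hits: dict[str, int] = defaultdict(int)
    sessions: dict[str, set[str]] = defaultdict(set)
    for v in visits:
        if v.is_human:
            continue  # referral visits from people are not bot crawls
        hits[v.bot_name] += 1
        sessions[v.bot_name].add(v.session_id)
    return {name: (hits[name], len(sessions[name])) for name in hits}
```

Every row counts as a page visit, while rows that share a session_id collapse into one session.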

Viewing bot activity

Bot visit data surfaces in two places:

  • Dashboard: Shows total AI bot visits, the top bot, category breakdown, and a recent activity feed at a glance.
  • Bots tab: Lists every detected bot with its hit count and category. Click any bot to view its recent visit details including pages visited, timestamps, and full user agent strings.

For trend charts, date range filtering, and CSV exports, see the Analytics documentation.

Data retention

Tracking data is automatically cleaned up by a daily cron job based on your retention setting:

Setting         Range            Default
Data retention  30 to 365 days   365 days

To adjust your retention period:

  1. Go to CitedPro → Settings
  2. Change the Data Retention value
  3. Save changes

Records older than the configured retention period are permanently deleted by the daily cleanup cron. Reducing the retention period will delete older records on the next cron run.
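The cutoff used by a cleanup job of this kind can be sketched as below. Clamping out-of-range values to the 30 to 365 day range is an assumption, and retention_cutoff is a hypothetical helper name; how the plugin validates the setting is not documented here.

```python
from datetime import datetime, timedelta, timezone

# Range and default taken from the retention setting described above.
MIN_DAYS, MAX_DAYS, DEFAULT_DAYS = 30, 365, 365


def retention_cutoff(now: datetime, days: int = DEFAULT_DAYS) -> datetime:
    """Visits with a visit_time earlier than the returned moment are eligible for deletion."""
    clamped = max(MIN_DAYS, min(MAX_DAYS, days))  # assumed: out-of-range values are clamped
    return now - timedelta(days=clamped)
```

For example, with a 30-day setting on Feb 21, 2026, the cutoff falls on Jan 22, 2026, so lowering the setting from 365 to 30 days makes almost eleven months of records eligible for deletion at the next run.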

Understanding your data

What high bot traffic means

Frequent visits from AI crawlers indicate your content is being indexed for AI search and recommendation systems. This is generally positive. It means AI assistants can potentially cite and recommend your business when users ask relevant questions.

What low bot traffic means

If you are seeing few AI bot visits:

  • Your site may be new or have limited backlinks
  • Check that your robots.txt is not blocking AI bots you want to allow
  • Verify your llms.txt and AI discovery files are accessible
  • AI bots tend to prioritize higher-authority sites first
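One way to check the robots.txt point is to parse the file and ask which bots it disallows. This sketch uses Python's standard urllib.robotparser and is independent of the plugin; the helper name blocked_bots and the sample rules are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser


def blocked_bots(robots_txt: str, bots: list[str], url: str = "/") -> list[str]:
    """Return the bots from `bots` that this robots.txt disallows for `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, url)]


# Sample rules: GPTBot is blocked site-wide, everything else is allowed.
sample = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"
print(blocked_bots(sample, ["GPTBot", "ClaudeBot"]))
```

Paste the contents of your own robots.txt in place of the sample rules and list the bots you want to allow. Any name that comes back is being blocked for the given path.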

Tip

Complete your business setup and generate your AI discovery files to increase the likelihood of AI bots finding and crawling your content.

Performance impact

Bot tracking adds minimal overhead to your site:

  • Detection runs early in the request lifecycle using fast string matching
  • Only bot visits are logged, not regular human traffic
  • Session deduplication prevents excessive database writes
  • The daily cleanup cron keeps the database table lean

If you want to disable tracking entirely, toggle it off in CitedPro → Settings.