AI Bot Tracking
Detect and monitor 60+ AI bots across 5 categories with an auto-updating bot database.
Last updated Feb 21, 2026
Overview
CitedPro detects and logs visits from 60+ AI bots across five categories: AI Agent, AI Assistant, AI Data Scraper, AI Search Crawler, and Undocumented AI Agent. Every time an AI system crawls one of your pages, CitedPro records the visit so you can understand which AI platforms are consuming your content and how often.
All tracking data is stored locally in your WordPress database. No information is sent to external analytics services.
How detection works
CitedPro identifies AI bots using server-side user-agent string matching. When a request comes in, the plugin checks the HTTP User-Agent header against its database of known AI bot signatures using fast substring matching (stripos, not regex) for minimal performance impact.
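The matching step can be sketched as follows (illustrative Python; the plugin itself does this in PHP with stripos(), and the signature list here is only a small sample):

```python
# Small sample of signatures; the real database holds 60+ entries.
BOT_SIGNATURES = {
    "GPTBot": "AI Search Crawler",
    "ClaudeBot": "AI Search Crawler",
    "Bytespider": "AI Data Scraper",
}

def detect_bot(user_agent: str):
    """Case-insensitive substring match, mirroring PHP's stripos()."""
    ua = user_agent.lower()
    for name, category in BOT_SIGNATURES.items():
        if name.lower() in ua:
            return name, category
    return None  # not a known AI bot; the visit is not logged
```

A request carrying `GPTBot/1.1` anywhere in its User-Agent header matches `GPTBot` regardless of letter case; requests that match nothing are ignored by the tracker.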
To prevent duplicate counting, CitedPro generates an MD5 session hash from the visitor's IP address, user agent, and the current hour. If the same bot visits multiple pages within the same hour, each page visit is still recorded; the session hash simply ensures accurate unique-session reporting.
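The session hash can be sketched like this (the concatenation order and separator are assumptions for illustration; the source only specifies MD5 over IP, user agent, and the current hour):

```python
import hashlib
from datetime import datetime, timezone

def session_hash(ip: str, user_agent: str, now: datetime) -> str:
    # Bucket the timestamp to the hour so repeat visits within the
    # same hour hash to the same session.
    hour_bucket = now.strftime("%Y-%m-%d-%H")
    raw = f"{ip}|{user_agent}|{hour_bucket}"  # separator is illustrative
    return hashlib.md5(raw.encode()).hexdigest()
```

Two page views by the same bot at 14:05 and 14:59 share one session hash; a visit at 15:01 starts a new session.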
Auto-updating bot database
The bot database is automatically updated from api.srworks.co and cached locally for 24 hours. If the API is temporarily unavailable, CitedPro falls back to a bundled bot list that ships with the plugin. This ensures detection continues working even if the API is unreachable.
New bots are added to the API database as they are discovered, so your site stays current without needing a plugin update.
Tracked AI bots
CitedPro recognizes bots from every major AI company. Here are some of the most common bots in each category:
AI Search Crawlers
- OpenAI: GPTBot, ChatGPT-User, OAI-SearchBot
- Anthropic: ClaudeBot, Claude-Web, Claude-SearchBot
- Perplexity: PerplexityBot
- Google: Google-Extended, Gemini-Deep-Research
- xAI: GrokBot, Grok-DeepSearch
- DuckDuckGo: DuckAssistBot
AI Agents
- Amazon: Amazonbot
- Apple: Applebot-Extended
- Meta: Meta-ExternalAgent
- Microsoft: Copilot
AI Data Scrapers
- TikTok: Bytespider
- Common Crawl: CCBot
- Cohere: cohere-ai
- Diffbot: Diffbot
AI Assistants and undocumented agents
CitedPro also tracks AI assistant bots and undocumented agents: crawlers that have been observed on the web but whose operators have not yet published official documentation.
What gets recorded
For each AI bot visit, CitedPro stores a row in the wp_cited_bot_visits database table with the following fields:
| Field | Description |
|---|---|
| bot_name | The identified bot name (e.g., GPTBot, ClaudeBot) |
| bot_category | AI Agent, AI Assistant, AI Data Scraper, AI Search Crawler, or Undocumented AI Agent |
| bot_id | Unique identifier for the bot in the database |
| user_agent | The full HTTP User-Agent string |
| page_url | The page that was visited |
| ip_address | The bot's IP address |
| visit_time | Timestamp of the visit |
| session_id | MD5 hash of IP + UA + hour for deduplication |
| referrer | HTTP referrer if present |
| is_human | 0 for bots, 1 for LLM referral visits |
Viewing bot activity
Bot visit data surfaces in two places:
- Dashboard: Shows total AI bot visits, the top bot, category breakdown, and a recent activity feed at a glance.
- Bots tab: Lists every detected bot with its hit count and category. Click any bot to view its recent visit details including pages visited, timestamps, and full user agent strings.
For trend charts, date range filtering, and CSV exports, see the Analytics documentation.
Data retention
Tracking data is automatically cleaned up by a daily cron job based on your retention setting:
| Setting | Range | Default |
|---|---|---|
| Data retention | 30 to 365 days | 365 days |
To adjust your retention period:
- Go to CitedPro → Settings
- Change the Data Retention value
- Save changes
Records older than the configured retention period are permanently deleted by the daily cleanup cron. Reducing the retention period will delete older records on the next cron run.
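A cleanup of this shape could look like the following (hypothetical sketch; the plugin's actual query and cron hook are not shown in this document):

```python
def cleanup_sql(retention_days: int) -> str:
    """Build a DELETE for rows older than the retention window.
    Table and column names follow the schema described above."""
    if not 30 <= retention_days <= 365:
        raise ValueError("retention must be between 30 and 365 days")
    return (
        "DELETE FROM wp_cited_bot_visits "
        f"WHERE visit_time < DATE_SUB(NOW(), INTERVAL {retention_days} DAY)"
    )
```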
Understanding your data
What high bot traffic means
Frequent visits from AI crawlers indicate your content is being indexed for AI search and recommendation systems. This is generally positive. It means AI assistants can potentially cite and recommend your business when users ask relevant questions.
What low bot traffic means
If you are seeing few AI bot visits:
- Your site may be new or have few backlinks; AI bots tend to prioritize higher-authority sites first
- Check that your robots.txt is not blocking AI bots you want to allow
- Verify that your llms.txt and other AI discovery files are accessible
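For the robots.txt check above, entries like these make your intent explicit (bot names are examples from the lists earlier in this article; adjust to your own policy):

```
# Allow the AI search crawlers you want citing your content
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

# Block a data scraper you do not want
User-agent: Bytespider
Disallow: /
```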
Tip
Complete your business setup and generate your AI discovery files to increase the likelihood of AI bots finding and crawling your content.
Performance impact
Bot tracking adds minimal overhead to your site:
- Detection runs early in the request lifecycle using fast string matching
- Only bot visits are logged, not regular human traffic
- Session deduplication prevents excessive database writes
- The daily cleanup cron keeps the database table lean
If you want to disable tracking entirely, toggle it off in CitedPro → Settings.