
Robots.txt

Weight: 15% of your AX score. This check verifies that your robots.txt explicitly configures access rules for AI crawlers — not just traditional search engines.

/robots.txt not found

Your site does not have a /robots.txt file at the root. Without it, AI crawlers have no explicit instructions on what they can or cannot access. While most crawlers will still index your site, having an explicit robots.txt signals intent and gives you fine-grained control.

Create a robots.txt file in your site root (usually /public/robots.txt in Next.js or the web root directory) with User-agent entries for the AI crawlers you want to allow:

# Allow all AI crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Bytespider
Allow: /

User-agent: CCBot
Allow: /

# Traditional crawlers
User-agent: *
Allow: /

Sitemap: https://your-site.com/sitemap.xml
Quick fix with ax-init
Run npx ax-init --from https://your-site.com to auto-generate a robots.txt with all 29+ AI crawler entries pre-configured.

Missing core AI crawlers

ax-audit detected that your robots.txt exists but is missing explicit User-agent entries for some of the 6 core AI crawlers. The core crawlers are:

  • GPTBot — OpenAI's crawler for ChatGPT and plugins
  • ClaudeBot — Anthropic's crawler for Claude
  • Google-Extended — Google's AI-specific crawler
  • PerplexityBot — Perplexity AI's search crawler
  • Bytespider — ByteDance's crawler
  • CCBot — Common Crawl's crawler (used by many AI training datasets)

Add explicit entries for each missing crawler. Each entry needs its own User-agent line followed by Allow: /:

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

No core AI crawlers configured

Your robots.txt exists but has zero explicit entries for any of the 6 core AI crawlers. This means AI agents rely entirely on the wildcard User-agent: * rule, which may or may not permit access.

Add User-agent blocks for all 6 core crawlers listed in the section above. Being explicit about AI crawler access is a best practice — it removes ambiguity and ensures your site is discoverable.
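
As a rough sketch of what this check does (assuming simple line-based parsing; ax-audit's actual implementation may differ), you can scan a robots.txt for the six core User-agent values:

```python
# The six core AI crawlers that ax-audit checks for.
CORE_CRAWLERS = [
    "GPTBot", "ClaudeBot", "Google-Extended",
    "PerplexityBot", "Bytespider", "CCBot",
]

def missing_core_crawlers(robots_txt: str) -> list[str]:
    """Return the core crawlers that have no explicit User-agent entry."""
    declared = {
        line.split(":", 1)[1].strip().lower()
        for line in robots_txt.splitlines()
        if line.lower().startswith("user-agent:")
    }
    return [bot for bot in CORE_CRAWLERS if bot.lower() not in declared]

print(missing_core_crawlers("User-agent: GPTBot\nAllow: /\n"))
# → ['ClaudeBot', 'Google-Extended', 'PerplexityBot', 'Bytespider', 'CCBot']
```

An empty result means every core crawler has at least one explicit entry.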


Blocked by wildcard rule

Your robots.txt contains a wildcard block (User-agent: * followed by Disallow: /) that shuts out all crawlers, including any AI crawler that does not have its own explicit entry.

If you want to keep the wildcard block for traditional crawlers but allow AI agents, add an explicit Allow entry for each AI crawler (by convention, placed before the wildcard rule):

# AI crawlers — explicitly allowed
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

# ... other AI crawlers ...

# Block everything else
User-agent: *
Disallow: /
How robots.txt matching works
When a crawler identifies itself as "GPTBot", it looks for a matching User-agent: GPTBot section first and falls back to User-agent: * only if no specific match exists. Explicit entries therefore always take precedence over the wildcard.
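
You can observe this precedence with Python's standard urllib.robotparser module (the rules in the string below are a hypothetical example):

```python
import urllib.robotparser

# Hypothetical robots.txt: GPTBot explicitly allowed, everyone else blocked.
ROBOTS = """\
User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(ROBOTS.splitlines())

# GPTBot matches its own section, so the wildcard block never applies to it.
print(rp.can_fetch("GPTBot", "https://example.com/page"))        # True
print(rp.can_fetch("SomeOtherBot", "https://example.com/page"))  # False
```

The wildcard group acts as a default that is consulted only when no named group matches.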

Explicitly blocked crawlers

One or more AI crawlers have explicit Disallow: / rules in your robots.txt. If this is intentional (e.g., you don't want certain AI companies indexing your content), you can ignore this warning.

If you want to allow these crawlers, change their Disallow: / to Allow: /:

# Before (blocked)
User-agent: GPTBot
Disallow: /

# After (allowed)
User-agent: GPTBot
Allow: /

Partial path restrictions

Some AI crawlers have Disallow rules on specific paths rather than a full block. For example:

User-agent: GPTBot
Disallow: /private/
Disallow: /api/internal/

This is often intentional — you may want AI crawlers to access most of your site but not private sections. ax-audit flags this as a warning so you can verify the restrictions match your intent. For maximum AX score, use only Allow: / in your AI crawler entries; keep in mind that a crawler with its own explicit entry ignores the wildcard rules, so any path restrictions you still need must be repeated under that crawler's entry or enforced server-side.
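
Given rules like the example above, you can verify exactly which paths a crawler may fetch using Python's standard urllib.robotparser:

```python
import urllib.robotparser

# The partial restrictions from the example above.
ROBOTS = """\
User-agent: GPTBot
Disallow: /private/
Disallow: /api/internal/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(ROBOTS.splitlines())

# Paths outside the Disallow prefixes remain accessible.
print(rp.can_fetch("GPTBot", "https://example.com/blog/post"))    # True
print(rp.can_fetch("GPTBot", "https://example.com/private/doc"))  # False
```

Running each restricted path through can_fetch is a quick way to confirm the rules match your intent before deploying.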


Missing Sitemap directive

Your robots.txt does not include a Sitemap: directive. While AI crawlers can discover your sitemap through other means, declaring it in robots.txt is a widely supported convention that lets every crawler find your sitemap immediately.

Add a Sitemap directive at the end of your robots.txt:

Sitemap: https://your-site.com/sitemap.xml

Use an absolute URL (including https://). If you have multiple sitemaps, add one line per sitemap.
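
A minimal sketch of this absolute-URL check, assuming simple line-based parsing (relative_sitemaps is a hypothetical helper, not part of ax-audit):

```python
from urllib.parse import urlparse

def relative_sitemaps(robots_txt: str) -> list[str]:
    """Return Sitemap: values that are not absolute http(s) URLs."""
    urls = [
        line.split(":", 1)[1].strip()
        for line in robots_txt.splitlines()
        if line.lower().startswith("sitemap:")
    ]
    return [u for u in urls if urlparse(u).scheme not in ("http", "https")]

# An absolute URL passes; a relative path is flagged.
print(relative_sitemaps("Sitemap: https://your-site.com/sitemap.xml"))  # []
print(relative_sitemaps("Sitemap: /sitemap.xml"))  # ['/sitemap.xml']
```

Each flagged value should be rewritten as a full URL, one Sitemap line per sitemap.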


Low AI crawler coverage

ax-audit tracks 29+ known AI crawlers. Your robots.txt has explicit rules for fewer than 10 of them. While the 6 core crawlers are most important, configuring additional crawlers improves discoverability across more AI platforms.

Run npx ax-init to generate a robots.txt that includes all known AI crawler entries. You can also view the full list in the ax-audit source code.