jotteerr
(Joachim Trinkwitz)
February 27, 2026, 1:02pm
1
1. The problem I’m having:
From time to time, my site (a bibliography database with 15,000+ entries) gets overwhelmed by bot requests and stops responding, despite my blocking them all through robots.txt.
I found a recipe for Apache (at the French site Docs Evolix - Gestion des bots):
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} "FooBot" [NC]
RewriteRule ^ - [F,L]
but I can’t ‘translate’ this into a Caddyfile entry.
Apologies if this sounds trivial …
2. Error messages and/or full log output:
—
3. Caddy version:
v2.11.1 h1:C7sQpsFOC5CH+31KqJc7EoOf8mXrOEkFyYd6GpIqm/s=
4. How I installed and ran Caddy:
Installation through Debian package manager
a. System environment:
Debian GNU/Linux 13.3
b. Command:
c. Service/unit/compose file:
d. My complete Caddy config:
(common) {
header {
Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
X-Xss-Protection "1; mode=block"
X-Content-Type-Options "nosniff"
X-Frame-Options "DENY"
Content-Security-Policy "upgrade-insecure-requests"
Referrer-Policy "strict-origin-when-cross-origin"
Cache-Control "public, max-age=15, must-revalidate"
Feature-Policy "accelerometer 'none'; ambient-light-sensor 'none'; autoplay 'self'; camera 'none'; encrypted-media 'none'; fullscreen 'self'; geolocation 'none'; gyroscope 'none'; magnetometer 'none'; microphone 'none'; midi 'none'; payment 'none'; picture-in-picture *; speaker 'none'; sync-xhr 'none'; usb 'none'; vr 'none'"
}
encode gzip
}
www.bobc.uni-bonn.de {
root * /var/www/wikindx
root /adminer* /usr/share/adminer
file_server
import common
log {
output file /var/log/caddy/access.log
}
php_fastcgi unix//run/php/php8.4-fpm.sock
}
www.bobc.uni-bonn.de:8088 {
reverse_proxy localhost:3000
}
5. Links to relevant resources:
—
1 Like
That would just be a header matcher on the User-Agent header, with that matcher applied to an abort or error handler.
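For reference, a minimal sketch of such a translation (FooBot is just the placeholder from the Apache recipe; header_regexp with (?i) mirrors Apache's case-insensitive [NC] flag):
example.com {
@foobot header_regexp User-Agent (?i)foobot
respond @foobot 403
}
The [F] flag in the Apache rule maps to the 403 response here; abort @foobot would drop the connection without answering at all.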
3 Likes
pothi
(Pothi Kalimuthu)
February 28, 2026, 5:36am
3
While this has been answered already, I’d like to expand on the existing answer. The following block of code might help to get rid of some bots:
example.com {
root /path/to/root
@forbidden_bots {
# *...* does a substring match; an exact value would rarely match a full User-Agent string
header User-Agent *GPTBot*
header User-Agent *ClaudeBot*
header User-Agent *AnyOtherBot*
}
respond @forbidden_bots "Forbidden bots" 403
# other config
}
There are other examples at Request matchers (Caddyfile) — Caddy Documentation
I’d personally recommend rate_limit for such bots.
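For example, a sketch assuming the mholt/caddy-ratelimit plugin (not bundled with stock Caddy; the zone name and limits here are made-up values, and the exact syntax should be double-checked against the plugin’s README):
example.com {
rate_limit {
zone bots {
match {
header User-Agent *Bot*
}
key {remote_host}
events 10
window 1m
}
}
}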
2 Likes
A much longer list is at GitHub - ai-robots-txt/ai.robots.txt: A list of AI agents and robots to block. There is a Caddyfile there defining a named matcher @aibots which can be referred to. The README outlines which user agents are included.
I also recommend using abort. This way, Caddy doesn’t bother waiting for handshakes to complete to actually answer any such client, it just drops the connection.
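A sketch of how that could be wired together, assuming you saved that repo’s Caddyfile snippet (which defines the @aibots matcher) to /etc/caddy/ai-bots (path is arbitrary):
example.com {
import /etc/caddy/ai-bots
abort @aibots
}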
2 Likes
Technically the TLS handshake still happens (HTTP handling runs after TLS is done), but no HTTP response gets written when you use abort. So yeah, still more efficient.
4 Likes
jotteerr
(Joachim Trinkwitz)
March 28, 2026, 11:46am
6
Thank you all for your helpful answers. I’ve used this (with the list @techjedialex mentioned), but tried to replace respond with abort. I’m not sure I got the abort syntax right; the simple line
abort @forbidden_bots
doesn’t seem to work; I’m still seeing bots from the list in my log files.
Please post the complete Caddyfile as of now.
Here’s a working example of a snippet
(blocking) {
@aiuseragents header_regexp User-Agent "AddSearchBot|AI2Bot|AI2Bot-DeepResearchEval|Ai2Bot-Dolma|aiHitBot|amazon-kendra|Amazonbot|AmazonBuyForMe|Amzn-SearchBot|Amzn-User|Andibot|Anomura|anthropic-ai|ApifyBot|ApifyWebsiteContentCrawler|Applebot|Applebot-Extended|Aranet-SearchBot|atlassian-bot|Awario|AzureAI-SearchBot|bedrockbot|bigsur.ai|Bravebot|Brightbot\ 1.0|BuddyBot|Bytespider|CCBot|Channel3Bot|ChatGLM-Spider|ChatGPT\ Agent|ChatGPT-User|Claude-SearchBot|Claude-User|Claude-Web|ClaudeBot|Cloudflare-AutoRAG|CloudVertexBot|cohere-ai|cohere-training-data-crawler|Cotoyogi|Crawl4AI|Crawlspace|Datenbank\ Crawler|DeepSeekBot|Devin|Diffbot|DuckAssistBot|Echobot\ Bot|EchoboxBot|ExaBot|FacebookBot|facebookexternalhit|Factset_spyderbot|FirecrawlAgent|FriendlyCrawler|Gemini-Deep-Research|Google-Agent|Google-CloudVertexBot|Google-Extended|Google-Firebase|Google-NotebookLM|GoogleAgent-Mariner|GoogleOther|GoogleOther-Image|GoogleOther-Video|GPTBot|iAskBot|iaskspider|iaskspider/2.0|IbouBot|ICC-Crawler|ImagesiftBot|imageSpider|img2dataset|ISSCyberRiskCrawler|kagi-fetcher|Kangaroo\ Bot|KlaviyoAIBot|KunatoCrawler|laion-huggingface-processor|LAIONDownloader|LCC|LinerBot|Linguee\ Bot|LinkupBot|Manus-User|meta-externalagent|Meta-ExternalAgent|meta-externalfetcher|Meta-ExternalFetcher|meta-webindexer|MistralAI-User|MistralAI-User/1.0|MyCentralAIScraperBot|netEstate\ Imprint\ Crawler|NotebookLM|NovaAct|OAI-SearchBot|omgili|omgilibot|OpenAI|Operator|PanguBot|Panscient|panscient.com|Perplexity-User|PerplexityBot|PetalBot|PhindBot|Poggio-Citations|Poseidon\ Research\ Crawler|QualifiedBot|QuillBot|quillbot.com|SBIntuitionsBot|Scrapy|SemrushBot-OCOB|SemrushBot-SWA|ShapBot|Sidetrade\ indexer\ bot|Spider|TavilyBot|TerraCotta|Thinkbot|TikTokSpider|Timpibot|TwinAgent|VelenPublicWebCrawler|WARDBot|Webzio-Extended|webzio-extended|wpbot|WRTNBot|YaK|YandexAdditional|YandexAdditionalBot|YouBot|ZanistaBot"
handle @aiuseragents {
abort
}
}
import blocking
2 Likes
For your purpose, you can also consider blocking ASNs with plugins like caddy-defender, or geo-blocking with caddy-maxmind-geolocation.
Another interesting approach is to block bots via JA4 fingerprinting. This approach is under-discussed; I can only find one plugin, matt-/caddy-ja4, that uses it.
2 Likes