Matching bot useragents

1. The problem I’m having:

I'm trying to block the growing number of AI bots that are killing my VM. My idea was to match on the User-Agent header and abort those requests, but this doesn't seem to work for Amazonbot. The other bots are matched, but not Amazonbot. What am I doing wrong?

2. Error messages and/or full log output:

This is what I'm still getting:

{
  "level": "info",
  "ts": 1725606203.9108431,
  "logger": "http.log.access",
  "msg": "handled request",
  "request": {
    "remote_ip": "3.224.220.101",
    "remote_port": "8362",
    "client_ip": "3.224.220.101",
    "proto": "HTTP/1.1",
    "method": "GET",
    "host": "git.xsfx.dev",
    "uri": "/xsteadfastx/wireguard-go/src/commit/d94bae834882e6579f09db46b60cf9a1c46dac8a/device/cookie_test.go",
    "headers": {
      "Connection": [
        "close"
      ],
      "User-Agent": [
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 (Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot)"
      ],
      "Accept-Encoding": [
        "gzip,deflate"
      ]
    },
    "tls": {
      "resumed": false,
      "version": 771,
      "cipher_suite": 49195,
      "proto": "",
      "server_name": "git.xsfx.dev"
    }
  },
  "bytes_read": 0,
  "user_id": "",
  "duration": 0.000705843,
  "size": 0,
  "status": 200,
  "resp_headers": {
    "Server": [
      "Caddy"
    ],
    "Alt-Svc": [
      "h3=\":443\"; ma=2592000"
    ],
    "Content-Length": [
      ""
    ]
  }
}

3. Caddy version:

2.8.4

4. How I installed and ran Caddy:

Docker image

a. System environment:

Docker

d. My complete Caddy config:


{
        admin 0.0.0.0:2019 {
                origins 10.100.100.2 10.100.100.4
        }
        acme_ca https://acme-v02.api.letsencrypt.org/directory
        email foo@bar.tld
        cache
}

git.xsfx.dev {
        @badbots {
                header User-Agent *facebookexternalhit*
                header User-Agent *meta-externalagent*
                header User-Agent *Amazonbot*
                header User-Agent *Bytespider*
        }

        abort @badbots

        log
        cache
        reverse_proxy gitea:3000
}

go.xsfx.dev {
        route /* {
                @goget query go-get=1
                respond @goget `<meta name="go-import" content="{host}{path} git https://git.xsfx.dev/xsteadfastx{path}">`
                redir https://git.xsfx.dev/xsteadfastx{path}
        }
}

I have some new observations. It does work, in the sense that the requests no longer reach Gitea. But I still wonder why the Caddy logs show status code 200, since I'm using abort here.

I don’t think this is particularly an issue.

As far as Caddy is concerned, a client made a request and Caddy handled that request, as you configured it, without any problems or declaring any other special status. It simply aborted the connection.

As far as the access log goes, that’s “200 OK” because Caddy did its job.

It's the same logic as a totally unconfigured route returning 200 OK: just because Caddy wasn't configured to return any response doesn't mean it wasn't “successful” at finishing the request as it was configured to.
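
If you would rather see a distinct status for these requests in the access log, one option is to answer them with `respond` instead of `abort`. This is only a minimal sketch that reuses the matcher from the config above. The 403 status is my own choice for illustration and is not part of the original config, and the `cache` line is left out only for brevity:

```
git.xsfx.dev {
        # Same matcher as in the original config.
        @badbots {
                header User-Agent *facebookexternalhit*
                header User-Agent *meta-externalagent*
                header User-Agent *Amazonbot*
                header User-Agent *Bytespider*
        }

        # Assumption: reply with an explicit 403 instead of aborting,
        # so the access log records 403 for matched requests.
        respond @badbots 403

        log
        reverse_proxy gitea:3000
}
```

The trade-off is that Caddy then sends a small response to the bot rather than dropping the connection without replying, which is what `abort` does.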
