I’m sorry for deleting your template, but this is really a best-practice question and not a support/issue case, so the template makes no sense here…
As such, here are just the important parts…
The aim is simply that I want a basic Docker/podman/container healthcheck for my Caddy service.
As a first attempt, I tried using the /metrics endpoint…
Note that you cannot use curl, because it is not included in Caddy’s Docker image.
1. Caddy version (caddy version):
v2.4.3 h1:Y1FaV2N4WO3rBqxSYA8UZsZTQdN+PwcoOcAiZTM8C0I=
2. How I run Caddy:
b. Command:
podman-compose -t identity -p caddy up
c. Service/unit/compose file:
version: "3.7"
services:
  caddy:
    image: caddy
    restart: unless-stopped
    network_mode: "slirp4netns:port_handler=slirp4netns,enable_ipv6=true,allow_host_loopback=true"
    ports:
      - "80:80"
      - "443:443"
      # […]
      - "2019:2019"
    volumes:
      - caddy_data:/data
      - caddy_config:/config
      # […]
    environment:
      - HOST_DOMAIN=host.containers.internal
      # […]
    healthcheck:
      # https://stackoverflow.com/a/47722899/5008962
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:2019/metrics", "||", "exit", "1"]
      interval: 1m30s
      timeout: 10s
      retries: 3
      start_period: 40s
    labels:
      - io.containers.autoupdate=registry
volumes:
  caddy_data:
    # always persist volume by forcing external creation
    # https://docs.docker.com/compose/compose-file/compose-file-v3/#external
    external: true
  caddy_config:
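A side note on the test line: as far as I understand, the exec form (["CMD", …]) does not go through a shell, so "||", "exit" and "1" are handed to wget as literal arguments rather than acting as a fallback. Below is a minimal, untested sketch of the shell form. It assumes the image ships a /bin/sh and that podman-compose handles CMD-SHELL the same way Docker Compose does.

```yaml
# Sketch only: a fragment of the caddy service definition, not a complete compose file.
healthcheck:
  # CMD-SHELL passes the string to the container's shell, so "||" is interpreted there.
  test: ["CMD-SHELL", "wget --no-verbose --tries=1 --spider http://localhost:2019/metrics || exit 1"]
  interval: 1m30s
  timeout: 10s
  retries: 3
  start_period: 40s
```

Since wget already exits non-zero on failure, the "|| exit 1" part mostly just normalises the exit code to 1.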
d. My complete Caddyfile or JSON config:
{
	admin off
	# debug
}
# […]
# manually expose metrics as we disabled the admin API
:2019 {
	metrics /metrics
}
# […]
Now the question
So, the question: What would you suggest as a generic/general healthcheck for caddy containers?
Of course, I could also check one specific service (e.g. that the server serves a file) or a reverse-proxied backend as a healthcheck, but IMHO this is too much, especially as the reverse-proxy variant adds dependencies: it checks not only the health of Caddy, but that of other services as well…
As such, what is the best practice here?
And if “admin API” is the solution, what would you suggest if the admin API is disabled, as some users do?
Is using the metrics endpoint suitable for that?
I’m running this caddy mainly just as a reverse proxy and static file server…
Related error?
With what I’m currently doing, though, I’m getting these strange errors from the metrics endpoint, and as far as I can see only the healthcheck can cause them (note the two different error messages at the right end of the log lines):
{"level":"error","ts":1625875789.598645,"logger":"http.handlers.metrics","msg":"error encoding and sending metric family:write tcp [::1]:2019->[::1]:60652: write: broken pipe"}
{"level":"error","ts":1625875880.612273,"logger":"http.handlers.metrics","msg":"error encoding and sending metric family:write tcp [::1]:2019->[::1]:60654: write: connection reset by peer"}
{"level":"error","ts":1625875971.5551207,"logger":"http.handlers.metrics","msg":"error encoding and sending metric family:write tcp [::1]:2019->[::1]:60656: write: connection reset by peer"}
{"level":"error","ts":1625876062.5940106,"logger":"http.handlers.metrics","msg":"error encoding and sending metric family:write tcp [::1]:2019->[::1]:60658: write: broken pipe"}
Maybe it’s because wget is doing a --spider request, i.e. just checking whether the “file” is there and not actually downloading it?
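To test that guess, one could try a variant that drops --spider and actually downloads the response body, discarding it to /dev/null. This is an untested sketch. It assumes the wget in the image (BusyBox, as far as I can tell) supports -q and -O, and I have not verified that it makes the errors go away.

```yaml
# Sketch only: a fragment of the caddy service definition, not a complete compose file.
healthcheck:
  # No --spider: the body is read in full and written to /dev/null.
  # wget's own non-zero exit code marks the check as failed.
  test: ["CMD", "wget", "-q", "-O", "/dev/null", "http://localhost:2019/metrics"]
  interval: 1m30s
  timeout: 10s
  retries: 3
  start_period: 40s
```

If the broken pipe / connection reset messages disappear with this variant, that would point to the early-closed connection as the cause.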
Also
I saw this thread and the related feature request for a graceful shutdown, but that is way beyond my use case; I do not have a load balancer.
I just want a simple solution/health check that works in 90% of the use cases and that makes sense from your perspective.