1. The problem I’m having:
While acquiring a certificate, I sometimes observe a hard failure that puts Caddy into a state in which it can’t proceed to acquire a certificate. The log is always related to a corrupted/malformed JSON file that fails to parse.
I’m aware that this particular failure mode has appeared a few times in the forums, but after doing some digging I’m raising it separately for the following reasons:
- I’m on 2.9.1, which follows some of the atomic file locking fixes that went into 2.9.0. I’ve observed the failure on fresh hosts without any state leftover from < 2.9.0 versions.
- The failures I’ve caught do not occur under extenuating circumstances like a full disk, where you might expect something like a failed fsync() to write out a corrupt and/or empty file.
- I haven’t observed this failure mode associated with something like a hard kill (like sending a -9/SIGKILL) that might bail out during a write.
I can watch Caddy recover if I restart it after observing one of these failures - I haven’t caught a relevant log line yet, but I would assume that it’s doing something like re-initializing state and clearing out the invalid JSON file.
I’ll also note that I’ve been able to catch the invalid JSON files and they’re just empty (not half-written or structurally corrupted JSON).
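The empty files line up with the logged message if the lock file is decoded with Go's encoding/json Unmarshal. That is an assumption on my part rather than something confirmed from the Caddy source, and the lockMeta type and its fields below are hypothetical stand-ins for whatever the lock file really contains. The sketch also shows that the message by itself cannot tell an empty file from a half-written one, which is why inspecting the file on disk matters.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// lockMeta is a hypothetical stand-in for the lock file's JSON structure.
type lockMeta struct {
	Created string `json:"created"`
	Updated string `json:"updated"`
}

// parseLock decodes raw lock file contents and returns any decode error.
func parseLock(raw []byte) error {
	var m lockMeta
	return json.Unmarshal(raw, &m)
}

func main() {
	fmt.Println(parseLock([]byte("")))                // empty file: unexpected end of JSON input
	fmt.Println(parseLock([]byte(`{"created":"x"`)))  // truncated file: unexpected end of JSON input
	fmt.Println(parseLock([]byte(`{"created":"x"}`))) // intact file: <nil>
}
```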
So yeah, kind of a mystery! I can help debug/provide additional information, but one of the frustrating parts of this bug is that it isn’t 100% reproducible. It’s fairly easy to attempt a reproduction, since all it takes is triggering an ACME acquisition with TLS enabled and a valid hostname configured, but the failure doesn’t occur consistently.
My short-term mitigation plan is to write a small sidecar to observe Caddy’s log output and trigger a throttled restart when the JSON parsing failure occurs, but it’d be great to get to the bottom of it.
2. Error messages and/or full log output:
Mar 26 16:33:12 caddytest.rock.associates caddy[841]: {"level":"info","ts":1743006792.1017861,"msg":"[ERROR] Keeping lock file fresh: unexpected end of JSON input - terminating lock maintenance (lockfile: /var/lib/caddy/.local/share/caddy/locks/issue_cert_caddytest.rock.associates.lock)"}
3. Caddy version:
Version 2.9.1 with the following plugins:
- caddy-ratelimit
- A custom module that shouldn’t impact this code path (I’m generating CSP headers dynamically)
4. How I installed and ran Caddy:
Running as a system service in NixOS 24.11.
a. System environment:
- NixOS 24.11
- (systemd service)
b. Command:
(I added the escaped newlines)
ExecStart=/nix/store/sbdvablkfxw9izl7ap288f2a9mq6xg7r-caddy-2.9.1/bin/caddy \
run \
--config /etc/caddy/caddy_config \
--adapter caddyfile \
--resume
c. Service/unit/compose file:
(trimmed; I don't suspect this impacts the issue)
# /etc/systemd/system/caddy.service
# caddy.service
#
# For using Caddy with a config file.
#
# Make sure the ExecStart and ExecReload commands are correct
# for your installation.
#
# See https://caddyserver.com/docs/install for instructions.
#
# WARNING: This service does not use the --resume flag, so if you
# use the API to make changes, they will be overwritten by the
# Caddyfile next time the service is restarted. If you intend to
# use Caddy's API to configure it, add the --resume flag to the
# `caddy run` command or use the caddy-api.service file instead.
[Unit]
Description=Caddy
Documentation=https://caddyserver.com/docs/
After=network.target network-online.target
Requires=network-online.target
[Service]
Type=notify
User=caddy
Group=caddy
ExecStart=/nix/store/sbdvablkfxw9izl7ap288f2a9mq6xg7r-caddy-2.9.1/bin/caddy run --environ --config /etc/caddy/Caddyfile
ExecReload=/nix/store/sbdvablkfxw9izl7ap288f2a9mq6xg7r-caddy-2.9.1/bin/caddy reload --config /etc/caddy/Caddyfile --force
TimeoutStopSec=5s
LimitNOFILE=1048576
PrivateTmp=true
ProtectSystem=full
AmbientCapabilities=CAP_NET_ADMIN CAP_NET_BIND_SERVICE
[Install]
WantedBy=multi-user.target
# /nix/store/1hphahky29dwgz99jxiz28gvdby96qcm-system-units/caddy.service.d/overrides.conf
[Unit]
StartLimitBurst=10
StartLimitIntervalSec=14400
Upholds=caddy-merge.service
X-Reload-Triggers=/nix/store/3akl771clx1vz5s9xplx367grpg77dyx-X-Reload-Triggers-caddy
[Service]
Environment="LOCALE_ARCHIVE=/nix/store/fiinrcd99rnhgq9jws1pc9dk3dwzgmfd-glibc-locales-2.40-66/lib/locale/locale-archive"
Environment="PATH=/nix/store/9m68vvhnsq5cpkskphgw84ikl9m6wjwp-coreutils-9.5/bin:/nix/store/vc2d1bfy1a5y1195nq7k6p0zcm6q89nx-findutils-4.10.0/bin:/nix/store/qjsj5vnbfpbg6r7jhd7znfgmcy0arn8n-gnugrep-3.11/bin:/nix/store/3ks7b6p43dpvnlnxgvlcy2jaf1np37p2-gnused-4.9/bin:/nix/store/b5zj299v99y1r8l0s9j7k8fzv1xapcw2-systemd-256.10/bin:/nix/store/9m68vvhnsq5cpkskphgw84ikl9m6wjwp-coreutils-9.5/sbin:/nix/store/vc2d1bfy1a5y1195nq7k6p0zcm6q89nx-findutils-4.10.0/sbin:/nix/store/qjsj5vnbfpbg6r7jhd7znfgmcy0arn8n-gnugrep-3.11/sbin:/nix/store/3ks7b6p43dpvnlnxgvlcy2jaf1np37p2-gnused-4.9/sbin:/nix/store/b5zj299v99y1r8l0s9j7k8fzv1xapcw2-systemd-256.10/sbin"
Environment="TZDIR=/nix/store/l6mypzy4rvkxd5kwzs18d88syirislib-tzdata-2024b/share/zoneinfo"
EnvironmentFile=-/etc/default/caddy
ExecReload=
ExecReload=/nix/store/vpj6cl8k27n0ygyqjkcqdiqg46qa99vq-reload-caddy/bin/reload-caddy
ExecStart=
ExecStart=/nix/store/sbdvablkfxw9izl7ap288f2a9mq6xg7r-caddy-2.9.1/bin/caddy run --config /etc/caddy/caddy_config --adapter caddyfile --resume
Group=caddy
LogsDirectory=caddy
NoNewPrivileges=true
PrivateDevices=true
ProtectHome=true
ReadWritePaths=/var/lib/caddy
Restart=on-failure
RestartPreventExitStatus=1
RestartSec=5s
StateDirectory=caddy
User=caddy
[Install]
WantedBy=multi-user.target
d. My complete Caddy config:
(This is really long, and I don't suspect it includes the reason for the failure.) I include the JSON here because I'm making a lot of calls to the API.
{
"apps": {
"http": {
"servers": {
"srv0": {
"listen": [
":443"
],
"logs": {
"default_logger_name": "log0"
},
"protocols": [
"h1",
"h2",
"h2c"
],
"routes": [
{
"handle": [
{
"body": "Forbidden",
"handler": "static_response",
"status_code": 403
}
],
"match": [
{
"client_ip": {
"ranges": [
"174.126.31.196"
]
}
}
]
},
{
"handle": [
{
"handler": "headers",
"response": {
"set": {
"X-Frame-Options": [
"deny"
]
}
}
},
{
"handler": "subroute",
"routes": [
{
"handle": [
{
"handler": "vars",
"root": "/srv/assets"
}
]
},
{
"group": "group11",
"handle": [
{
"handler": "subroute",
"routes": [
{
"handle": [
{
"handle_response": [
{
"match": {
"status_code": [
2
]
}
},
{
"match": {
"status_code": [
401
]
},
"routes": [
{
"handle": [
{
"handler": "static_response",
"headers": {
"Location": [
"/login?redirect_to=/grafana"
]
},
"status_code": 302
},
{
"handler": "static_response",
"status_code": 302
}
]
}
]
}
],
"handler": "reverse_proxy",
"headers": {
"request": {
"set": {
"X-Forwarded-Method": [
"{http.request.method}"
],
"X-Forwarded-Uri": [
"{http.request.uri}"
]
}
}
},
"rewrite": {
"method": "GET",
"uri": "/-net/api/v0/user/me"
},
"upstreams": [
{
"dial": "127.0.0.1:3000"
}
]
}
],
"match": [
{
"not": [
{
"header": {
"Cookie": [
"*grafana_session*"
]
}
},
{
"header": {
"Cookie": [
"*oauth_state*"
]
}
},
{
"header": {
"Cookie": [
"*__Host-nonce*"
]
}
}
]
}
]
},
{
"handle": [
{
"handler": "reverse_proxy",
"upstreams": [
{
"dial": "127.0.0.1:3001"
}
]
}
]
}
]
}
],
"match": [
{
"path": [
"/grafana*"
]
}
]
},
{
"group": "group11",
"handle": [
{
"handler": "subroute",
"routes": [
{
"handle": [
{
"handler": "reverse_proxy",
"upstreams": [
{
"dial": "127.0.0.1:5556"
}
]
}
]
}
]
}
],
"match": [
{
"path": [
"/dex*"
]
}
]
},
{
"group": "group11",
"handle": [
{
"handler": "subroute",
"routes": [
{
"handle": [
{
"handler": "rate_limit",
"rate_limits": {
"remote_host": {
"key": "{http.request.remote.host}",
"max_events": 500,
"window": 20000000000
}
}
},
{
"handler": "reverse_proxy",
"upstreams": [
{
"dial": "127.0.0.1:3000"
}
]
}
]
}
]
}
],
"match": [
{
"path": [
"/ok",
"/-net/api/v0/admin_message_bus",
"/-net/api/v0/all_sync_connections",
"/-net/api/v0/api_keys",
"/-net/api/v0/api_keys/*",
"/-net/api/v0/automerge_document",
"/-net/api/v0/backups/stream",
"/-net/api/v0/client_config",
"/-net/api/v0/client_config/*",
"/-net/api/v0/client_config/device/*",
"/-net/api/v0/collection",
"/-net/api/v0/collection/*",
"/-net/api/v0/controller/*",
"/-net/api/v0/device",
"/-net/api/v0/device/*",
"/-net/api/v0/device/auth_decision",
"/-net/api/v0/device/auth_decision/*",
"/-net/api/v0/device/network_control/interface_manager",
"/-net/api/v0/device_group",
"/-net/api/v0/device_group/*",
"/-net/api/v0/device_group/*/list",
"/-net/api/v0/dns_block_list",
"/-net/api/v0/dns_block_list/*",
"/-net/api/v0/group",
"/-net/api/v0/group/*",
"/-net/api/v0/group/*/set_membership",
"/-net/api/v0/group/*/list",
"/-net/api/v0/kvs/*/contents",
"/-net/api/v0/license",
"/-net/api/v0/license/*",
"/-net/api/v0/messages",
"/-net/api/v0/messages/*",
"/-net/api/v0/oidc_ok",
"/-net/api/v0/oidc_finish",
"/-net/api/v0/oidc_start",
"/-net/api/v0/ok",
"/-net/api/v0/organization",
"/-net/api/v0/organization/config",
"/-net/api/v0/organization/config/telemetry",
"/-net/api/v0/organization/config/support_enabled",
"/-net/api/v0/organization/controller",
"/-net/api/v0/organization/controller/*",
"/-net/api/v0/organization/controllers",
"/-net/api/v0/organization/dns/*",
"/-net/api/v0/organization/ipv4",
"/-net/api/v0/organization/ipv4/*",
"/-net/api/v0/organization/ipv6",
"/-net/api/v0/organization/ipv6/*",
"/-net/api/v0/platform",
"/-net/api/v0/policy",
"/-net/api/v0/policy/*",
"/-net/api/v0/policy/resource/*",
"/-net/api/v0/policy/resource_group/*",
"/-net/api/v0/route_exclusion",
"/-net/api/v0/route_exclusion/*",
"/-net/api/v0/site",
"/-net/api/v0/site/*",
"/-net/api/v0/site/*/bgp_config",
"/-net/api/v0/site/*/range",
"/-net/api/v0/site/*/range/*",
"/-net/api/v0/status",
"/-net/api/v0/sync_connections",
"/-net/api/v0/tasks",
"/-net/api/v0/tasks/*",
"/-net/api/v0/tasks/*/run_once",
"/-net/api/v0/telemetry/sample",
"/-net/api/v0/telemetry/submit",
"/-net/api/v0/user/*",
"/-net/api/v0/users",
"/-net/api/v0/openapi.json",
"/-net/setup",
"/-net/setup/*",
"/-net/swagger-ui"
]
}
]
},
{
"group": "group11",
"handle": [
{
"handler": "subroute",
"routes": [
{
"handle": [
{
"handler": "static_response",
"status_code": 421
}
]
}
]
}
],
"match": [
{
"path": [
"/-net*"
]
}
]
},
{
"group": "group11",
"handle": [
{
"handler": "subroute",
"routes": [
{
"handle": [
{
"handler": "rewrite",
"strip_path_prefix": "/sos"
},
{
"handle_response": [
{
"match": {
"status_code": [
2
]
}
},
{
"match": {
"status_code": [
401
]
},
"routes": [
{
"handle": [
{
"handler": "static_response",
"headers": {
"Location": [
"/login?redirect_to=/"
]
},
"status_code": 302
},
{
"handler": "static_response",
"status_code": 302
}
]
}
]
}
],
"handler": "reverse_proxy",
"headers": {
"request": {
"set": {
"X-Forwarded-Method": [
"{http.request.method}"
],
"X-Forwarded-Uri": [
"{http.request.uri}"
]
}
}
},
"rewrite": {
"method": "GET",
"uri": "/-net/api/v0/user/me"
},
"upstreams": [
{
"dial": "127.0.0.1:3000"
}
]
},
{
"handler": "reverse_proxy",
"upstreams": [
{
"dial": "127.0.0.1:911"
}
]
}
]
}
]
}
],
"match": [
{
"path": [
"/sos*"
]
}
]
},
{
"group": "group11",
"handle": [
{
"handler": "subroute",
"routes": [
{
"handle": [
{
"handler": "subroute",
"routes": [
{
"handle": [
{
"handler": "headers",
"response": {
"set": {
"Cache-Control": [
"no-cache, must-revalidate"
]
}
}
}
]
},
{
"group": "group9",
"handle": [
{
"handler": "rewrite",
"uri": "/index.html"
}
]
}
]
}
],
"match": [
{
"not": [
{
"file": {
"try_files": [
"{http.request.uri.path}"
]
}
}
]
}
]
},
{
"handle": [
{
"asset_path": "/srv/assets",
"csp_directives": {
"base-uri": [
"'self'"
],
"connect-src": [
"'self'",
"https:",
"wss:"
],
"frame-ancestors": [
"'none'"
],
"object-src": [
"'none'"
]
},
"handler": "cspdynamic"
},
{
"handler": "file_server",
"hide": [
"/nix/store/pk8wsmw1wn6jc6z2nvxs65v9msqa8w52-Caddyfile-formatted/Caddyfile"
]
}
]
}
]
}
]
}
]
}
],
"match": [
{
"host": [
"caddytest.rock.associates"
]
}
]
},
{
"handle": [
{
"handler": "reverse_proxy",
"upstreams": [
{
"dial": "127.0.0.1:3000"
}
]
}
],
"match": [
{
"path": [
"/-net/api/v0/ok"
]
},
{
"host": [
"*"
],
"path": [
"/-net/api/v0/ok"
]
}
]
}
]
}
}
},
"pki": {
"certificate_authorities": {
"local": {
"install_trust": false
}
}
},
"tls": {
"automation": {
"on_demand": {
"permission": {
"endpoint": "http://127.0.0.1:3000/assigned_address",
"module": "http"
}
},
"policies": [
{
"issuers": [
{
"module": "acme"
}
],
"subjects": [
"caddytest.rock.associates"
]
},
{
"issuers": [
{
"module": "internal"
}
],
"subjects": [
"*"
]
},
{
"issuers": [
{
"module": "internal"
}
],
"on_demand": true
}
]
}
}
},
"logging": {
"logs": {
"default": {
"exclude": [
"http.log.access.log0"
],
"level": "ERROR"
},
"log0": {
"encoder": {
"fields": {
"request>headers>X-Bt-Signature": {
"filter": "delete"
},
"request>headers>X-Bt-Signer": {
"filter": "delete"
},
"request>headers>X-Bt-Sync-Psk": {
"filter": "delete"
},
"request>uri": {
"actions": [
{
"parameter": "oidc_finish",
"type": "hash"
}
],
"filter": "query"
}
},
"format": "filter",
"wrap": {
"format": "json"
}
},
"include": [
"http.log.access.log0"
],
"writer": {
"output": "stderr"
}
}
}
}
}
5. Links to relevant resources:
N/A
Edits: spelling and grammar.