OCSP Stapling error on previously working setup

1. The problem I’m having:

Hello,

I am using Caddy as a simple reverse proxy. I have 9 websites that it routes traffic to and acquires the certificates for, and everything was working just fine before today.

I just switches ISPs, and unlike before my Modem is now in bridge mode. I am using PFSense as my router, and both 443 and 80 are correctly being routed to the machine that Caddy is running on.

My problem is odd. Caddy is timing out when it is starting, getting some OCSP timeout that takes 20 seconds for each site. If I have 4 of my sites enabled, it manages to start. If I enable a 5th site, the load gets stopped by the OS for taking too long.

2. Error messages and/or full log output:

Aug 25 14:00:08 Caddy systemd[1]: Starting Caddy...
Aug 25 14:00:08 Caddy caddy[511]: caddy.HomeDir=/var/lib/caddy
Aug 25 14:00:08 Caddy caddy[511]: caddy.AppDataDir=/var/lib/caddy/.local/share/caddy
Aug 25 14:00:08 Caddy caddy[511]: caddy.AppConfigDir=/var/lib/caddy/.config/caddy
Aug 25 14:00:08 Caddy caddy[511]: caddy.ConfigAutosavePath=/var/lib/caddy/.config/caddy/autosave.json
Aug 25 14:00:08 Caddy caddy[511]: caddy.Version=v2.7.6 h1:w0NymbG2m9PcvKWsrXO6EEkY9Ru4FJK8uQbYcev1p3A=
Aug 25 14:00:08 Caddy caddy[511]: runtime.GOOS=linux
Aug 25 14:00:08 Caddy caddy[511]: runtime.GOARCH=amd64
Aug 25 14:00:08 Caddy caddy[511]: runtime.Compiler=gc
Aug 25 14:00:08 Caddy caddy[511]: runtime.NumCPU=1
Aug 25 14:00:08 Caddy caddy[511]: runtime.GOMAXPROCS=1
Aug 25 14:00:08 Caddy caddy[511]: runtime.Version=go1.22.3
Aug 25 14:00:08 Caddy caddy[511]: os.Getwd=/
Aug 25 14:00:08 Caddy caddy[511]: LANG=C
Aug 25 14:00:08 Caddy caddy[511]: PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Aug 25 14:00:08 Caddy caddy[511]: NOTIFY_SOCKET=/run/systemd/notify
Aug 25 14:00:08 Caddy caddy[511]: HOME=/var/lib/caddy
Aug 25 14:00:08 Caddy caddy[511]: LOGNAME=caddy
Aug 25 14:00:08 Caddy caddy[511]: USER=caddy
Aug 25 14:00:08 Caddy caddy[511]: INVOCATION_ID=8e25a157d17a44b5a74cd6c744c3936c
Aug 25 14:00:08 Caddy caddy[511]: JOURNAL_STREAM=9:37985658
Aug 25 14:00:08 Caddy caddy[511]: {"level":"info","ts":1724594408.0484037,"msg":"using provided configuration","config_file":"/etc/caddy/Caddyfile","config_adapter":""}
Aug 25 14:00:08 Caddy caddy[511]: {"level":"info","ts":1724594408.052025,"logger":"admin","msg":"admin endpoint started","address":"localhost:2019","enforce_origin":false,"origins":["//localhost:2019","//[::1]:2019","//127.0.0.1:2019"]}
Aug 25 14:00:08 Caddy caddy[511]: {"level":"info","ts":1724594408.0522199,"logger":"http.auto_https","msg":"server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS","server_name":"srv0","https_port":443}
Aug 25 14:00:08 Caddy caddy[511]: {"level":"info","ts":1724594408.0523098,"logger":"http.auto_https","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv0"}
Aug 25 14:00:08 Caddy caddy[511]: {"level":"info","ts":1724594408.053334,"logger":"http","msg":"enabling HTTP/3 listener","addr":":443"}
Aug 25 14:00:08 Caddy caddy[511]: {"level":"info","ts":1724594408.0534556,"msg":"failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Buffer-Sizes for details."}
Aug 25 14:00:08 Caddy caddy[511]: {"level":"info","ts":1724594408.0535793,"logger":"http.log","msg":"server running","name":"srv0","protocols":["h1","h2","h3"]}
Aug 25 14:00:08 Caddy caddy[511]: {"level":"info","ts":1724594408.0536675,"logger":"http.log","msg":"server running","name":"remaining_auto_https_redirects","protocols":["h1","h2","h3"]}
Aug 25 14:00:08 Caddy caddy[511]: {"level":"info","ts":1724594408.0537338,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["amileo.aidansmith.dev","ninbyo.aidansmith.dev","www.aidansmith.dev","aidansmith.dev","testbot.aidansmith.dev"]}
Aug 25 14:00:08 Caddy caddy[511]: {"level":"info","ts":1724594408.05499,"logger":"tls.cache.maintenance","msg":"started background certificate maintenance","cache":"0xc00041ec00"}
Aug 25 14:00:08 Caddy caddy[511]: {"level":"warn","ts":1724594408.0857677,"logger":"tls","msg":"storage cleaning happened too recently; skipping for now","storage":"FileStorage:/var/lib/caddy/.local/share/caddy","instance":"454939bb-fef4-4dff-a8fd-37810a7453b2","try_again":1724680808.0857663,"try_again_in":86399.999999483}
Aug 25 14:00:08 Caddy caddy[511]: {"level":"info","ts":1724594408.0859008,"logger":"tls","msg":"finished cleaning storage units"}
Aug 25 14:00:28 Caddy caddy[511]: {"level":"warn","ts":1724594428.0688853,"logger":"tls","msg":"stapling OCSP","error":"no OCSP stapling for [amileo.aidansmith.dev]: making OCSP request: Post \"http://e5.o.lencr.org\": dial tcp: lookup e5.o.lencr.org on 206.248.154.22:53: read udp 192.168.1.10:59775->206.248.154.22:53: i/o timeout","identifiers":["amileo.aidansmith.dev"]}
Aug 25 14:00:48 Caddy caddy[511]: {"level":"warn","ts":1724594448.0739279,"logger":"tls","msg":"stapling OCSP","error":"no OCSP stapling for [ninbyo.aidansmith.dev]: making OCSP request: Post \"http://e6.o.lencr.org\": dial tcp: lookup e6.o.lencr.org on 206.248.154.22:53: read udp 192.168.1.10:42306->206.248.154.22:53: i/o timeout","identifiers":["ninbyo.aidansmith.dev"]}
Aug 25 14:01:08 Caddy caddy[511]: {"level":"warn","ts":1724594468.078856,"logger":"tls","msg":"stapling OCSP","error":"no OCSP stapling for [www.aidansmith.dev]: making OCSP request: Post \"http://e6.o.lencr.org\": dial tcp: lookup e6.o.lencr.org on 206.248.154.22:53: read udp 192.168.1.10:38704->206.248.154.22:53: i/o timeout","identifiers":["www.aidansmith.dev"]}
Aug 25 14:01:28 Caddy caddy[511]: {"level":"warn","ts":1724594488.081776,"logger":"tls","msg":"stapling OCSP","error":"no OCSP stapling for [aidansmith.dev]: making OCSP request: Post \"http://e5.o.lencr.org\": dial tcp: lookup e5.o.lencr.org on 206.248.154.22:53: read udp 192.168.1.10:47920->206.248.154.22:53: i/o timeout","identifiers":["aidansmith.dev"]}
Aug 25 14:01:38 Caddy systemd[1]: caddy.service: Start operation timed out. Terminating.
Aug 25 14:01:38 Caddy caddy[511]: {"level":"info","ts":1724594498.1430132,"msg":"shutting down apps, then terminating","signal":"SIGTERM"}
Aug 25 14:01:38 Caddy caddy[511]: {"level":"warn","ts":1724594498.1430733,"msg":"exiting; byeee!! <F0><9F><91><8B>","signal":"SIGTERM"}
Aug 25 14:01:43 Caddy systemd[1]: caddy.service: State 'stop-sigterm' timed out. Killing.
Aug 25 14:01:43 Caddy systemd[1]: caddy.service: Killing process 511 (caddy) with signal SIGKILL.
Aug 25 14:01:43 Caddy systemd[1]: caddy.service: Main process exited, code=killed, status=9/KILL
Aug 25 14:01:43 Caddy systemd[1]: caddy.service: Failed with result 'timeout'.
Aug 25 14:01:43 Caddy systemd[1]: Failed to start Caddy.

3. Caddy version:

Caddy v2.7.6

4. How I installed and ran Caddy:

a. System environment:

Running as a service on Ubuntu Bionic Beaver

b. Command:

systemctl start caddy

c. Service/unit/compose file:

# caddy.service
#
# For using Caddy with a config file.
#
# Make sure the ExecStart and ExecReload commands are correct
# for your installation.
#
# See https://caddyserver.com/docs/install for instructions.
#
# WARNING: This service does not use the --resume flag, so if you
# use the API to make changes, they will be overwritten by the
# Caddyfile next time the service is restarted. If you intend to
# use Caddy's API to configure it, add the --resume flag to the
# `caddy run` command or use the caddy-api.service file instead.

[Unit]
Description=Caddy
Documentation=https://caddyserver.com/docs/
After=network.target network-online.target
Requires=network-online.target

[Service]
Type=notify
User=caddy
Group=caddy
ExecStart=/usr/bin/caddy run --environ --config /etc/caddy/Caddyfile
ExecReload=/usr/bin/caddy reload --config /etc/caddy/Caddyfile --force
TimeoutStopSec=5s
LimitNOFILE=1048576
LimitNPROC=512
PrivateTmp=true
ProtectSystem=full
AmbientCapabilities=CAP_NET_ADMIN CAP_NET_BIND_SERVICE

[Install]
WantedBy=multi-user.target

d. My complete Caddy config:

Caddy config is spread over multiple different files.

Base File

# The Caddyfile is an easy way to configure your Caddy web server.
#
# Unless the file starts with a global options block, the first
# uncommented line is always the address of your site.
#
# To use your own domain name (with automatic HTTPS), first make
# sure your domain's A/AAAA DNS records are properly pointed to
# this machine's public IP, then replace ":80" below with your
# domain name.

import /var/www/caddy/*.caddy

# Refer to the Caddy docs for more information:
# https://caddyserver.com/docs/caddyfile

Sample site file (they are all the same, just with different addresses). There are 9 sites total.

ninbyo.aidansmith.dev {
	reverse_proxy	http://192.168.1.9:30003
}

5. Links to relevant resources:

Howdy @gryphon. Real quick first up:

Gonna recommend you start by upgrading Caddy, the latest release is v2.8.4 as of this post.

As for the issues:

I can dig e5.o.lencr.org immediately, so something is going wrong with DNS resolution on your end. What is 206.248.154.22? Why is it not responding to DNS requests on UDP port 53 from your Caddy host?

This is the kind of problem that can cause a whole host of other problems, too, not just OCSP stapling on startup. What if it can’t resolve the LetsEncrypt/ZeroSSL endpoints? What if it can’t resolve reverse proxy upstreams?

If you absolutely must get Caddy up and running without fixing the problem, I think you can turn of OCSP stapling with a global option, see: Global options (Caddyfile) — Caddy Documentation.

2 Likes

It appears that for some reason, this specific box is having issues with DNS resolving. I have absolutely no idea why that is the case, and I don’t know where the heck it got that IP from. It was in /etc/resolv.conf, but even after changing that to 1.1.1.1 and restarting it’s trying to go to that unknown IP. Time to dig in and figure out why.

1 Like

One other thing of note - it was pointed out to me that, in fact, OCSP stapling is non-blocking.

That is to say, Caddy isn’t waiting on that process to finish. The stapling process happens in a goroutine. The server itself starts up immediately anyway, regardless of whether OCSP stapling is enabled or not, or whether it’s timing out or not. Caddy IS running.

That implies there’s something else weird going on if systemd thinks Caddy isn’t starting. The very fact it’s trying to manage stapling means it’s already started.

3 Likes

Proxmox had a DNS server set for all machines, and I have absolutely no idea where it came from. Not sure why this was the only machine that had things go wrong. Changing that DNS server has fixed the issue.

Thanks for pointing me in the right direction. I probably would have eventually stumbled upon that, but it would have taken me a while to notice that 206.248.154.22 was supposedly a nameserver.

1 Like

I’d be willing to bet that this Caddy was not setup in the proper way. This is an older setup that I’m doing the bare minimum for until I can entirely redo my stack. Hopefully it will last until then, and hopefully that time will come soon lol.

1 Like

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.