I have a multi-tenant SaaS with ~400 domains that has been working amazingly well for years. Caddy data is stored in a managed Redis DB, as was recommended to me in this forum previously.
1. The problem I’m having:
Recently, it seems that the certificates need to be reissued every time my Caddy container is rebuilt/restarted, as if the certs are stored locally in the Docker container rather than in an offsite persistent location.
I see the caddy-tlsredis Caddy extension I’m using is now deprecated, and while I only noticed this issue today, I’m not actually sure when the problem started. Hosting only a few hundred low-traffic domains means that it’s possible Caddy has been quickly and quietly reissuing all ~400 certificates every time my Caddy container has been restarted, without hitting rate limits.
Today, though, I rebuilt the Caddy server a few times in a row, likely stacking up my CA requests, hitting the rate limits, and making this noticeable.
- I know it’s a third-party extension, but does anyone know if caddy-tlsredis stopped working?
- Would upgrading to caddy-storage-redis fix the issue?
- Did I never actually have persistent storage set up correctly in the first place?
2. Error messages and/or full log output:
Here’s a log entry from when a domain is inaccessible, even though the cert was previously issued and saved to Redis. I’m unsure which rate limiter I’m hitting. There’s no mention of Let’s Encrypt or ZeroSSL, so maybe it’s Caddy’s internal on-demand limiter?
2024/04/06 03:04:09.005 DEBUG events event {"name": "tls_get_certificate", "id": "41a4302a-ba3c-4eb6-8653-ac19a33e9546", "origin": "tls", "data": {"client_hello":{"CipherSuites":[49195,49199,49196,49200,52393,52392,49161,49171,49162,49172,156,157,47,53,49170,10,4865,4866,4867],"ServerName":"www.moonrovrland.com","SupportedCurves":[29,23,24,25],"SupportedPoints":"AA==","SignatureSchemes":[2052,1027,2055,2053,2054,1025,1281,1537,1283,1539,513,515],"SupportedProtos":null,"SupportedVersions":[772,771],"Conn":{}}}}
2024/04/06 03:04:09.005 DEBUG tls.handshake no matching certificates and no custom selection logic {"identifier": "www.moonrovrland.com"}
2024/04/06 03:04:09.005 DEBUG tls.handshake no matching certificates and no custom selection logic {"identifier": "*.moonrovrland.com"}
2024/04/06 03:04:09.005 DEBUG tls.handshake no matching certificates and no custom selection logic {"identifier": "*.*.com"}
2024/04/06 03:04:09.005 DEBUG tls.handshake no matching certificates and no custom selection logic {"identifier": "*.*.*"}
2024/04/06 03:04:09.011 DEBUG tls response from ask endpoint {"domain": "www.moonrovrland.com", "url": "http://dashboard:3000/api/domain.check?domain=www.moonrovrland.com", "status": 200}
2024/04/06 03:04:09.011 DEBUG http.stdlib http: TLS handshake error from 10.124.0.6:12950: certificate is not allowed for server name www.moonrovrland.com: decision func: on-demand rate limit exceeded
And when the request for the domain is made some time later, presumably when the rate limiter catches up:
2024/04/06 03:06:41.775 DEBUG events event {"name": "tls_get_certificate", "id": "bfc43e86-c1df-40f5-a753-3ace00552269", "origin": "tls", "data": {"client_hello":{"CipherSuites":[35466,4865,4866,4867,49195,49199,49196,49200,52393,52392,49171,49172,156,157,47,53],"ServerName":"www.moonrovrland.com","SupportedCurves":[31354,29,23,24],"SupportedPoints":"AA==","SignatureSchemes":[1027,2052,1025,1283,2053,1281,2054,1537],"SupportedProtos":["h2","http/1.1"],"SupportedVersions":[23130,772,771],"Conn":{}}}}
2024/04/06 03:06:41.776 DEBUG tls.handshake no matching certificates and no custom selection logic {"identifier": "www.moonrovrland.com"}
2024/04/06 03:06:41.776 DEBUG tls.handshake no matching certificates and no custom selection logic {"identifier": "*.moonrovrland.com"}
2024/04/06 03:06:41.776 DEBUG tls.handshake no matching certificates and no custom selection logic {"identifier": "*.*.com"}
2024/04/06 03:06:41.776 DEBUG tls.handshake no matching certificates and no custom selection logic {"identifier": "*.*.*"}
2024/04/06 03:06:41.782 DEBUG tls response from ask endpoint {"domain": "www.moonrovrland.com", "url": "http://dashboard:3000/api/domain.check?domain=www.moonrovrland.com", "status": 200}
2024/04/06 03:06:41.782 DEBUG tls.handshake all external certificate managers yielded no certificates and no errors {"remote_ip": "10.124.0.6", "remote_port": "18436", "sni": "www.moonrovrland.com"}
2024/04/06 03:06:41.786 DEBUG tls loading managed certificate {"domain": "www.moonrovrland.com", "expiration": "2024/05/20 04:00:07.000", "issuer_key": "acme-v02.api.letsencrypt.org-directory", "storage": "{\"address\":\"REDACTED\",\"host\":\"REDACTED\",\"port\":\"REDACTED\",\"db\":0,\"username\":\"default\",\"password\":\"REDACTED\",\"timeout\":5,\"key_prefix\":\"caddytls\",\"value_prefix\":\"caddy-storage-redis\",\"aes_key\":\"\",\"tls_enabled\":true,\"tls_insecure\":true}"}
2024/04/06 03:06:41.914 DEBUG tls.cache added certificate to cache {"subjects": ["www.moonrovrland.com"], "expiration": "2024/05/20 04:00:07.000", "managed": true, "issuer_key": "acme-v02.api.letsencrypt.org-directory", "hash": "7a8eb46e1ba4f8f5924ae929f18dc7b45f785f1412e20bc4421fde94230afd21", "cache_size": 532, "cache_capacity": 10000}
2024/04/06 03:06:41.914 DEBUG events event {"name": "cached_managed_cert", "id": "8d005111-9aa5-4adb-9a80-0e2de4262688", "origin": "tls", "data": {"sans":["www.moonrovrland.com"]}}
2024/04/06 03:06:41.914 DEBUG tls.handshake loaded certificate from storage {"remote_ip": "10.124.0.6", "remote_port": "18436", "subjects": ["www.moonrovrland.com"], "managed": true, "expiration": "2024/05/20 04:00:07.000", "hash": "7a8eb46e1ba4f8f5924ae929f18dc7b45f785f1412e20bc4421fde94230afd21"}
2024/04/06 03:06:41.935 DEBUG http.handlers.reverse_proxy selected upstream {"dial": "website:3000", "total_upstreams": 1}
2024/04/06 03:06:42.050 DEBUG http.handlers.reverse_proxy upstream roundtrip {"upstream": "website:3000", "duration": 0.114481533, "request": {"remote_ip": "10.124.0.6", "remote_port": "18436", "client_ip": "10.124.0.6", "proto": "HTTP/2.0", "method": "GET", "host": "www.moonrovrland.com", "uri": "/", "headers": {"Sec-Fetch-Dest": ["document"], "Sec-Ch-Ua": ["\"Google Chrome\";v=\"123\", \"Not:A-Brand\";v=\"8\", \"Chromium\";v=\"123\""], "X-Forwarded-Host": ["www.moonrovrland.com"], "Sec-Ch-Ua-Platform": ["\"macOS\""], "Accept-Language": ["en-US,en;q=0.9"], "Accept": ["text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7"], "Accept-Encoding": ["gzip, deflate, br, zstd"], "Sec-Fetch-User": ["?1"], "Sec-Fetch-Mode": ["navigate"], "Sec-Ch-Ua-Mobile": ["?0"], "Upgrade-Insecure-Requests": ["1"], "User-Agent": ["Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36"], "Sec-Fetch-Site": ["none"], "X-Forwarded-For": ["10.124.0.6"], "X-Forwarded-Proto": ["https"], "Cache-Control": ["max-age=0"]}, "tls": {"resumed": false, "version": 772, "cipher_suite": 4865, "proto": "h2", "server_name": "www.moonrovrland.com"}}, "headers": {"Content-Encoding": ["gzip"], "Connection": ["keep-alive"], "Keep-Alive": ["timeout=5"], "X-Ratelimit-Remaining": ["946"], "Etag": ["W/\"15cc8-BnquPNU4T/HRaoyoyBACt/afwVc\""], "Vary": ["Accept-Encoding"], "Content-Type": ["text/html; charset=utf-8"], "X-Ratelimit-Limit": ["1000"], "Date": ["Sat, 06 Apr 2024 03:06:41 GMT"], "X-Ratelimit-Reset": ["1712372843"]}, "status": 200}
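To see how widespread this is across the whole log rather than for a single domain, a rough tally helps. The following is a minimal Python sketch, not something Caddy provides: it counts, per server name, how many handshakes were rejected with the rate-limit message and how many ended with a certificate being loaded from storage. The two match strings are copied from the excerpts above; the function name `summarize` and the regular expressions are my own assumptions about the line format.

```python
import re
from collections import Counter

# Substrings copied verbatim from the two log excerpts above.
RATE_LIMITED = "on-demand rate limit exceeded"
LOADED = "loaded certificate from storage"


def summarize(lines):
    """Count rate-limit rejections and storage loads per server name."""
    rejected, loaded = Counter(), Counter()
    for line in lines:
        if RATE_LIMITED in line:
            # e.g. "... not allowed for server name www.example.com: decision func: ..."
            m = re.search(r"server name (\S+?):", line)
            if m:
                rejected[m.group(1)] += 1
        elif LOADED in line:
            # e.g. '... "subjects": ["www.example.com"], ...'
            m = re.search(r'"subjects": \["([^"]+)"', line)
            if m:
                loaded[m.group(1)] += 1
    return rejected, loaded
```

It can be fed the log file directly, for example `rejected, loaded = summarize(open("/var/log/caddy/access.log"))`, using the log path from the Caddyfile. Domains that appear in `rejected` and later in `loaded` would be ones whose certificates were already in storage but were held back for a while.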
While I’m not very familiar with Redis, Caddy is the only thing using Redis and the DB is full of keys that reference the domains, so I assume things are getting saved there.
3. Caddy version:
v2.7.4
I would test with 2.7.6, but doing so would require a container restart and risks breaking all the hosted websites again. Hoping to get some insight before having to do that.
4. How I installed and ran Caddy:
It’s running in a dedicated Docker container, as a reverse proxy, and managed by Docker Compose (docker-compose.yml included in a later answer). I’m using the aforementioned Redis extension + Cloudflare for wildcard subdomain cert management.
# Dockerfile
# Start with Caddy Builder
FROM caddy:2.7.4-builder-alpine AS builder
# Set up Cloudflare and TLS Redis plugins
RUN xcaddy build \
    --with github.com/caddy-dns/cloudflare@a9d3ae2690a1d232bc9f8fc8b15bd4e0a6960eec \
    --with github.com/gamalan/caddy-tlsredis@master
# Copy the custom build into the final Caddy image
FROM caddy:2.7.4-alpine
COPY --from=builder /usr/bin/caddy /usr/bin/caddy
a. System environment:
The container OS/version/etc should be explained by the Docker config files in the adjacent answers, but the host machine’s specs are:
- Ubuntu: 20.04.2
- Docker: 20.10.7
- Docker Compose: 1.27.4
b. Command:
Outside the Docker configs already mentioned, the commands should be taken care of by the Docker image.
c. Service/unit/compose file:
# docker-compose.yml
version: '3.5'
services:
  server:
    build: ./server
    restart: unless-stopped
    container_name: server
    env_file:
      - ./.env
    ports:
      - 80:80
      - 443:443
    volumes:
      - ./server/caddy-${ENV}:/etc/caddy # Config
  website: REDACTED
  dashboard: REDACTED
d. My complete Caddy config:
# Caddyfile
{
	email REDACTED
	admin 0.0.0.0:2019
	on_demand_tls {
		ask http://dashboard:3000/api/domain.check
		interval 2m
		burst 5
	}
	storage redis {
		host {$REDIS_HOST}
		port {$REDIS_PORT}
		username {$REDIS_USERNAME}
		password {$REDIS_PASSWORD}
		db {$REDIS_DB}
		tls_enabled {$REDIS_TLS}
	}
	log {
		output file /var/log/caddy/access.log {
			roll_size 100MiB
			roll_uncompressed
			roll_keep 5
			roll_keep_for 48h
		}
		format console
		level DEBUG
	}
}
(headers) {
	header -x-powered-by
}
# Subdomains
*.REDACTED.com {
	tls {
		dns cloudflare {$CLOUDFLARE_TOKEN}
	}
	import headers
	reverse_proxy website:3000
}
# Custom Domains
https:// {
	tls {
		on_demand
	}
	import headers
	reverse_proxy website:3000
}
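As a back-of-the-envelope check on the `interval 2m` and `burst 5` settings above: if the on-demand limiter admits at most `burst` decisions per `interval`, and if every domain's first handshake after a restart consumes one of them even when the certificate is already in storage (an assumption based on the first log excerpt, not something I have confirmed), then ~400 domains would take a long time to come back. A minimal Python sketch of that arithmetic follows; `minutes_to_admit` is a hypothetical helper, not part of Caddy.

```python
import math


def minutes_to_admit(domains, burst, interval_minutes):
    """Minutes until the last of `domains` first-time handshakes is admitted.

    Assumes at most `burst` admissions per `interval_minutes` window, that the
    first window opens immediately, and that each domain needs exactly one
    admission. This is a simplified model, not Caddy's actual limiter code.
    """
    windows = math.ceil(domains / burst)
    return (windows - 1) * interval_minutes


# Worst case for ~400 domains all requested right after a restart.
print(minutes_to_admit(400, 5, 2))  # 158
```

Under those assumptions the last domain would wait roughly two and a half hours, which would look a lot like certificates being reissued even if they were sitting in Redis the whole time.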