Background
I have multiple servers within my homelab serving different traffic over https. Even though I already have a separate network for trusted traffic, I would like to lean more towards a zero trust approach and preferably encrypt all traffic within the network. Therefore, I need to run multiple reverse proxies (caddy) to terminate tls on all those servers.
It would also be great to have a single server that manages the certificates and distributes them to the different reverse proxies, to ease the management workload. I also prefer to use wildcard certificates from a privacy standpoint: fewer subdomains (and hence the names of the services that I’m running) will leak into certificate transparency logs.
Currently, I run an LXC container with acme.sh, which obtains wildcard certificates using the DNS challenge for a domain hosted on cloudflare. The certificates obtained are served over SFTP. The individual reverse proxies (clients) are configured to periodically fetch the certificates over SFTP, install them to a predefined location and run the commands for reloading the reverse proxies. The client configuration is automated with ansible.
Yesterday, I was casually browsing the caddy documentation and came across this: tls (Caddyfile directive) — Caddy Documentation
With the get_certificate http module of the tls directive, I can have caddy get externally managed certificates from an http(s) endpoint instead of a file. That seemed interesting and prompted me to search further. There were few posts and articles about this feature. I could only find this forum post: Get_certificate locally hosted acme.sh/lighttpd
The post showed a PHP implementation of a server handling the requests for certificates from caddy’s get_certificate http directive. I don’t like PHP. I would also like to implement https (so that the traffic for obtaining the certificate and key is encrypted) and auth (so that only the servers that need the certificates can get them). Maybe FastAPI + Caddy + acme.sh? Well, after a full day of further digging into the documentation and prototyping, I found that Caddy can do it all: certificate management and serving the certificate via https with auth. I have implemented a test server using Caddy. I think it is worth sharing the implementation as it showcases the use of request matchers, placeholders and variables, and handlers.
Implementation
Certificate Management and Distribution Server
- Using a Debian Trixie LXC container on Proxmox, install the distribution’s caddy using sudo apt install caddy
- Download a custom build of caddy with the dns.providers.cloudflare module and set up a dpkg divert as described in Build from source — Caddy Documentation
- Obtain an API token from cloudflare for managing the DNS entries of the domain. Edit the systemd service of Caddy to pass the token as an environment variable into caddy with the instructions here: Keep Caddy Running — Caddy Documentation
- Edit the Caddyfile (/etc/caddy/Caddyfile) as below, replacing example.com with the domain:
# Obtain certificates for both the base and the wildcard domain
# Request matchers and handlers are configured to handle only acme.example.com for certificate requests
# See https://caddyserver.com/docs/caddyfile/patterns#wildcard-certificates
example.com, *.example.com {
    # TLS configuration for using Cloudflare and the DNS challenge to obtain the certificate
    # The resolvers option is set to query cloudflare DNS directly instead of the DNS server of the local network
    tls {
        dns cloudflare {env.CF_API_TOKEN}
        resolvers 1.1.1.1 1.0.0.1
    }
    # match requests with host acme.example.com
    @host_acme_example_com {
        host acme.example.com
    }
    # match query parameter authkey
    # replace the authkey with a real random string!
    @authorized query authkey=AVeryLongRandomStringUsedAsAnAuthKey
    # a map that sets the domain_path placeholder
    # according to the server_name query parameter set by caddy
    # when requested via the get_certificate http directive
    # note that by prepending the source with ~, regex matching is used
    # this allows mapping the wildcard certificate to all subdomains
    # add more mappings here as required
    map {query.server_name} {domain_path} {
        example.com example.com
        ~.+\.example\.com wildcard_.example.com
    }
    vars certs_base_path ".local/share/caddy/certificates/acme-v02.api.letsencrypt.org-directory"
    # If the domain_path placeholder is set by the above map directive
    # then the request has a server name whose certificate is managed by our server
    @is_server_name_managed vars_regexp {domain_path} .+
    # Serve a static response with template
    header Content-Type text/plain
    templates
    # The handles that encapsulate the logic of the server
    # If the request has the correct host, and the correct authkey and server_name query params
    # then respond with the certificate and key concatenated together
    # The certificate and key files are stored in Caddy's data directory
    # as per https://caddyserver.com/docs/conventions#data-directory
    # If the server_name is not one of the domains managed by us
    # respond 204 as per https://caddyserver.com/docs/modules/tls.get_certificate.http
    handle @host_acme_example_com {
        handle @authorized {
            handle @is_server_name_managed {
                respond <<EOF
                    {{ printf "%s/%s/%s/%[3]s.crt" (env "HOME") (ph "http.vars.certs_base_path") (ph "domain_path") | readFile }}
                    {{ printf "%s/%s/%s/%[3]s.key" (env "HOME") (ph "http.vars.certs_base_path") (ph "domain_path") | readFile }}
                    EOF 200
            }
            handle {
                respond 204
            }
        }
        handle {
            respond "Not Authorized" 403
        }
    }
    # Fallback for otherwise unhandled domains
    handle {
        abort
    }
}
- Restart caddy (sudo systemctl restart caddy)
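To make the lookup done by the map directive and the template easier to follow, here is a minimal Python sketch of the same two steps: map the server_name to a directory name, then build the paths of the .crt and .key files. It is only an illustration and not part of the setup. The function names are made up for this example, the home directory is passed in as a parameter because its value depends on how caddy is installed, and the sketch matches the whole name with fullmatch, which is slightly stricter than an unanchored regex.

```python
import posixpath
import re
from typing import Optional, Tuple

# Same value as the certs_base_path variable in the Caddyfile above
CERTS_BASE_PATH = ".local/share/caddy/certificates/acme-v02.api.letsencrypt.org-directory"

# Mirrors the map block: (pattern, directory name) pairs, checked in order
MAPPINGS = [
    (re.compile(r"example\.com"), "example.com"),
    (re.compile(r".+\.example\.com"), "wildcard_.example.com"),
]


def domain_path(server_name: str) -> Optional[str]:
    """Return the certificate directory for server_name, or None if it is not managed."""
    for pattern, directory in MAPPINGS:
        if pattern.fullmatch(server_name):
            return directory
    return None


def cert_files(home: str, server_name: str) -> Optional[Tuple[str, str]]:
    """Return the (.crt, .key) paths the template would read, or None (the 204 case)."""
    directory = domain_path(server_name)
    if directory is None:
        return None
    # The file name repeats the directory name, like %[3]s in the template
    base = posixpath.join(home, CERTS_BASE_PATH, directory, directory)
    return base + ".crt", base + ".key"
```

Assuming the home directory of the caddy user is /var/lib/caddy, cert_files("/var/lib/caddy", "anotherapp.example.com") points into the wildcard_.example.com directory, while a name outside example.com returns None, which corresponds to the 204 response.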
Certificate distribution client
# Example client Caddyfile
{
    # Disable certificate management from caddy
    # Seems that caddy will try to obtain an LE certificate in some scenarios
    # Like when the get_certificate endpoint fails
    auto_https disable_certs
}
(my_tls) {
    tls {
        # on_demand only required for caddy version < 2.7.0
        on_demand
        get_certificate http https://acme.example.com/?authkey=AVeryLongRandomStringUsedAsAnAuthKey
    }
}
example.com, www.example.com {
    reverse_proxy localhost:8000
    import my_tls
}
anotherapp.example.com {
    reverse_proxy localhost:8001
    import my_tls
}
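When debugging, it can help to query the certificate server by hand in the same way the client does. The sketch below is a hedged example in Python using only the standard library: it sends the authkey and server_name query parameters (the real client adds further parameters describing the handshake, which the server above ignores) and splits the response into its PEM blocks. The function names are hypothetical, and the URL and key are the placeholders from the Caddyfiles above.

```python
import re
import urllib.parse
import urllib.request
from typing import Dict, List, Optional

# A PEM block: BEGIN line, base64 body, END line with the same label
PEM_BLOCK = re.compile(r"-----BEGIN ([A-Z0-9 ]+)-----.*?-----END \1-----", re.DOTALL)


def split_pem(body: str) -> Dict[str, List[str]]:
    """Group the PEM blocks found in body by their label, keeping their order."""
    blocks: Dict[str, List[str]] = {}
    for match in PEM_BLOCK.finditer(body):
        blocks.setdefault(match.group(1), []).append(match.group(0))
    return blocks


def fetch_certificate(endpoint: str, authkey: str, server_name: str) -> Optional[Dict[str, List[str]]]:
    """Ask the certificate server for server_name.

    Returns the PEM blocks on 200 and None on 204 (name not managed).
    A wrong authkey makes urlopen raise urllib.error.HTTPError (403).
    """
    query = urllib.parse.urlencode({"authkey": authkey, "server_name": server_name})
    with urllib.request.urlopen(f"{endpoint}?{query}", timeout=10) as response:
        if response.status == 204:
            return None
        return split_pem(response.read().decode("ascii"))
```

Calling fetch_certificate("https://acme.example.com/", "AVeryLongRandomStringUsedAsAnAuthKey", "www.example.com") should return a dictionary with one or more CERTIFICATE blocks and one private key block if the server is set up as described.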
Epilogue
I suppose this fits better in the Wiki category than in Showcase, but this is my first post in this forum so I can’t post there yet. It is a showcase too anyway.
As this is a PoC done within a day, there are improvements to be made and thus suggestions are welcome. In particular, the authkey could be stored in an environment variable instead of being hardcoded in the Caddyfile. There might be ways to hash it apart from using basic auth, but I haven’t looked into that yet.
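On the point of replacing the placeholder authkey with a real random string, one way to generate it is Python's secrets module. This is a small sketch, not something the setup depends on; the function name and the 48-byte length are arbitrary choices for this example. token_urlsafe only emits letters, digits, - and _, so the result can go into the query string without escaping.

```python
import secrets


def new_authkey(nbytes: int = 48) -> str:
    """Return a random URL-safe string built from nbytes of OS randomness."""
    return secrets.token_urlsafe(nbytes)
```

Running print(new_authkey()) once and using the output on both the server and the clients is enough; wherever the key ends up being stored, it stays a shared secret and should be treated like a password.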
PS
I just noticed that the “client” caddy won’t cache the certificates obtained with get_certificate and will send a request to the “certificate server” every time it receives a request itself.
Two issues stem from this:
- Performance impact, as an additional request is made on every request
- If the certificate server is down, none of the “client” caddy servers can serve any requests
Using the cache-handler plugin as mentioned here may be a solution, but that involves a custom module, which complicates the setup.
Another alternative is to periodically download the certificate from the “certificate server” outside of caddy and serve it via local http within caddy.
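The periodic download could look roughly like the following Python sketch, run from cron or a systemd timer. It fetches the combined certificate and key from the certificate server and replaces a local file atomically, so a failed download leaves the previous copy in place. The function names and the target path are hypothetical, and how the local file is then served to caddy is left out.

```python
import os
import tempfile
import urllib.parse
import urllib.request


def write_atomic(path: str, data: bytes, mode: int = 0o600) -> None:
    """Write data to a temporary file next to path, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.chmod(tmp, mode)  # the file contains the private key
        os.replace(tmp, path)  # atomic on the same filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def refresh(endpoint: str, authkey: str, server_name: str, target: str) -> bool:
    """Download certificate plus key for server_name into target.

    Returns False when the server answers 204 (name not managed).
    Network errors and 403 propagate as exceptions; target is untouched then.
    """
    query = urllib.parse.urlencode({"authkey": authkey, "server_name": server_name})
    with urllib.request.urlopen(f"{endpoint}?{query}", timeout=10) as response:
        if response.status != 200:
            return False
        body = response.read()
    # Cheap sanity check before overwriting a working certificate
    if b"-----BEGIN CERTIFICATE-----" not in body or b"PRIVATE KEY-----" not in body:
        raise ValueError("response does not look like a PEM certificate plus key")
    write_atomic(target, body)
    return True
```

A call such as refresh("https://acme.example.com/", authkey, "www.example.com", "/etc/caddy/certs/www.example.com.pem") would be made once per hostname; since the file is only replaced after a complete and plausible download, the client keeps working with the last good certificate while the certificate server is down.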