Why Caddy's ACME integration is better than separate utilities and cron jobs

I had an interesting experience “in the wild” today that illustrates an important reason why Caddy’s ACME support is better than other ACME clients in a tangible, measurable way that results in more uptime and reliability.

I needed to create a www. subdomain for a site I manage. After adding the DNS records (A and AAAA) at Cloudflare, I updated the site’s Caddyfile with a www. site:

www.example.com {
    redir https://example.com{uri}
}

I reloaded Caddy and saw that… I couldn’t connect to the site. But it wasn’t a cert error: it was a DNS lookup error (NXDOMAIN).

Huh. I had added and changed records on Cloudflare before, with nearly instantaneous results. I guess I forgot that DNS records take time to propagate sometimes.

But maybe it was just my network.

I logged into the server and tried a dig and was disappointed to see that it also saw NXDOMAIN.

The TTL was 1m. :frowning: Wasn’t really sure what else to do other than wait.

Then I checked the Caddy logs. Sure enough, it failed to obtain a certificate, because Let’s Encrypt and ZeroSSL also got NXDOMAIN.

But that’s where the magic was happening. Caddy was backing off and retrying. And not just any ordinary backoff, either: it retries against CAs that aren’t encumbered by rate limits. When Caddy gets an error from Let’s Encrypt, it switches to LE’s staging endpoint while it retries. This avoids hitting the “too many failed authorizations recently” rate limit on the production endpoint! As far as I know, only Caddy does this by default. (Correct me if that’s now changed.)

I simply let it keep running while I waited for the DNS to update.

And several hours later, I noticed the DNS record was up.

Shortly after that, Caddy’s retry loop woke from sleep and tried again, and sure enough, it succeeded on LE staging, so it immediately went back to LE’s production CA and got a certificate successfully. :tada:

Here’s the end of the log, where it failed, and then later succeeded:

Nov 03 20:58:32 localhost caddy[743]: {"level":"error","ts":1699045112.9578817,"logger":"http.acme_client","msg":"validating authorization","identifier":"www.example.com","problem":{"type":"urn:ietf:params:acme:error:dns","title":"","detail":"DNS problem: NXDOMAIN looking up A for www.example.com - check that a DNS record exists for this domain; DNS problem: NXDOMAIN looking up AAAA for www.example.com - check that a DNS record exists for this domain","instance":"","subproblems":[]},"order":"https://acme-staging-v02.api.letsencrypt.org/acme/order/123456/12018241794","attempt":2,"max_attempts":3}
Nov 03 20:58:32 localhost caddy[743]: {"level":"error","ts":1699045112.9579022,"logger":"tls.obtain","msg":"could not get certificate from issuer","identifier":"www.example.com","issuer":"acme-v02.api.letsencrypt.org-directory","error":"HTTP 400 urn:ietf:params:acme:error:dns - DNS problem: NXDOMAIN looking up A for www.example.com - check that a DNS record exists for this domain; DNS problem: NXDOMAIN looking up AAAA for www.example.com - check that a DNS record exists for this domain"}
Nov 03 20:58:34 localhost caddy[743]: {"level":"info","ts":1699045114.250768,"logger":"http.acme_client","msg":"trying to solve challenge","identifier":"www.example.com","challenge_type":"http-01","ca":"https://acme.zerossl.com/v2/DV90"}
Nov 03 20:58:40 localhost caddy[743]: {"level":"error","ts":1699045120.4625328,"logger":"http.acme_client","msg":"challenge failed","identifier":"www.example.com","challenge_type":"http-01","problem":{"type":"","title":"","detail":"","instance":"","subproblems":[]}}
Nov 03 20:58:40 localhost caddy[743]: {"level":"error","ts":1699045120.462563,"logger":"http.acme_client","msg":"validating authorization","identifier":"www.example.com","problem":{"type":"","title":"","detail":"","instance":"","subproblems":[]},"order":"https://acme.zerossl.com/v2/DV90/order/mTz9ooJGs07doKiMwtk8cg","attempt":1,"max_attempts":3}
Nov 03 20:58:40 localhost caddy[743]: {"level":"error","ts":1699045120.462581,"logger":"tls.obtain","msg":"could not get certificate from issuer","identifier":"www.example.com","issuer":"acme.zerossl.com-v2-DV90","error":"HTTP 0  - "}
Nov 03 20:58:40 localhost caddy[743]: {"level":"error","ts":1699045120.4626012,"logger":"tls.obtain","msg":"will retry","error":"[www.example.com] Obtain: [www.example.com] solving challenge: www.example.com: [www.example.com] authorization failed: HTTP 0  -  (ca=https://acme.zerossl.com/v2/DV90)","attempt":11,"retrying_in":10800,"elapsed":10913.648692574,"max_duration":2592000}
Nov 03 23:58:40 localhost caddy[743]: {"level":"info","ts":1699055920.4628372,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"www.example.com"}
Nov 03 23:58:40 localhost caddy[743]: {"level":"info","ts":1699055920.7463784,"logger":"http.acme_client","msg":"trying to solve challenge","identifier":"www.example.com","challenge_type":"http-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
Nov 03 23:58:40 localhost caddy[743]: {"level":"info","ts":1699055920.9155586,"logger":"http","msg":"served key authentication","identifier":"www.example.com","challenge":"http-01","remote":"[2600:1f16:13c:c400:6c17:1c15:6cae:9f75]:56616","distributed":false}
Nov 03 23:58:40 localhost caddy[743]: {"level":"info","ts":1699055920.9231799,"logger":"http","msg":"served key authentication","identifier":"www.example.com","challenge":"http-01","remote":"[2600:1f14:a8b:500:d286:f81d:2bc4:5af5]:19304","distributed":false}
Nov 03 23:58:41 localhost caddy[743]: {"level":"info","ts":1699055921.0895143,"logger":"http","msg":"served key authentication","identifier":"www.example.com","challenge":"http-01","remote":"[2600:3000:2710:300::21]:11630","distributed":false}
Nov 03 23:58:41 localhost caddy[743]: {"level":"info","ts":1699055921.3745823,"logger":"http.acme_client","msg":"authorization finalized","identifier":"www.example.com","authz_status":"valid"}
Nov 03 23:58:41 localhost caddy[743]: {"level":"info","ts":1699055921.3746095,"logger":"http.acme_client","msg":"validations succeeded; finalizing order","order":"https://acme-staging-v02.api.letsencrypt.org/acme/order/123456/7890"}
Nov 03 23:58:44 localhost caddy[743]: {"level":"info","ts":1699055924.5692718,"logger":"http.acme_client","msg":"successfully downloaded available certificate chains","count":2,"first_url":"https://acme-staging-v02.api.letsencrypt.org/acme/cert/f00b4d"}
Nov 03 23:58:45 localhost caddy[743]: {"level":"info","ts":1699055925.0794523,"logger":"http","msg":"waiting on internal rate limiter","identifiers":["www.example.com"],"ca":"https://acme-v02.api.letsencrypt.org/directory","account":"caddy@zerossl.com"}
Nov 03 23:58:45 localhost caddy[743]: {"level":"info","ts":1699055925.0794888,"logger":"http","msg":"done waiting on internal rate limiter","identifiers":["www.example.com"],"ca":"https://acme-v02.api.letsencrypt.org/directory","account":"caddy@zerossl.com"}
Nov 03 23:58:45 localhost caddy[743]: {"level":"info","ts":1699055925.2289474,"logger":"http.acme_client","msg":"trying to solve challenge","identifier":"www.example.com","challenge_type":"http-01","ca":"https://acme-v02.api.letsencrypt.org/directory"}
Nov 03 23:58:45 localhost caddy[743]: {"level":"info","ts":1699055925.4423006,"logger":"http","msg":"served key authentication","identifier":"www.example.com","challenge":"http-01","remote":"[2600:1f16:269:da02:b2c2:bd01:e48:fbc9]:46942","distributed":false}
Nov 03 23:58:45 localhost caddy[743]: {"level":"info","ts":1699055925.4966943,"logger":"http","msg":"served key authentication","identifier":"www.example.com","challenge":"http-01","remote":"[2600:3000:2710:200::84]:56743","distributed":false}
Nov 03 23:58:45 localhost caddy[743]: {"level":"info","ts":1699055925.5315127,"logger":"http","msg":"served key authentication","identifier":"www.example.com","challenge":"http-01","remote":"[2600:1f14:804:fd02:1c4d:e9ea:6941:fe53]:25032","distributed":false}
Nov 03 23:58:45 localhost caddy[743]: {"level":"info","ts":1699055925.8528426,"logger":"http.acme_client","msg":"authorization finalized","identifier":"www.example.com","authz_status":"valid"}
Nov 03 23:58:45 localhost caddy[743]: {"level":"info","ts":1699055925.8528686,"logger":"http.acme_client","msg":"validations succeeded; finalizing order","order":"https://acme-v02.api.letsencrypt.org/acme/order/1234/56789"}
Nov 03 23:58:46 localhost caddy[743]: {"level":"info","ts":1699055926.3509934,"logger":"http.acme_client","msg":"successfully downloaded available certificate chains","count":2,"first_url":"https://acme-v02.api.letsencrypt.org/acme/cert/f00b4d"}
Nov 03 23:58:46 localhost caddy[743]: {"level":"info","ts":1699055926.3522024,"logger":"tls.obtain","msg":"certificate obtained successfully","identifier":"www.example.com"}
Nov 03 23:58:46 localhost caddy[743]: {"level":"info","ts":1699055926.3528235,"logger":"tls.obtain","msg":"releasing lock","identifier":"www.example.com"}

I think that’s a good example of how a production-ready ACME client should behave, and it’s something that all Caddy sites get for free, automatically, without any configuration. I hope it serves you well!


As a user, my experience is that provisioning HTTPS usually takes 30 seconds to several minutes on other platforms. I am curious how Caddy does it so fast.

When I use Caddy, it’s like subseconds. Is ACME integration the secret sauce?

(I also replied on Twitter)

At minimum, ACME clients are constrained by the speed of the CA. Let’s Encrypt is very fast; it’s usually done within a couple of seconds.

ACME is a protocol that involves polling. It depends how the clients implement their polling. Caddy (CertMagic/acmez) tries pretty quickly with exponential backoff and honors the Retry-After header. That could be another reason.

Otherwise, I’m not sure why other ACME clients would take that long. :man_shrugging: