Thanks for the reply @Whitestrake. The example test command was helpful, especially the bit about the staging server.
I did as suggested, using a copy of the Caddyfile for testing. 9 out of 11 domains obtained certificates via the DNS challenge without incident, but the other 2 domains had what seemed like transient issues with Cloudflare’s API:
2018/05/02 21:13:33 [example1.com] failed to get certificate: acme: Error 403 - urn:ietf:params:acme:error:unauthorized - No TXT record found at _acme-challenge.example1.com
2018/05/02 21:17:36 [www.example2.com] failed to get certificate: acme: Error 400 - urn:ietf:params:acme:error:dns - DNS problem: NXDOMAIN looking up TXT for _acme-challenge.www.example2.com
At this point, Caddy fails to start and drops me back into my shell. Since this was only a test using the CLI, I re-ran the command, and it obtained those 2 certificates without issue.
However, this has not been the most reassuring test. 2 out of 11 domains failed on the first run, due to issues I can’t really control, and Caddy apparently just stops, as opposed to serving traffic for the 9 domains it obtained certs for.
Also, at this point, my production instance of Caddy is still using the existing certs obtained with HTTP or TLS-SNI challenges. And you seem to be telling me there’s nothing I can do about that (using Caddy’s toolset) until the certs expire? This seems problematic for a few reasons:
1. I haven’t truly tested the production infrastructure, e.g. I used the ACME staging servers.
2. I’d like to know the production configuration will work now, while all this is fresh in my mind, not in 60 days.
3. If a problem does arise when the production configuration hits the natural renewal cycle, I won’t be sitting in front of the server watching logs, and from my test, it seems like Caddy will fail and stop serving traffic for all domains. (I will enable monitoring, but let’s treat that as an aside.)
For #3, perhaps systemd will restart the service? I’m not sure, and this seems like another reason to test the production configuration now.
So is there really no way to force renewal of the certificates? Certbot has this option via the “--force-renewal” argument, which I’ve successfully used in the past. I am aware of the caveats with regard to rate limits.
I think I can probably hack around this by deleting the existing certificates on disk, but I’d rather not resort to that if possible.
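For reference, a minimal sketch (Python, standard library only) of checking which certificate a production host is currently serving and how many days remain before it expires. The function names are hypothetical, and this only inspects the served certificate; it neither triggers nor tests a renewal. It also assumes the certificate is publicly trusted, so it would reject one issued by the staging servers.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter timestamp.

    `not_after` is in the format used by ssl.getpeercert(),
    e.g. "Jul  1 12:00:00 2018 GMT". `now` must be timezone-aware.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def served_cert_info(host: str, port: int = 443) -> dict:
    """Connect to host:port and report the issuer and remaining lifetime
    of the certificate it presents (hypothetical helper)."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    # "issuer" is a tuple of RDNs, each holding (key, value) pairs.
    issuer = dict(rdn[0] for rdn in cert["issuer"])
    return {
        "issuer": issuer.get("organizationName"),
        "not_after": cert["notAfter"],
        "days_left": days_until_expiry(
            cert["notAfter"], datetime.now(timezone.utc)
        ),
    }
```

Calling `served_cert_info("example1.com")` for each domain in the Caddyfile would show whether the old certificates are still in place and roughly when the natural renewal window will arrive.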