Let's Encrypt rate limit after update of container


(Tanguy ⧓ Herrmann) #1

I got the second email about TLS-SNI-01 today and thought I should do something about it, so, without reading the docs, I updated my container to the latest version, 0.11.2, from abiosoft.
It seems I should have waited for 0.11.3 :smiley: Anyway, I got rate limited by Let’s Encrypt. I have 22 domains that I use LE on, and the log is below.

My problem is that I don’t know how to get out of the rate limit, and I don’t know how to configure Caddy to avoid this scenario in the future.

I mean, this setup has been working pretty well for more than 3 years, so it’s more than OK; it’s just that I’m going to face downtime for some of my clients and I don’t know what to do.

log


(Sugarcube) #2

Caddy stores the certificates. If you use a container, they are normally stored inside the container. You have to make sure they are persisted outside of the container by mounting that path on the host file system.
I don’t know your container in particular, but the link you provided appears to explain exactly how to configure it.


(Nicolas) #3

If I were you, I would roll back to 0.11.1 until a new version comes out. That is what I did today, since I had the same error with the newest version.

I guess the update won’t take long, but until then the previous version will do fine.


(Tanguy ⧓ Herrmann) #4

Well, the container has worked for quite a few years, so I did persist the certificates in a volume as explained in the doc. (But I did get bitten by that back in the day.)
This case was a little different.

As @nicolinux suggested, I rolled back to 0.11.1, just in case. My method for actually avoiding the rate limit was to comment out my whole Caddyfile, uncomment a few sites, prioritizing essential services, and then uncomment a few more every few hours.
But that is not scalable. I will wait for more suggestions.
Shouldn’t Caddy handle the rate limit by itself, by requesting certificates hostname by hostname so as to avoid being rate limited (even if the service starts with only 3 domains active at first and enables more as it can)? Or should I use some special configuration to get a SAN certificate (grouping at least all the subdomains of the same domain in one cert)? Or maybe use a wildcard?
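
To make the manual staging above concrete, here is a minimal Go sketch of splitting a priority-ordered list of hostnames into small batches that could be enabled one batch at a time. This is not something Caddy does for you; `batchHosts`, the hostnames, and the batch size of 2 are all hypothetical, and the batch size is not derived from Let’s Encrypt’s actual limits.

```go
package main

import "fmt"

// batchHosts splits a priority-ordered list of hostnames into batches of at
// most size entries. Order is preserved, so the most important hosts end up
// in the first batch. It returns nil when size is not positive.
func batchHosts(hosts []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var batches [][]string
	for start := 0; start < len(hosts); start += size {
		end := start + size
		if end > len(hosts) {
			end = len(hosts)
		}
		batches = append(batches, hosts[start:end])
	}
	return batches
}

func main() {
	// Hypothetical hostnames, most important first.
	hosts := []string{
		"shop.example.com", "api.example.com", "www.example.com",
		"blog.example.com", "wiki.example.com",
	}
	// Each batch would be uncommented in the Caddyfile a few hours apart.
	for i, batch := range batchHosts(hosts, 2) {
		fmt.Printf("batch %d: %v\n", i+1, batch)
	}
}
```

Each batch corresponds to one round of uncommenting sites in the Caddyfile; how long to wait between rounds depends on which limit was hit.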


(Matt Holt) #5

You probably just hit this bug, which has already been patched on master: https://github.com/mholt/caddy/issues/2400 but it’s impossible to be sure without any actual log output.


(Matt Holt) #6

@dolanor @sugarcube @nicolinux Could you all help test this change to make sure it works for you? Otherwise we will be rolling out a release anyway, and it would not be good if it were still broken: https://github.com/mholt/caddy/pull/2452


(Tanguy ⧓ Herrmann) #7

I tried PR-2452, and it did in fact throttle correctly.
One way to improve it would be to log a message saying that Caddy is waiting before calling Let’s Encrypt again. I thought it hadn’t worked since no services were running, but I went to bed, and the next day everything was running fine. Looking through the log, about 2 hours after startup it resumed certificate generation and started the services once done.

...
2019/02/03 02:29:25 [INFO][sub.domain3.com] Obtain: Certificate already exists in storage
2019/02/03 03:29:21 [INFO][FileStorage:/root/.caddy] Scanning for stale OCSP staples
2019/02/03 03:29:21 [INFO][FileStorage:/root/.caddy] Done checking OCSP staples
2019/02/03 04:29:19 [INFO][FileStorage:/root/.caddy] Lock for 'cert_acme:domain.com:https://acme-v02.api.letsencrypt.org/directory' is stale; removing then retrying: 
ks/cert_acmedomain.comhttpsacme-v02.api.letsencrypt.orgdirectory.lock
2019/02/03 04:29:19 [INFO] [domain.com] acme: Obtaining bundled SAN certificate
...

I pushed my abiosoft/caddy-based image to my hub as dolanor/caddy:pr-2542 so others can try it.


(Nicolas) #8

Sorry, I did not have time to test. But I updated Caddy this morning on the server that was problematic, and it worked fine.

Thank you for the update!