Debugging letsencrypt vs zerossl

1. The problem I’m having:

I got an email from letsencrypt that my cert is expiring. And I was like “what?”

Sure, the site is currently down - but caddy is available.

So I checked the cert:

curl -v https://paleocoran.de
* Host paleocoran.de:443 was resolved.
* IPv6: (none)
* IPv4: 188.34.187.82
*   Trying 188.34.187.82:443...
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256 / x25519 / id-ecPublicKey
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=paleocoran.de
*  start date: Dec  2 00:00:00 2024 GMT
*  expire date: Mar  2 23:59:59 2025 GMT
*  subjectAltName: host "paleocoran.de" matched cert's "paleocoran.de"
*  issuer: C=AT; O=ZeroSSL; CN=ZeroSSL ECC Domain Secure Site CA
*  SSL certificate verify ok.
*   Certificate level 0: Public key type EC/prime256v1 (256/128 Bits/secBits), signed using ecdsa-with-SHA384
*   Certificate level 1: Public key type EC/secp384r1 (384/192 Bits/secBits), signed using ecdsa-with-SHA384
*   Certificate level 2: Public key type EC/secp384r1 (384/192 Bits/secBits), signed using ecdsa-with-SHA384

And there it says “ZeroSSL” not “Let’s Encrypt”.

Why? I might have upgraded caddy a while back - but other than that nothing has changed.

The other domains on the same caddy instance are still on “Let’s Encrypt”.

Any idea what is going on?

2. Error messages and/or full log output:

{"logger":"using provided configuration","config_file":"/etc/caddy/Caddyfile","config_adapter":"caddyfile"}
{"logger":"Caddyfile input is not formatted; run 'caddy fmt --overwrite' to fix inconsistencies","adapter":"caddyfile","file":"/etc/caddy/Caddyfile","line":2}
{"logger":"admin","msg":"admin endpoint started","address":":2019","enforce_origin":false,"origins":["//:2019"]}
{"logger":"admin","msg":"admin endpoint on open interface; host checking disabled","address":":2019"}
{"logger":"tls.cache.maintenance","msg":"started background certificate maintenance","cache":"0xc00043ec80"}
{"logger":"http.auto_https","msg":"server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS","server_name":"srv0","https_port":443}
{"logger":"http.auto_https","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv0"}
{"logger":"http","msg":"enabling HTTP/3 listener","addr":":443"}
{"logger":"http.log","msg":"server running","name":"srv0","protocols":["h1","h2","h3"]}
{"logger":"http.log","msg":"server running","name":"remaining_auto_https_redirects","protocols":["h1","h2","h3"]}
{"logger":"http","msg":"enabling automatic TLS certificate management","domains":["paleocoran.de","edit.corpuscoranicum.org","www.corpuscoranicum.org","corpuscoranicum.org"]}
{"logger":"tls","msg":"cleaning storage unit","storage":"FileStorage:/data/caddy"}
{"logger":"tls","msg":"finished cleaning storage units"}
{"logger":"autosaved config (load with --resume flag)","file":"/config/caddy/autosave.json"}
{"logger":"serving initial configuration"}
{"logger":"tls.cache.maintenance","msg":"advancing OCSP staple","identifiers":["paleocoran.de"],"from":1737618567,"to":1737920966}
{"logger":"tls.issuance.acme","msg":"looking up info for HTTP challenge","host":"paleocoran.de","remote_addr":"10.42.1.1:39239","user_agent":"Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls.issuance.zerossl","msg":"looking up info for HTTP challenge","host":"paleocoran.de","remote_addr":"10.42.1.1:39239","user_agent":"Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls.issuance.acme","msg":"looking up info for HTTP challenge","host":"paleocoran.de","remote_addr":"10.42.1.1:9526","user_agent":"Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls.issuance.zerossl","msg":"looking up info for HTTP challenge","host":"paleocoran.de","remote_addr":"10.42.1.1:9526","user_agent":"Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls","msg":"tls-alpn challenge","remote_addr":"10.42.1.1:37342","server_name":"paleocoran.de","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls.issuance.acme","msg":"looking up info for HTTP challenge","host":"paleocoran.de","remote_addr":"10.42.1.1:2739","user_agent":"acme.zerossl.com/v2/DV90","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls.issuance.zerossl","msg":"looking up info for HTTP challenge","host":"paleocoran.de","remote_addr":"10.42.1.1:2739","user_agent":"acme.zerossl.com/v2/DV90","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls.issuance.acme","msg":"looking up info for HTTP challenge","host":"paleocoran.de","remote_addr":"10.42.1.1:1279","user_agent":"acme.zerossl.com/v2/DV90","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls.issuance.zerossl","msg":"looking up info for HTTP challenge","host":"paleocoran.de","remote_addr":"10.42.1.1:1279","user_agent":"acme.zerossl.com/v2/DV90","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls","msg":"storage cleaning happened too recently; skipping for now","storage":"FileStorage:/data/caddy","instance":"44a8fc3f-483b-4420-b0c2-d3d44fe7fe43","try_again":1737470935.6951663,"try_again_in":86399.999999343}
{"logger":"tls","msg":"finished cleaning storage units"}
{"logger":"tls.issuance.acme","msg":"looking up info for HTTP challenge","host":"paleocoran.de","remote_addr":"10.42.1.1:34519","user_agent":"Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls.issuance.zerossl","msg":"looking up info for HTTP challenge","host":"paleocoran.de","remote_addr":"10.42.1.1:34519","user_agent":"Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls.issuance.acme","msg":"looking up info for HTTP challenge","host":"paleocoran.de","remote_addr":"10.42.1.1:32795","user_agent":"Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls.issuance.zerossl","msg":"looking up info for HTTP challenge","host":"paleocoran.de","remote_addr":"10.42.1.1:32795","user_agent":"Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls","msg":"tls-alpn challenge","remote_addr":"10.42.1.1:1510","server_name":"paleocoran.de","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls.issuance.acme","msg":"looking up info for HTTP challenge","host":"paleocoran.de","remote_addr":"10.42.1.1:26742","user_agent":"acme.zerossl.com/v2/DV90","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls.issuance.zerossl","msg":"looking up info for HTTP challenge","host":"paleocoran.de","remote_addr":"10.42.1.1:26742","user_agent":"acme.zerossl.com/v2/DV90","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls.issuance.acme","msg":"looking up info for HTTP challenge","host":"paleocoran.de","remote_addr":"10.42.1.1:45775","user_agent":"acme.zerossl.com/v2/DV90","error":"no information found to solve challenge for identifier: paleocoran.de"}
{"logger":"tls.issuance.zerossl","msg":"looking up info for HTTP challenge","host":"paleocoran.de","remote_addr":"10.42.1.1:45775","user_agent":"acme.zerossl.com/v2/DV90","error":"no information found to solve challenge for identifier: paleocoran.de"}

3. Caddy version:

image: caddy:2.7.6-alpine

4. How I installed and ran Caddy:

k3s, manifest below

a. System environment:

k3s on nixOS, x86

b. Command:

c. Service/unit/compose file:

      containers:
        - name: caddy
          image: caddy:2.7.6-alpine
          ports:
            - name: http
              containerPort: 80
            - name: https
              containerPort: 443
            - name: admin
              containerPort: 2019
          volumeMounts:
            - name: caddy-config
              mountPath: /etc/caddy/Caddyfile
              subPath: Caddyfile
            - name: caddy-data
              mountPath: /data
            - name: cdn-storage
              mountPath: /srv/volumes/cdn

d. My complete Caddy config:

{
	email "tcurdt@redacted.org"
	admin :2019
	servers {
		metrics
	}
	# debug
	# acme_ca https://acme-staging-v02.api.letsencrypt.org/directory
}

www.redacted.org {
	handle /.well-known/acme-challenge/* {
		respond "{http.request.uri.path}" 200
	}
	handle {
		redir https://redacted.org{uri} permanent
	}
}

redacted.org {
	header -server

	@exists {
		file /srv/volumes/cdn/maintenance
	}

	handle @exists {
		respond "We will be back shortly" 503 {
			close
		}
	}

	handle_path /digitallibrary/servlet/* {
		rewrite * /digilib{path}
		reverse_proxy https://digilib.redacted.de:443 {
			header_up Host {upstream_hostport}
		}
	}

	# otherwise pass on to proxy

	handle {
		reverse_proxy http://cc.live-cc.svc.cluster.local:80
	}
}

edit.redacted.org {
	header -server

	reverse_proxy http://cc-edit.live-cc.svc.cluster.local:80
}

paleocoran.de {
	header -server

	reverse_proxy http://pc.live-pc.svc.cluster.local:80
}

5. Links to relevant resources:

That’s normal. It’s possible that the Let’s Encrypt server was down/unavailable at the time of renewal, so Caddy tried the next CA it supports by default, i.e. ZeroSSL. By default, Caddy picks one CA/issuer at random, and falls back to the next one if the first choice fails to provide a certificate. It’s part of the robustness design.

Oh, I didn’t think it was picking at random.
Is there a way to configure to only use one? (and retry)

I didn’t see it here

You can use the issuer sub-directive of tls or the cert_issuer global option to configure only one. Any reason for disliking having a backup? Case in point, Caddy tried ZeroSSL because Let’s Encrypt was not available at the time of re-issuance. If, for any reason, Let’s Encrypt (or ZeroSSL) experiences extended downtime, you run the risk of having expired certificates (though Caddy starts trying to renew once about 2/3 of the validity period has passed, so there is a long time left before expiry).

I would rather try a couple of times (say, once a day?) with letsencrypt than have it switch providers on the first fail and have it send me an expire notice (which makes it unclear whether there is a problem or not).

On the other hand, I am also tracking expiry of certs in monitoring - so it’s not that bad.

But to understand the config: how would I disable zerossl?

{
	cert_issuer acme {
		...
	}
	cert_issuer zerossl {
		...
	}
}

It seems like I can only add it?

Would specifying one cert_issuer acme override the default?

Yes.
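
For illustration, a minimal sketch of a global options block that lists only one issuer. It assumes no further issuer options are needed and that the acme issuer, given no CA URL, defaults to Let’s Encrypt’s production directory; the other global options from the original config would stay alongside it.

```caddyfile
{
	# Listing a single issuer replaces the default set,
	# so ZeroSSL is no longer used as a fallback.
	# Assumption: with no CA URL given, acme means Let's Encrypt production.
	cert_issuer acme
}
```

The per-site equivalent would presumably be an issuer acme line inside that site’s tls block.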

Why though? If there’s a problem renewing a cert, we should not just wait around hoping it will resolve itself when we can do something about it.

Retrying is doing something about it. Seems like retrying needs to be done no matter how many providers are configured.

And as I state above, switching providers can cause some confusion.
Whether that’s a big deal or not is another matter.

I may have misunderstood, but I thought you were suggesting that we wait a whole day before retrying at all, and with the same CA, rather than renewing the cert as soon as possible. That would certainly be likely to lead to downtime in some cases.

Let’s Encrypt announced this week that they won’t be sending expiration emails anymore, so, that should help avoid confusion.

I am not sure that’s a likely outcome. Once the cert enters the “expires soon” phase that’s like n days of trying. With n=14 that’s a lot of tries before this could cause downtime. And it also does not have to be every 24h. The frequency could be adjusted to whatever the cert providers allow.

There is no guarantee the next provider will succeed. So some sort of retry behaviour would need to exist anyway. And it’s a modulo operation away to implement this as described.

Lol

Well, that kind of solved the problem in a totally different way :slight_smile:

FYI, many people will be using 6-day certs soon, so n < ~2.

I don’t think there’s anything we need to change or will change from this, but let me know if anything else comes up or if I’m totally missing the point…

Why is that?
Can you provide some details? Just curious.

I don’t think it matters how long the certs are valid for.
I still think something like this would be nice:

renewWhenLessThan = 3.days
retryInterval = 5.hours

timeLeft = timeUntilCertExpired - timeNow
if (timeLeft < renewWhenLessThan) {
  renewAttempt = floor(timeLeft / retryInterval)
  if (renewAttempt not already tried) {
    try to renew cert
    mark renewAttempt as tried
  }
}

I was kind of expecting the renewal to work like that already.
But it’s more likely that I am the one missing the point :slight_smile:
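
The idea proposed in the post above could be sketched in runnable form roughly as follows. This is only an illustration of that proposal, not how Caddy or CertMagic actually schedules renewals; the names due_attempt and maybe_renew, and the renew callback, are hypothetical.

```python
import math
from datetime import datetime, timedelta

RENEW_WHEN_LESS_THAN = timedelta(days=3)  # start renewing inside this window
RETRY_INTERVAL = timedelta(hours=5)       # at most one attempt per interval


def due_attempt(expires_at, now, tried):
    """Return the retry-slot index to attempt now, or None if nothing is due."""
    time_left = expires_at - now
    if time_left >= RENEW_WHEN_LESS_THAN:
        return None  # not yet inside the renewal window
    # Each RETRY_INTERVAL-sized slice of the remaining time is one slot.
    slot = math.floor(time_left / RETRY_INTERVAL)
    if slot in tried:
        return None  # this slot was already attempted
    return slot


def maybe_renew(expires_at, now, tried, renew):
    """Call renew() at most once per slot; return True if an attempt was made."""
    slot = due_attempt(expires_at, now, tried)
    if slot is None:
        return False
    tried.add(slot)
    renew()
    return True
```

Calling maybe_renew periodically (for example from a timer) with a shared tried set would give one attempt per 5-hour slot during the last 3 days before expiry.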

That’s kinda how the logic works right now? We just start trying to renew early and often, with gradual backoff in case of error. The only thing is that we maximise our chances of successfully keeping the website online by trying for any cert we can get. If the original provider comes back online before we succeed with an alternative provider, no problem, we just renew as normal. If you don’t want that behaviour, you can prevent Caddy from using an alternative provider, but for the most part maximising the chances of keeping the site up is more important to us than maintaining a uniform certificate provider.

Great. I couldn’t imagine it to be different TBH.

Is the retry behaviour (and especially the gradual backoff) configurable?
Just curious. I didn’t find that.

That sure is the right priority :slight_smile:

I would be slightly reluctant to switch to 6-day certs for the same reason though.
Of course the shorter the better - but it feels weird they chose 6-days. That’s quite a decrease.

With LE no longer sending the emails there really isn’t anything to change for me. And even then I could just turn on exactly one provider. I just wasn’t sure about the exact retry behaviour.

Thanks for clarifying this.

I don’t think the retry behaviour is user-configurable. If I remember rightly, it’s part of CertMagic (which powers Caddy’s Automatic HTTPS). Caddy’s behaviour is documented a little more here: https://caddyserver.com/docs/automatic-https#errors
