Caddy as TLS reverse proxy with custom and automatic certs

1. The problem I’m having:

Caddy is used as a drop-in TLS proxy for a web service that previously ran its own HTTPS server. It is configured for automatic certificate renewal, but is also given a custom certificate so that it has a valid one on first start.

For service switchover, the DNS A record is changed from the old server to the new deployment. This works without problems. Caddy picks up the custom cert and it is visible in the web browser.

From the logs, I can see that the automatic cert has been created too. It can be downloaded from the cert storage and checks out.

My question is, what will happen when the custom cert expires? Will Caddy pick up and deliver the new cert on-the-fly?

I have already read the documentation and searched this forum and the web for a conclusive answer. I think the more general questions behind this issue are:

  • What is the expected/actual behavior of using a global auto_https ignore_loaded_certs with a custom cert per site? Does Caddy keep a list of certs per site and switch to another if one expires?
  • Is there a recommended procedure to use Caddy as a TLS proxy with automatic cert renewal but still have zero downtime when it goes live? I suppose with the DNS challenge Caddy could request a new cert before it is assigned its designated domain name, but this has some downsides I’d like to avoid.

2. Error messages and/or full log output:

Here is part of the full log since the last restart (minus health check messages).

{"level":"info","ts":1723465295.0529082,"logger":"tls.cache.maintenance","msg":"started background certificate maintenance","cache":"0xc00021d100"}
{"level":"info","ts":1723465295.2162435,"logger":"http.auto_https","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv0"}
{"level":"info","ts":1723465295.2285855,"logger":"http.log","msg":"server running","name":"remaining_auto_https_redirects","protocols":["h1","h2","h3"]}
{"level":"info","ts":1723465295.2295237,"logger":"http","msg":"enabling HTTP/3 listener","addr":":443"}
{"level":"info","ts":1723465295.2398233,"logger":"http.handlers.reverse_proxy.health_checker.active","msg":"HTTP request failed","host":"localhost:8080","error":"Get \"http://localhost:8080/ping\": dial tcp 127.0.0.1:8080: connect: connection refused"}
{"level":"info","ts":1723465295.2399693,"logger":"http.handlers.reverse_proxy.health_checker.active","msg":"HTTP request failed","host":"localhost:8080","error":"Get \"http://localhost:8080/ping\": dial tcp 127.0.0.1:8080: connect: connection refused"}
{"level":"info","ts":1723465295.2401812,"logger":"http.log","msg":"server running","name":"srv0","protocols":["h1","h2","h3"]}
{"level":"info","ts":1723465295.2402072,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["prod.example.com","test.example.com"]}
{"level":"info","ts":1723465295.3985739,"logger":"tls","msg":"storage cleaning happened too recently; skipping for now","storage":"FileStorage:/data/caddy","instance":"62949b15-1235-4147-89ec-9fe53aa77551","try_again":1723551695.3985674,"try_again_in":86399.9999981}
{"level":"info","ts":1723465295.409549,"logger":"tls.obtain","msg":"acquiring lock","identifier":"prod.example.com"}
{"level":"info","ts":1723465295.4095747,"logger":"tls","msg":"finished cleaning storage units"}
{"level":"info","ts":1723465295.4694626,"logger":"tls.obtain","msg":"lock acquired","identifier":"prod.example.com"}
{"level":"info","ts":1723465295.4697268,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"prod.example.com"}
{"level":"info","ts":1723465295.6137,"logger":"http","msg":"waiting on internal rate limiter","identifiers":["prod.example.com"],"ca":"https://acme-v02.api.letsencrypt.org/directory","account":""}
{"level":"info","ts":1723465295.6137733,"logger":"http","msg":"done waiting on internal rate limiter","identifiers":["prod.example.com"],"ca":"https://acme-v02.api.letsencrypt.org/directory","account":""}
{"level":"info","ts":1723465295.6137993,"logger":"http","msg":"using ACME account","account_id":"https://acme-v02.api.letsencrypt.org/acme/acct/1883892876","account_contact":[]}
{"level":"info","ts":1723465296.5240521,"logger":"http.acme_client","msg":"trying to solve challenge","identifier":"prod.example.com","challenge_type":"tls-alpn-01","ca":"https://acme-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1723465297.5014439,"logger":"http.acme_client","msg":"challenge failed","identifier":"prod.example.com","challenge_type":"tls-alpn-01","problem":{"type":"urn:ietf:params:acme:error:unauthorized","title":"","detail":"Cannot negotiate ALPN protocol \"acme-tls/1\" for tls-alpn-01 challenge","instance":"","subproblems":[]}}
{"level":"error","ts":1723465297.5016065,"logger":"http.acme_client","msg":"validating authorization","identifier":"prod.example.com","problem":{"type":"urn:ietf:params:acme:error:unauthorized","title":"","detail":"Cannot negotiate ALPN protocol \"acme-tls/1\" for tls-alpn-01 challenge","instance":"","subproblems":[]},"order":"https://acme-v02.api.letsencrypt.org/acme/order/1883892876/295629792376","attempt":1,"max_attempts":3}
{"level":"info","ts":1723465298.900897,"logger":"http.acme_client","msg":"trying to solve challenge","identifier":"prod.example.com","challenge_type":"http-01","ca":"https://acme-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1723465309.5895202,"logger":"http.acme_client","msg":"challenge failed","identifier":"prod.example.com","challenge_type":"http-01","problem":{"type":"urn:ietf:params:acme:error:connection","title":"","detail":"20.4.144.254: Fetching http://prod.example.com/.well-known/acme-challenge/1KP-Am0zS7_hDzZhTJ-Ho5pfyY0J5hFXqWqcxfg_UFc: Timeout during connect (likely firewall problem)","instance":"","subproblems":[]}}
{"level":"error","ts":1723465309.5909123,"logger":"http.acme_client","msg":"validating authorization","identifier":"prod.example.com","problem":{"type":"urn:ietf:params:acme:error:connection","title":"","detail":"20.4.144.254: Fetching http://prod.example.com/.well-known/acme-challenge/1KP-Am0zS7_hDzZhTJ-Ho5pfyY0J5hFXqWqcxfg_UFc: Timeout during connect (likely firewall problem)","instance":"","subproblems":[]},"order":"https://acme-v02.api.letsencrypt.org/acme/order/1883892876/295629798906","attempt":2,"max_attempts":3}
{"level":"error","ts":1723465309.5910184,"logger":"tls.obtain","msg":"could not get certificate from issuer","identifier":"prod.example.com","issuer":"acme-v02.api.letsencrypt.org-directory","error":"HTTP 400 urn:ietf:params:acme:error:connection - 20.4.144.254: Fetching http://prod.example.com/.well-known/acme-challenge/1KP-Am0zS7_hDzZhTJ-Ho5pfyY0J5hFXqWqcxfg_UFc: Timeout during connect (likely firewall problem)"}
{"level":"error","ts":1723465309.5912375,"logger":"tls.obtain","msg":"will retry","error":"[prod.example.com] Obtain: [prod.example.com] solving challenge: prod.example.com: [prod.example.com] authorization failed: HTTP 400 urn:ietf:params:acme:error:connection - 20.4.144.254: Fetching http://prod.example.com/.well-known/acme-challenge/1KP-Am0zS7_hDzZhTJ-Ho5pfyY0J5hFXqWqcxfg_UFc: Timeout during connect (likely firewall problem) (ca=https://acme-v02.api.letsencrypt.org/directory)","attempt":1,"retrying_in":60,"elapsed":14.121656495,"max_duration":2592000}
{"level":"info","ts":1723465369.6090128,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"prod.example.com"}
{"level":"info","ts":1723465369.7944353,"logger":"http","msg":"using ACME account","account_id":"https://acme-staging-v02.api.letsencrypt.org/acme/acct/158898753","account_contact":[]}
{"level":"info","ts":1723465370.694774,"logger":"http.acme_client","msg":"trying to solve challenge","identifier":"prod.example.com","challenge_type":"http-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1723465381.584563,"logger":"http.acme_client","msg":"challenge failed","identifier":"prod.example.com","challenge_type":"http-01","problem":{"type":"urn:ietf:params:acme:error:connection","title":"","detail":"20.4.144.254: Fetching http://prod.example.com/.well-known/acme-challenge/eYKdSh_P_SZnzRXccVhQTg4JgjIjx-UqoD7uIKzCCiA: Timeout during connect (likely firewall problem)","instance":"","subproblems":[]}}
{"level":"error","ts":1723465381.5846756,"logger":"http.acme_client","msg":"validating authorization","identifier":"prod.example.com","problem":{"type":"urn:ietf:params:acme:error:connection","title":"","detail":"20.4.144.254: Fetching http://prod.example.com/.well-known/acme-challenge/eYKdSh_P_SZnzRXccVhQTg4JgjIjx-UqoD7uIKzCCiA: Timeout during connect (likely firewall problem)","instance":"","subproblems":[]},"order":"https://acme-staging-v02.api.letsencrypt.org/acme/order/158898753/18366061993","attempt":1,"max_attempts":3}
{"level":"info","ts":1723465382.921561,"logger":"http.acme_client","msg":"trying to solve challenge","identifier":"prod.example.com","challenge_type":"tls-alpn-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1723465383.9595618,"logger":"http.acme_client","msg":"challenge failed","identifier":"prod.example.com","challenge_type":"tls-alpn-01","problem":{"type":"urn:ietf:params:acme:error:unauthorized","title":"","detail":"Cannot negotiate ALPN protocol \"acme-tls/1\" for tls-alpn-01 challenge","instance":"","subproblems":[]}}
{"level":"error","ts":1723465383.9597347,"logger":"http.acme_client","msg":"validating authorization","identifier":"prod.example.com","problem":{"type":"urn:ietf:params:acme:error:unauthorized","title":"","detail":"Cannot negotiate ALPN protocol \"acme-tls/1\" for tls-alpn-01 challenge","instance":"","subproblems":[]},"order":"https://acme-staging-v02.api.letsencrypt.org/acme/order/158898753/18366064323","attempt":2,"max_attempts":3}
{"level":"error","ts":1723465383.959792,"logger":"tls.obtain","msg":"could not get certificate from issuer","identifier":"prod.example.com","issuer":"acme-v02.api.letsencrypt.org-directory","error":"HTTP 403 urn:ietf:params:acme:error:unauthorized - Cannot negotiate ALPN protocol \"acme-tls/1\" for tls-alpn-01 challenge"}
{"level":"error","ts":1723465383.961113,"logger":"tls.obtain","msg":"will retry","error":"[prod.example.com] Obtain: [prod.example.com] solving challenge: prod.example.com: [prod.example.com] authorization failed: HTTP 403 urn:ietf:params:acme:error:unauthorized - Cannot negotiate ALPN protocol \"acme-tls/1\" for tls-alpn-01 challenge (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":2,"retrying_in":120,"elapsed":88.491520814,"max_duration":2592000}
{"level":"info","ts":1723465503.9794903,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"prod.example.com"}
{"level":"info","ts":1723465504.2005625,"logger":"http","msg":"using ACME account","account_id":"https://acme-staging-v02.api.letsencrypt.org/acme/acct/158898753","account_contact":[]}
{"level":"info","ts":1723465504.6667373,"logger":"http.acme_client","msg":"trying to solve challenge","identifier":"prod.example.com","challenge_type":"http-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1723465515.5165007,"logger":"http.acme_client","msg":"challenge failed","identifier":"prod.example.com","challenge_type":"http-01","problem":{"type":"urn:ietf:params:acme:error:connection","title":"","detail":"20.4.144.254: Fetching http://prod.example.com/.well-known/acme-challenge/r8-meEJ96t4OrATyNokiYYtYN-sCf1aqnznLq9IIYEc: Timeout during connect (likely firewall problem)","instance":"","subproblems":[]}}
{"level":"error","ts":1723465515.5166357,"logger":"http.acme_client","msg":"validating authorization","identifier":"prod.example.com","problem":{"type":"urn:ietf:params:acme:error:connection","title":"","detail":"20.4.144.254: Fetching http://prod.example.com/.well-known/acme-challenge/r8-meEJ96t4OrATyNokiYYtYN-sCf1aqnznLq9IIYEc: Timeout during connect (likely firewall problem)","instance":"","subproblems":[]},"order":"https://acme-staging-v02.api.letsencrypt.org/acme/order/158898753/18366090373","attempt":1,"max_attempts":3}
{"level":"info","ts":1723465516.8350718,"logger":"http.acme_client","msg":"trying to solve challenge","identifier":"prod.example.com","challenge_type":"tls-alpn-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1723465517.866599,"logger":"http.acme_client","msg":"challenge failed","identifier":"prod.example.com","challenge_type":"tls-alpn-01","problem":{"type":"urn:ietf:params:acme:error:unauthorized","title":"","detail":"Cannot negotiate ALPN protocol \"acme-tls/1\" for tls-alpn-01 challenge","instance":"","subproblems":[]}}
[...]
{"level":"info","ts":1724357372.7809446,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"prod.example.com"}
{"level":"info","ts":1724357372.9168098,"logger":"http","msg":"using ACME account","account_id":"https://acme-staging-v02.api.letsencrypt.org/acme/acct/158898753","account_contact":[]}
{"level":"info","ts":1724357373.6754532,"logger":"http.acme_client","msg":"trying to solve challenge","identifier":"prod.example.com","challenge_type":"tls-alpn-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1724357374.6854925,"logger":"http.acme_client","msg":"challenge failed","identifier":"prod.example.com","challenge_type":"tls-alpn-01","problem":{"type":"urn:ietf:params:acme:error:unauthorized","title":"","detail":"Cannot negotiate ALPN protocol \"acme-tls/1\" for tls-alpn-01 challenge","instance":"","subproblems":[]}}
{"level":"error","ts":1724357374.6856935,"logger":"http.acme_client","msg":"validating authorization","identifier":"prod.example.com","problem":{"type":"urn:ietf:params:acme:error:unauthorized","title":"","detail":"Cannot negotiate ALPN protocol \"acme-tls/1\" for tls-alpn-01 challenge","instance":"","subproblems":[]},"order":"https://acme-staging-v02.api.letsencrypt.org/acme/order/158898753/18578202803","attempt":1,"max_attempts":3}
{"level":"info","ts":1724357376.038395,"logger":"http.acme_client","msg":"trying to solve challenge","identifier":"prod.example.com","challenge_type":"http-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1724357386.8997538,"logger":"http.acme_client","msg":"challenge failed","identifier":"prod.example.com","challenge_type":"http-01","problem":{"type":"urn:ietf:params:acme:error:connection","title":"","detail":"20.4.247.5: Fetching http://prod.example.com/.well-known/acme-challenge/CU388W6wMlFMQBcmzprGXqtwAPSR4DPXyp3Z6gbcy7s: Timeout during connect (likely firewall problem)","instance":"","subproblems":[]}}
{"level":"error","ts":1724357386.899984,"logger":"http.acme_client","msg":"validating authorization","identifier":"prod.example.com","problem":{"type":"urn:ietf:params:acme:error:connection","title":"","detail":"20.4.247.5: Fetching http://prod.example.com/.well-known/acme-challenge/CU388W6wMlFMQBcmzprGXqtwAPSR4DPXyp3Z6gbcy7s: Timeout during connect (likely firewall problem)","instance":"","subproblems":[]},"order":"https://acme-staging-v02.api.letsencrypt.org/acme/order/158898753/18578203243","attempt":2,"max_attempts":3}
{"level":"error","ts":1724357386.9000409,"logger":"tls.obtain","msg":"could not get certificate from issuer","identifier":"prod.example.com","issuer":"acme-v02.api.letsencrypt.org-directory","error":"HTTP 400 urn:ietf:params:acme:error:connection - 20.4.247.5: Fetching http://prod.example.com/.well-known/acme-challenge/CU388W6wMlFMQBcmzprGXqtwAPSR4DPXyp3Z6gbcy7s: Timeout during connect (likely firewall problem)"}
{"level":"error","ts":1724357386.9001331,"logger":"tls.obtain","msg":"will retry","error":"[prod.example.com] Obtain: [prod.example.com] solving challenge: prod.example.com: [prod.example.com] authorization failed: HTTP 400 urn:ietf:params:acme:error:connection - 20.4.247.5: Fetching http://prod.example.com/.well-known/acme-challenge/CU388W6wMlFMQBcmzprGXqtwAPSR4DPXyp3Z6gbcy7s: Timeout during connect (likely firewall problem) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":5,"retrying_in":600,"elapsed":672.013475418,"max_duration":2592000}
{"level":"info","ts":1724357986.9134827,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"prod.example.com"}
{"level":"info","ts":1724357987.0888367,"logger":"http","msg":"using ACME account","account_id":"https://acme-staging-v02.api.letsencrypt.org/acme/acct/158898753","account_contact":[]}
{"level":"info","ts":1724357987.9418023,"logger":"http.acme_client","msg":"trying to solve challenge","identifier":"prod.example.com","challenge_type":"tls-alpn-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"info","ts":1724357988.646631,"logger":"tls","msg":"served key authentication certificate","server_name":"prod.example.com","challenge":"tls-alpn-01","remote":"10.92.0.26:53605","distributed":false}
{"level":"info","ts":1724357989.1863937,"logger":"tls","msg":"served key authentication certificate","server_name":"prod.example.com","challenge":"tls-alpn-01","remote":"10.92.0.24:58715","distributed":false}
{"level":"info","ts":1724357989.2366307,"logger":"tls","msg":"served key authentication certificate","server_name":"prod.example.com","challenge":"tls-alpn-01","remote":"10.92.0.26:53609","distributed":false}
{"level":"info","ts":1724357989.3482773,"logger":"tls","msg":"served key authentication certificate","server_name":"prod.example.com","challenge":"tls-alpn-01","remote":"10.92.0.26:53610","distributed":false}
{"level":"info","ts":1724357989.424147,"logger":"tls","msg":"served key authentication certificate","server_name":"prod.example.com","challenge":"tls-alpn-01","remote":"10.92.0.24:58716","distributed":false}
{"level":"info","ts":1724357989.8467963,"logger":"http.acme_client","msg":"authorization finalized","identifier":"prod.example.com","authz_status":"valid"}
{"level":"info","ts":1724357989.8468966,"logger":"http.acme_client","msg":"validations succeeded; finalizing order","order":"https://acme-staging-v02.api.letsencrypt.org/acme/order/158898753/18578365693"}
{"level":"info","ts":1724357993.528086,"logger":"http.acme_client","msg":"got renewal info","names":["prod.example.com"],"window_start":1729453248.3333333,"window_end":1729626048.3333333,"selected_time":1729513609,"recheck_after":1724379593.5280771,"explanation_url":""}
{"level":"info","ts":1724357993.8590724,"logger":"http.acme_client","msg":"got renewal info","names":["prod.example.com"],"window_start":1729453248.3333333,"window_end":1729626048.3333333,"selected_time":1729508927,"recheck_after":1724379593.8590658,"explanation_url":""}
{"level":"info","ts":1724357993.8591976,"logger":"http.acme_client","msg":"successfully downloaded available certificate chains","count":2,"first_url":"https://acme-staging-v02.api.letsencrypt.org/acme/cert/2bcb230c4e863dc9a768b1d7f6fd2edd657e"}
{"level":"info","ts":1724357993.9236412,"logger":"http","msg":"waiting on internal rate limiter","identifiers":["prod.example.com"],"ca":"https://acme-v02.api.letsencrypt.org/directory","account":""}
{"level":"info","ts":1724357993.9237194,"logger":"http","msg":"done waiting on internal rate limiter","identifiers":["prod.example.com"],"ca":"https://acme-v02.api.letsencrypt.org/directory","account":""}
{"level":"info","ts":1724357993.9237547,"logger":"http","msg":"using ACME account","account_id":"https://acme-v02.api.letsencrypt.org/acme/acct/1883892876","account_contact":[]}
{"level":"info","ts":1724357994.810976,"logger":"http.acme_client","msg":"trying to solve challenge","identifier":"prod.example.com","challenge_type":"tls-alpn-01","ca":"https://acme-v02.api.letsencrypt.org/directory"}
{"level":"info","ts":1724357995.4973617,"logger":"tls","msg":"served key authentication certificate","server_name":"prod.example.com","challenge":"tls-alpn-01","remote":"10.92.0.26:53634","distributed":false}
{"level":"info","ts":1724357996.1374078,"logger":"tls","msg":"served key authentication certificate","server_name":"prod.example.com","challenge":"tls-alpn-01","remote":"10.92.0.26:53637","distributed":false}
{"level":"info","ts":1724357996.2003682,"logger":"tls","msg":"served key authentication certificate","server_name":"prod.example.com","challenge":"tls-alpn-01","remote":"10.92.0.26:53638","distributed":false}
{"level":"info","ts":1724357996.3524294,"logger":"tls","msg":"served key authentication certificate","server_name":"prod.example.com","challenge":"tls-alpn-01","remote":"10.92.0.26:53640","distributed":false}
{"level":"info","ts":1724357996.4743264,"logger":"tls","msg":"served key authentication certificate","server_name":"prod.example.com","challenge":"tls-alpn-01","remote":"10.92.0.26:53642","distributed":false}
{"level":"info","ts":1724357997.0896807,"logger":"http.acme_client","msg":"authorization finalized","identifier":"prod.example.com","authz_status":"valid"}
{"level":"info","ts":1724357997.0898,"logger":"http.acme_client","msg":"validations succeeded; finalizing order","order":"https://acme-v02.api.letsencrypt.org/acme/order/1883892876/298565966576"}
{"level":"info","ts":1724357998.4080057,"logger":"http.acme_client","msg":"got renewal info","names":["prod.example.com"],"window_start":1729453256,"window_end":1729626056,"selected_time":1729480303,"recheck_after":1724379598.4079993,"explanation_url":""}
{"level":"info","ts":1724357998.7498388,"logger":"http.acme_client","msg":"got renewal info","names":["prod.example.com"],"window_start":1729453256,"window_end":1729626056,"selected_time":1729585291,"recheck_after":1724379598.7498305,"explanation_url":""}
{"level":"info","ts":1724357998.7499456,"logger":"http.acme_client","msg":"successfully downloaded available certificate chains","count":2,"first_url":"https://acme-v02.api.letsencrypt.org/acme/cert/042906a14e9d8a413d4aa1ec5c2502287008"}
{"level":"info","ts":1724357998.9449778,"logger":"tls.obtain","msg":"certificate obtained successfully","identifier":"prod.example.com","issuer":"acme-v02.api.letsencrypt.org-directory"}
{"level":"info","ts":1724357998.9457204,"logger":"tls.obtain","msg":"releasing lock","identifier":"prod.example.com"}

3. Caddy version:

v2.8.4 h1:q3pe0wpBj1OcHFZ3n/1nl4V4bxBrYoSoab7rL9BMYNk=

4. How I installed and ran Caddy:

a. System environment:

Caddy and the web service are run from Docker images as an Azure container group on Linux.
The container group is managed using GitHub workflows and Terraform as part of a blue/green deployment strategy.

b. Command:

See Dockerfile below.

c. Service/unit/compose file:

The Dockerfile to build the Caddy image for deployment:

FROM caddy:2.8-alpine

COPY config/caddy/Caddyfile /etc/caddy/

# add --environ flag for debugging
CMD ["caddy", "run", "--config", "/etc/caddy/Caddyfile", "--adapter", "caddyfile", "--environ"]

d. My complete Caddy config:

The relevant config sections:

{
	auto_https ignore_loaded_certs
}

https://prod.example.com,
https://test.example.com {
	tls /cert/cert.pem /cert/key.pem

	@old_api {
		method POST PUT
		path /*
	}

	handle_path /* {
		rewrite * /predictions/model

		reverse_proxy @old_api http://localhost:8080 {
			health_uri /ping
		}
	}
}

5. Links to relevant resources:

No. If you update the cert/key files, you need to reload Caddy (with the --force flag, since your config text would not have changed) for it to re-read them from disk.

This is only useful if, for example, you are loading a wildcard cert from a file into your config which covers multiple subdomains, but you still want Caddy to issue a cert for other subdomains which should not use that wildcard cert. That’s not what you’re doing, so you can remove that option.
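
To illustrate the scenario that option is meant for, here is a hypothetical layout (the host names and file paths are made up, not taken from this thread): one site loads a wildcard cert from disk, and a second site on a name the wildcard also covers should still get its own managed cert. Treat it as a sketch under those assumptions rather than a tested config.

```caddyfile
{
	# Automate certs even for names that appear on manually loaded certificates.
	auto_https ignore_loaded_certs
}

# Loads a wildcard cert for *.example.com from disk (hypothetical paths).
shop.example.com {
	tls /cert/wildcard.pem /cert/wildcard-key.pem
	respond "shop"
}

# Also covered by the wildcard, but has no tls directive of its own;
# with the global option set, Caddy still manages a cert for this name.
api.example.com {
	respond "api"
}
```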

Yeah, let Caddy use ACME to issue certs instead of handing Caddy your own files from disk. That’s the best way to ensure it’s automated.
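
As a rough sketch of what that could look like for the sites in this thread: drop the tls line that points at the files and drop the auto_https option, and Caddy manages the certs on its own. The site body is trimmed to a single proxy for illustration; the real routes from the config above would stay as they are.

```caddyfile
prod.example.com, test.example.com {
	# No tls directive with cert/key files and no auto_https option:
	# Caddy obtains and renews the certificates itself via ACME.
	reverse_proxy localhost:8080 {
		health_uri /ping
	}
}
```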

Thank you for your reply. I’m not sure we are talking about the exact same issue, so please let me clarify.

The new cert was automatically created by Caddy, and is stored in the default location (below ‘/data/caddy’). The custom cert is unchanged (and stored below ‘/cert’).

Without the ignore_loaded_certs option, Caddy will not create automatic certs for sites where a custom cert is configured. The reason to use the custom cert in the first place is to reduce downtime of our service.

The use case is as follows:

  • Deploy web service and Caddy as TLS proxy in staging; the DNS record does not point to the deployment yet, so ACME challenges will fail, and no automatic certs are issued
  • Switch over the DNS record; now ACME challenges can work, but Caddy will most likely still be waiting before requesting a cert due to back-off
  • Caddy tries to obtain a new cert and will eventually succeed

Since I do not want our service to be unavailable for an arbitrary amount of time after switchover, I would like to know whether there is a cleaner way to do this. Using the custom cert was the only solution I could find so far, but I am not sure how it will work out.

Once there is an automatic cert it can be shared between deployments, of course. I might just do that if the current config does not work as I hope it will. That would be a bit of a kludge though, as the config would have to be changed after initial deployment.

Do you have more in your config than what you showed? Please don’t show partial config, it’s misleading.

Ahh, so you have an old deployment you want to cutover from.

Are you able to share TLS asset storage? That would solve your issue probably instantly, as Caddy instances coordinate asset maintenance in a fleet automatically. This would be the best solution for a bunch of reasons unless it’s infeasible.

Alternatively, could you reverse-proxy /.well-known/acme-challenge* to the new Caddy instance via IP address, and disable TLS-ALPN challenges? That would let HTTP-01 challenges get through to the new one without impeding the old one.
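
A minimal sketch of the new instance's side of that idea. It assumes the old server forwards plain-HTTP requests for /.well-known/acme-challenge/* to the new instance's IP with the original Host header intact (how to do that depends on the old server and is not shown). The site body is trimmed to a single proxy for illustration. Note that naming an issuer explicitly means only that issuer is used for these sites.

```caddyfile
prod.example.com, test.example.com {
	tls {
		issuer acme {
			# Skip TLS-ALPN-01 so only HTTP-01 is attempted; before cutover
			# the old server is still the one answering TLS on port 443.
			disable_tlsalpn_challenge
		}
	}

	reverse_proxy localhost:8080 {
		health_uri /ping
	}
}
```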

And finally, have you considered On-Demand TLS? That will prevent Caddy from trying to get certs until the first handshakes start coming in - which would probably be after the DNS cutover - at which point it will attempt to dynamically acquire the required certificates.

TLS asset storage for automatic certs is shared between blue/green deployments. This way it is possible to seamlessly switch between deployments using Caddy. However, this does not work for cutover from an old deployment which had its own HTTPS server (hence no Caddy TLS assets).
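
In config terms, sharing looks roughly like this: every deployment points Caddy's storage at the same location. /data/caddy is the path that appears in the logs above, so spelling it out is only illustrative; that the path is a volume mounted into both the blue and the green container group is an assumption of this sketch.

```caddyfile
{
	# Same root on every deployment. The path must be backed by shared
	# storage (for example a mounted file share) for instances to see
	# each other's certificates and locks.
	storage file_system {
		root /data/caddy
	}
}
```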

I guess I could set up a reverse proxy for the HTTP-01 challenge, but that sounds more complicated than just disabling the custom cert in the config once an automatic one is issued. Also, disabling the TLS-ALPN challenge would mean losing it as a fallback.

I’ve looked briefly into On-Demand TLS, but it does not suit our use-case, and the documentation recommends against using it for anything else, as it is rather involved to set up.

Nothing that could be relevant to the issue, really. Here is the complete config:

{
	auto_https ignore_loaded_certs

	log file {
		format json
		output file /var/log/access.log {
			roll_size 100mb
			roll_keep 5
			roll_keep_for 720h
		}
	}
}

https://prod.example.com,
https://test.example.com {
	tls /cert/cert.pem /cert/key.pem
	  
	@new_api {
		method POST PUT
		path /predictions/*
	}

	@old_api {
		method POST PUT
		path /*
	}

	handle_path /predict/mdl2 {
		rewrite * /predictions/model2

		reverse_proxy @new_api http://localhost:8080 {
			health_uri /ping
		}
	}

	handle_path /predict/mdl1 {
		rewrite * /predictions/model

		reverse_proxy @new_api http://localhost:8080 {
			health_uri /ping
		}
	}

	handle_path /* {
		rewrite * /predictions/model

		reverse_proxy @old_api http://localhost:8080 {
			health_uri /ping
		}
	}

	header {
		Access-Control-Allow-Headers *
		Access-Control-Allow-Methods *
		Access-Control-Allow-Origin *
	}
	@options {
		method OPTIONS
	}
	respond @options 204

}

If there are no other options to handle this use-case, I suppose it might be worthwhile to test how Caddy handles this scenario when the custom cert becomes invalid. In the worst case, I’ll have to disable the custom cert in the config now that the automatic cert is issued.

I’ll report back on the issue in about two weeks, so I’d suggest keeping that topic open until then.

For some reason I just did not consider cutting over from non-Caddy to Caddy, so good point there.

If reverse proxying is more difficult than configuring a new Caddy deployment and then reloading it after cutover, then it’s probably a non-starter too. Notably you could just leave TLS-ALPN enabled, since Caddy will try it, and if it fails, try HTTP-01 instead (assuming it didn’t try HTTP-01 first). But that’s moot given the complexity of the system you’re cutting over from.

It’s curious to hear that On-Demand TLS doesn’t suit your use case, though… It actually sounds EXACTLY like your use case as you’ve described it here:

This is pretty much perfectly in On-Demand TLS’ wheelhouse. Nearly a textbook example, even - let me explain a little. The docs state, emphasis mine:

On-demand TLS is useful if:

  • you do not know all the domain names when you start or reload your server,
  • domain names might not be properly configured right away (DNS records not yet set),
  • you are not in control of the domain names (e.g. they are customer domains).

If you’re after a zero-downtime method of getting your TLS asset on the first handshake after the DNS cutover - this is it. On-Demand will defer certificate acquisition until that first request comes in - put it on hold, complete the challenge, get the cert, and serve it to that very same first request. It therefore will not be left waiting due to exponential failure back-off, and you don’t even need to provision a custom cert at all (removing the requirement to reconfigure and reload Caddy after cutover).

It’s also very easy to set up just for the cert-deferring part: if you already know your domains and don’t need an external ask endpoint, you can just run your own in situ:

{
  on_demand_tls {
    ask http://localhost:9123
  }
}

example.com, foo.example.com {
  tls {
    on_demand
  }
}

# Simpler version for single domains/short lists
# Relies on implicit 200 for unconfigured routes
http://localhost:9123 {
  @denied not query domain=example.com domain=foo.example.com
  respond @denied 401
}

# Map version to make it easy to add/remove domains
# from the approved list if you're expecting a lot of them
# http://localhost:9123 {
#   map {query.domain} {ask_response} {
#     default 401
#     example.com 200
#     foo.example.com 200
#   }
#   respond {ask_response}
# }

Using Caddy itself as the ask endpoint does look very convenient to set up. We might just use that in the future, and simply check that a cert is issued on first request.

From the docs I got the impression that on-demand TLS is targeted only at providers hosting a large number of domains. It was also not clear whether on-demand TLS is meant for cases where any one of the conditions you mentioned applies, or only where all of them apply together, since we know all domain names in advance and are in control of them as well.

Hmm. Yeah, that whole section is definitely not meant to sound prohibitive, like “you shouldn’t use it unless you have this exact use case”, but more descriptive, like “these are the things it’s good at”.

Is there a way you would have written some of that section so that it conveyed this better on your own reading of it?

We have now tested and deployed this configuration, and it works nicely. It only took a couple of seconds to acquire a cert from Let’s Encrypt upon the first request. Since we can also share the on-demand certs between deployments, this works very well for our use-case.
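
The deployed config itself was not posted, so the following is only a reconstruction: the on-demand example from the reply above, applied to the two host names used in this thread. Port 9123 for the local ask endpoint is carried over from that example, and the site body is trimmed to a single proxy for illustration.

```caddyfile
{
	on_demand_tls {
		# Caddy asks this endpoint before obtaining a cert for a name.
		ask http://localhost:9123
	}
}

prod.example.com, test.example.com {
	tls {
		# Defer certificate acquisition until the first TLS handshake.
		on_demand
	}

	reverse_proxy localhost:8080 {
		health_uri /ping
	}
}

# Local ask endpoint: deny any domain that is not on the list and
# rely on the implicit 200 for everything else.
http://localhost:9123 {
	@denied not query domain=prod.example.com domain=test.example.com
	respond @denied 401
}
```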

And to answer my initial question, with our previous configuration as listed above, the manual cert was not replaced with an automatic cert once it expired. I think it would be a neat feature if it did, but I will leave that for the devs to decide.

What do you mean? Show evidence.

I would suggest changing the wording of “On-demand TLS is useful if:” to something similar to “On-demand TLS is useful if one or more of the following apply:”.

It might also be worthwhile to mention in the docs that on-demand TLS allows for switching between domain names without reconfiguration, as in our (somewhat special) use-case. The configuration with the ask endpoint would make a good example too, in my opinion.

That’s here Global options (Caddyfile) — Caddy Documentation

With the initial configuration (custom cert + auto_https ignore_loaded_certs),
the custom cert was used even after it expired.
Logs show that an automatic cert was generated, and I can inspect it and see that it is generated correctly.
As far as I can see, the logs do not show that the custom cert has expired or which cert is served for requests, so those are all the details I have.
Anyway, we are using on-demand TLS now, which works quite nicely.

Just a quick update: we are using the on-demand TLS setup now, and it works without problems. The delay on the first request is quite short, as it only takes a few seconds to generate the certificate. This method is also very convenient for fully automated deployments.

Thank you for the quick and helpful replies!

I am not sure what you mean exactly, but you want a configuration for manually-managed certs to automatically manage a cert? Can you clarify?
