502 Errors with frontend/backend/nextcloud

1. Caddy version (caddy version):

v2.4.6 on both the frontend and the backend

2. How I run Caddy:

eastcapitol.us is an external domain; y8s.casa is purely internal.
Dynamic DNS points my domain test.eastcapitol.us at my home IP.
An OPNsense firewall forwards ports 80 and 443 to caddy.y8s.casa.
caddy.y8s.casa's Caddyfile reverse-proxies to https://cloud.y8s.casa.
cloud.y8s.casa (aka test.eastcapitol.us) runs Nextcloud per the install instructions elsewhere on this site.
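
So the request path is roughly:

client --HTTPS--> test.eastcapitol.us (frontend Caddy; public cert)
frontend --reverse_proxy over HTTPS--> cloud.y8s.casa (backend Caddy; cert issued by the frontend's acme_server)
backend --php_fastcgi--> Nextcloud (PHP-FPM on 127.0.0.1:9000)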

a. System environment:

Frontend: Debian 10 (buster)
Backend: Debian 11 (bullseye)
Nextcloud 23.0.0.10 on the backend
PHP 8.0 running FPM

b. Command:

N/A, automated

c. Service/unit/compose file:

[Unit]
Description=Caddy
Documentation=https://caddyserver.com/docs/
After=network.target network-online.target
Requires=network-online.target

[Service]
Type=notify
User=caddy
Group=caddy
ExecStart=/usr/bin/caddy run --environ --config /etc/caddy/Caddyfile
ExecReload=/usr/bin/caddy reload --config /etc/caddy/Caddyfile
TimeoutStopSec=5s
LimitNOFILE=1048576
LimitNPROC=512
PrivateTmp=true
ProtectSystem=full
AmbientCapabilities=CAP_NET_BIND_SERVICE

[Install]
WantedBy=multi-user.target


d. My complete Caddyfile or JSON config:

Frontend Caddyfile:

{
        debug
        email email@example.com
}

caddy.y8s.casa {
        acme_server
        tls internal
}

test.eastcapitol.us {
        reverse_proxy https://cloud.y8s.casa {
                header_up Host {http.reverse_proxy.upstream.hostport}
                header_up X-Forwarded-Host {host}
        }
}

Backend Caddyfile:

{
	debug
	acme_ca https://caddy.y8s.casa/acme/local/directory
	acme_ca_root /etc/ssl/certs/root.crt
}

cloud.y8s.casa {
	#        tls {
	#                ca https://caddy.y8s.casa/acme/local/directory
	#                ca_root /etc/ssl/certs/root.crt
	#        }

	root * /var/www/nextcloud
	file_server
	log {
		output file /var/log/caddy/nextcloud.log
		format single_field common_log
	}

	php_fastcgi 127.0.0.1:9000 {
		env PATH /bin
	}

	header {
		# enable HSTS
		Strict-Transport-Security max-age=31536000;
		# from nextcloud hardening guide
	}

	redir /.well-known/carddav /remote.php/dav 301
	redir /.well-known/caldav /remote.php/dav 301

	# .htaccess / data / config / ... shouldn't be accessible from outside
	@forbidden {
		path /.htaccess
		path /data/*
		path /config/*
		path /db_structure
		path /.xml
		path /README
		path /3rdparty/*
		path /lib/*
		path /templates/*
		path /occ
		path /console.php
	}

	respond @forbidden 404
}

3. The problem I’m having:

My Nextcloud app started occasionally reporting "error uploading, bad gateway", and eventually it became a permanent issue.
At this point I can no longer visit test.eastcapitol.us at all; it returns a 502 from every location (home, away, or via mobile network).
curl -v shows the same thing: 502 Bad Gateway.

4. Error messages and/or full log output:

Here is what I get when I run curl -v https://test.eastcapitol.us.
journalctl -fu caddy on the frontend at the same time:

Feb 17 14:37:06 iot caddy[17628]: {"level":"debug","ts":1645126626.2382948,"logger":"http.handlers.reverse_proxy","msg":"upstream roundtrip","upstream":"cloud.y8s.casa:443","request":{"remote_addr":"173.8.14.69:59945","proto":"HTTP/1.1","method":"GET","host":"cloud.y8s.casa:443","uri":"/","headers":{"X-Forwarded-Host":["test.eastcapitol.us"],"X-Forwarded-Proto":["https"],"X-Forwarded-For":["173.8.14.69"],"User-Agent":["curl/7.79.1"],"Accept":["*/*"]},"tls":{"resumed":false,"version":771,"cipher_suite":49196,"proto":"http/1.1","proto_mutual":true,"server_name":"test.eastcapitol.us"}},"duration":0.031191595,"error":"x509: certificate signed by unknown authority (possibly because of \"x509: ECDSA verification failure\" while trying to verify candidate authority certificate \"Caddy Local Authority - 2022 ECC Root\")"}
Feb 17 14:37:06 iot caddy[17628]: {"level":"error","ts":1645126626.2415526,"logger":"http.log.error","msg":"x509: certificate signed by unknown authority (possibly because of \"x509: ECDSA verification failure\" while trying to verify candidate authority certificate \"Caddy Local Authority - 2022 ECC Root\")","request":{"remote_addr":"173.8.14.69:59945","proto":"HTTP/1.1","method":"GET","host":"test.eastcapitol.us","uri":"/","headers":{"User-Agent":["curl/7.79.1"],"Accept":["*/*"]},"tls":{"resumed":false,"version":771,"cipher_suite":49196,"proto":"http/1.1","proto_mutual":true,"server_name":"test.eastcapitol.us"}},"duration":0.034522308,"status":502,"err_id":"kpbi7sdpm","err_trace":"reverseproxy.statusError (reverseproxy.go:861)"}
Feb 17 14:37:06 iot caddy[17628]: {"level":"error","ts":1645126626.2425354,"logger":"http.log.access","msg":"handled request","request":{"remote_addr":"173.8.14.69:59945","proto":"HTTP/1.1","method":"GET","host":"test.eastcapitol.us","uri":"/","headers":{"User-Agent":["curl/7.79.1"],"Accept":["*/*"]},"tls":{"resumed":false,"version":771,"cipher_suite":49196,"proto":"http/1.1","proto_mutual":true,"server_name":"test.eastcapitol.us"}},"common_log":"173.8.14.69 - - [17/Feb/2022:14:37:06 -0500] \"GET / HTTP/1.1\" 502 0","duration":0.034522308,"size":0,"status":502,"resp_headers":{"Server":["Caddy"]}}

simultaneous journalctl -fu caddy on the backend:

Feb 17 14:37:06 cloud caddy[1629]: {"level":"debug","ts":1645126626.2108588,"logger":"tls.handshake","msg":"choosing certificate","identifier":"cloud.y8s.casa","num_choices":1}
Feb 17 14:37:06 cloud caddy[1629]: {"level":"debug","ts":1645126626.211041,"logger":"tls.handshake","msg":"default certificate selection results","identifier":"cloud.y8s.casa","subjects":["cloud.y8s.casa"],"managed":true,"issuer_key":"caddy.y8s.casa-acme-local-directory","hash":"97cf8fafc44f629267c7c1ae14f7e58bd12a15e56db9a391390d9319e616e88c"}
Feb 17 14:37:06 cloud caddy[1629]: {"level":"debug","ts":1645126626.2110798,"logger":"tls.handshake","msg":"matched certificate in cache","subjects":["cloud.y8s.casa"],"managed":true,"expiration":1645150671,"hash":"97cf8fafc44f629267c7c1ae14f7e58bd12a15e56db9a391390d9319e616e88c"}
Feb 17 14:37:06 cloud caddy[1629]: {"level":"debug","ts":1645126626.2340019,"logger":"http.stdlib","msg":"http: TLS handshake error from 10.10.10.40:53414: remote error: tls: bad certificate"}

5. What I already tried:

  • I have cleared out the certs on both machines:
    frontend:
    /var/lib/caddy/.local/share/caddy/pki/authorities/local
    /var/lib/caddy/.local/share/caddy/certificates
    backend:
    /etc/ssl/certs/root.crt

  • Reloaded Caddy and re-copied root.crt from /var/lib/caddy/.local/share/caddy/pki/authorities/local on the frontend to /etc/ssl/certs on the backend.

  • I have fiddled around with Nextcloud's config.php URLs, trusted proxies, overwrites, etc.

  • I have created temporary site names that use only Caddy's respond directive (and got nothing back).

  • I have restarted php8.0-fpm and Caddy a zillion times and rebooted both machines.

My best guess is that the backend is having trouble obtaining a correct cert from the frontend, but I'm not sure what else to do, or whether I failed to properly remove all the existing certs so Caddy would renew them (roughly what I ran is sketched below).
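
From memory, the clear-and-recopy sequence looked roughly like this (a sketch, not an exact record):

# on both machines
sudo systemctl stop caddy

# frontend: clear the local CA and any issued certs
sudo rm -rf /var/lib/caddy/.local/share/caddy/pki/authorities/local
sudo rm -rf /var/lib/caddy/.local/share/caddy/certificates

# backend: remove the copied root
sudo rm /etc/ssl/certs/root.crt

# start the frontend first so it regenerates its root/intermediate, then copy the new
# root.crt from the frontend's /var/lib/caddy/.local/share/caddy/pki/authorities/local/
# to /etc/ssl/certs/root.crt on the backend, and finally start the backend
sudo systemctl start caddy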

6. Links to relevant resources:

Final note

Other things running on the frontend were able to obtain certs just fine. For example, with Home Assistant running behind the following config, it works great:

test.eastcapitol.us {
  reverse_proxy localhost:8123
}

I can also obtain certs for internal-only domains all day long using Hetzner DNS challenges.
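
For example, something along these lines works for me (a sketch; the hostname is made up, and it assumes a Caddy build that includes the caddy-dns/hetzner plugin):

internal.y8s.casa {
	tls {
		dns hetzner {env.HETZNER_API_TOKEN}
	}
	respond "internal only" 200
}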

Thanks for your help!

I think you might be running into this issue: the intermediate cert isn't reloaded by acme_server after it gets renewed, so (for now) you'll need to periodically force-reload your frontend Caddy to make sure it has the most recent one loaded.

Interesting. So how do I fix it (even temporarily) in the meantime?

I just did a caddy reload --force --config /etc/caddy/Caddyfile but still see the 502.

Certs are all current.

Frontend:

4 drwx------ 2 caddy caddy 4096 Feb 16 17:21 .
4 drwx------ 3 caddy caddy 4096 Feb 16 17:04 ..
4 -rw------- 1 caddy caddy  680 Feb 16 17:21 intermediate.crt
4 -rw------- 1 caddy caddy  227 Feb 16 17:21 intermediate.key
4 -rw------- 1 caddy caddy  627 Feb 16 17:21 root.crt
4 -rw------- 1 caddy caddy  227 Feb 16 17:21 root.key

Backend:

4 -rw-r--r-- 1 root root 627 Feb 16 17:21 root.crt

You’ll probably need to delete the cert on the backend to force it to get reissued, since it might have been issued by an expired intermediate cert.

To be clear, the problem is that, in memory, the acme_server directive "remembers" the old intermediate cert after it gets renewed. What's on disk will always look correct; the bug is in memory. Reloading the front Caddy forces acme_server to reload the intermediate cert from storage.

So let me see if I can write this out as instructions:

  1. Delete the root.crt from the backend.
  2. Restart the backend Caddy to reissue.
  3. Re-copy the root cert from the frontend to the backend.
  4. Force-reload the frontend Caddy to load the correct intermediate cert.

I just want to get this right as it seems the order matters.

Yep (to steps 1 and 2).

(Re: step 3) Probably not necessary, unless you deleted the root CA from the frontend and caused it to be re-issued. It has a 5-year lifetime, IIRC.

(Re: step 4) Yes, and do this daily in a cron job or something. Intermediate certs have a 7-day lifetime and get renewed after approximately 4.66 days (2/3 of the lifetime). Leaf certs have an even shorter lifetime. So as long as you reload within a day of the intermediate expiring and being renewed, you should be fine.
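
For example, a crontab entry on the frontend along these lines would cover it (a sketch; adjust paths to your install):

# /etc/cron.d/caddy-reload (hypothetical file): force-reload the frontend once a day
# so acme_server picks up the renewed intermediate from storage
0 4 * * * root /usr/bin/caddy reload --force --config /etc/caddy/Caddyfile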

Hold up.
In the tutorial I have to manually copy the root.crt from the frontend to the backend /etc/ssl/certs dir. If I remove it in step 1 above, why wouldn’t I have to recopy it?

I’m still missing something here. Maybe if you provide the steps instead of me, I’ll have an easier time. I tried what I thought was correct and am still getting the same TLS handshake error with “bad certificate”.

All the intermediate and root certs are refreshed, and the root.crt files on both machines match. I have force-reloaded and restarted Caddy on both machines.

Thanks again!


Oh sorry, misread that.

What I was saying is delete the leaf certs from the backend, not the root cert. The root cert will remain good. It’s the leaf certs that need to get re-issued. Those will be in the acme directory in Caddy’s storage.


so do I just obliterate the contents of the acme dir and reload?

Yeah, on the backend.
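
Something like this should do it (a rough sketch, assuming the default storage location from your unit file):

# on the backend only: stop Caddy, clear the issued certs and ACME account data, start again
sudo systemctl stop caddy
sudo rm -rf /var/lib/caddy/.local/share/caddy/acme
sudo rm -rf /var/lib/caddy/.local/share/caddy/certificates
sudo systemctl start caddy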

Ok, I did, but they have not reappeared after restarting everything.

What’s in your logs?

Frontend:

Feb 18 09:33:38 iot caddy[23373]: {"level":"debug","ts":1645194818.0479314,"logger":"http.handlers.reverse_proxy","msg":"upstream roundtrip","upstream":"cloud.y8s.casa:443","request":{"remote_addr":"173.8.14.69:61151","proto":"HTTP/1.1","method":"GET","host":"cloud.y8s.casa:443","uri":"/","headers":{"X-Forwarded-Proto":["https"],"X-Forwarded-For":["173.8.14.69"],"User-Agent":["curl/7.79.1"],"Accept":["*/*"],"X-Forwarded-Host":["test.eastcapitol.us"]},"tls":{"resumed":false,"version":771,"cipher_suite":49196,"proto":"http/1.1","proto_mutual":true,"server_name":"test.eastcapitol.us"}},"duration":0.019397671,"error":"x509: certificate has expired or is not yet valid: current time 2022-02-18T09:33:38-05:00 is after 2022-02-18T10:25:54Z"}
Feb 18 09:33:38 iot caddy[23373]: {"level":"error","ts":1645194818.0482728,"logger":"http.log.error","msg":"x509: certificate has expired or is not yet valid: current time 2022-02-18T09:33:38-05:00 is after 2022-02-18T10:25:54Z","request":{"remote_addr":"173.8.14.69:61151","proto":"HTTP/1.1","method":"GET","host":"test.eastcapitol.us","uri":"/","headers":{"User-Agent":["curl/7.79.1"],"Accept":["*/*"]},"tls":{"resumed":false,"version":771,"cipher_suite":49196,"proto":"http/1.1","proto_mutual":true,"server_name":"test.eastcapitol.us"}},"duration":0.020228262,"status":502,"err_id":"bazyprk9m","err_trace":"reverseproxy.statusError (reverseproxy.go:861)"}

Backend:

Feb 18 09:33:38 cloud caddy[1620]: {"level":"debug","ts":1645194818.0468225,"logger":"tls.handshake","msg":"choosing certificate","identifier":"cloud.y8s.casa","num_choices":1}
Feb 18 09:33:38 cloud caddy[1620]: {"level":"debug","ts":1645194818.0472898,"logger":"tls.handshake","msg":"default certificate selection results","identifier":"cloud.y8s.casa","subjects":["cloud.y8s.casa"],"managed":true,"issuer_key":"caddy.y8s.casa-acme-local-directory","hash":"d99effb6eea719a3624529d0b184b95c0e1dd28d7973f41ab71fd37a324d5ec8"}
Feb 18 09:33:38 cloud caddy[1620]: {"level":"debug","ts":1645194818.0473847,"logger":"tls.handshake","msg":"matched certificate in cache","subjects":["cloud.y8s.casa"],"managed":true,"expiration":1645179954,"hash":"d99effb6eea719a3624529d0b184b95c0e1dd28d7973f41ab71fd37a324d5ec8"}
Feb 18 09:33:38 cloud caddy[1620]: {"level":"debug","ts":1645194818.056496,"logger":"http.stdlib","msg":"http: TLS handshake error from 10.10.10.40:37894: remote error: tls: bad certificate"}

So I don't have to delete any of the root or intermediate certs again, correct?
What about /var/lib/caddy/.local/share/caddy/acme on the frontend?

Maybe drastic, but is there any harm in just removing /var/lib/caddy/* on both machines and reissuing everything?

Yeah, you should only need to remove stuff from the backend. It's the backend's TLS certificate that's expired. That's strange, though. Please read back through your backend's logs and find the last time it issued a certificate for that domain. When did it do it? Were there any errors during issuance?

Here’s what a restart of the service on the backend looks like:

Feb 18 10:10:32 cloud systemd[1]: Starting Caddy...
Feb 18 10:10:33 cloud caddy[5464]: caddy.HomeDir=/var/lib/caddy
Feb 18 10:10:33 cloud caddy[5464]: caddy.AppDataDir=/var/lib/caddy/.local/share/caddy
Feb 18 10:10:33 cloud caddy[5464]: caddy.AppConfigDir=/var/lib/caddy/.config/caddy
Feb 18 10:10:33 cloud caddy[5464]: caddy.ConfigAutosavePath=/var/lib/caddy/.config/caddy/autosave.json
Feb 18 10:10:33 cloud caddy[5464]: caddy.Version=v2.4.6 h1:HGkGICFGvyrodcqOOclHKfvJC0qTU7vny/7FhYp9hNw=
Feb 18 10:10:33 cloud caddy[5464]: runtime.GOOS=linux
Feb 18 10:10:33 cloud caddy[5464]: runtime.GOARCH=amd64
Feb 18 10:10:33 cloud caddy[5464]: runtime.Compiler=gc
Feb 18 10:10:33 cloud caddy[5464]: runtime.NumCPU=2
Feb 18 10:10:33 cloud caddy[5464]: runtime.GOMAXPROCS=2
Feb 18 10:10:33 cloud caddy[5464]: runtime.Version=go1.17.2
Feb 18 10:10:33 cloud caddy[5464]: os.Getwd=/
Feb 18 10:10:33 cloud caddy[5464]: LANG=en_US.UTF-8
Feb 18 10:10:33 cloud caddy[5464]: PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Feb 18 10:10:33 cloud caddy[5464]: NOTIFY_SOCKET=/run/systemd/notify
Feb 18 10:10:33 cloud caddy[5464]: HOME=/var/lib/caddy
Feb 18 10:10:33 cloud caddy[5464]: LOGNAME=caddy
Feb 18 10:10:33 cloud caddy[5464]: USER=caddy
Feb 18 10:10:33 cloud caddy[5464]: INVOCATION_ID=b5a1fb39deb449bba6c270fc82d730d0
Feb 18 10:10:33 cloud caddy[5464]: JOURNAL_STREAM=8:35959
Feb 18 10:10:33 cloud caddy[5464]: {"level":"info","ts":1645197033.0483146,"msg":"using provided configuration","config_file":"/etc/caddy/Caddyfile","config_adapter":""}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"warn","ts":1645197033.0636997,"msg":"input is not formatted with 'caddy fmt'","adapter":"caddyfile","file":"/etc/caddy/Caddyfile","line":7}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"warn","ts":1645197033.067639,"logger":"caddy.logging.encoders.single_field","msg":"the 'single_field' encoder is deprecated and will be removed soon!"}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"info","ts":1645197033.0703583,"logger":"admin","msg":"admin endpoint started","address":"tcp/localhost:2019","enforce_origin":false,"origins":["localhost:2019","[::1]:2019","127.0.0.1:2019"]}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"info","ts":1645197033.0711641,"logger":"tls.cache.maintenance","msg":"started background certificate maintenance","cache":"0xc0001a87e0"}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"info","ts":1645197033.0722432,"logger":"http","msg":"server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS","server_name":"srv0","https_port":443}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"info","ts":1645197033.0726316,"logger":"http","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv0"}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"info","ts":1645197033.0752678,"logger":"tls","msg":"cleaning storage unit","description":"FileStorage:/var/lib/caddy/.local/share/caddy"}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"info","ts":1645197033.0762537,"logger":"tls","msg":"finished cleaning storage units"}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"debug","ts":1645197033.077,"logger":"http","msg":"starting server loop","address":"[::]:80","http3":false,"tls":false}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"debug","ts":1645197033.0775847,"logger":"http","msg":"starting server loop","address":"[::]:443","http3":false,"tls":true}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"info","ts":1645197033.0779214,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["cloud.y8s.casa"]}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"warn","ts":1645197033.0796323,"logger":"tls","msg":"stapling OCSP","error":"no OCSP stapling for [cloud.y8s.casa]: no OCSP server specified in certificate"}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"debug","ts":1645197033.0800273,"logger":"tls.cache","msg":"added certificate to cache","subjects":["cloud.y8s.casa"],"expiration":1645179954,"managed":true,"issuer_key":"caddy.y8s.casa-acme-local-directory","hash":"d99effb6eea719a3624529d0b184b95c0e1dd28d7973f41ab71fd37a324d5ec8","cache_size":1,"cache_capacity":10000}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"info","ts":1645197033.0811322,"logger":"tls.renew","msg":"acquiring lock","identifier":"cloud.y8s.casa"}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"info","ts":1645197033.1011677,"msg":"autosaved config (load with --resume flag)","file":"/var/lib/caddy/.config/caddy/autosave.json"}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"info","ts":1645197033.1019442,"msg":"serving initial configuration"}
Feb 18 10:10:33 cloud systemd[1]: Started Caddy.
Feb 18 10:10:33 cloud caddy[5464]: {"level":"info","ts":1645197033.1464634,"logger":"tls.renew","msg":"lock acquired","identifier":"cloud.y8s.casa"}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"info","ts":1645197033.147103,"logger":"tls.renew","msg":"renewing certificate","identifier":"cloud.y8s.casa","remaining":-17079.147099447}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"warn","ts":1645197033.1694987,"logger":"tls.issuance.acme.acme_client","msg":"HTTP request failed; retrying","url":"https://caddy.y8s.casa/acme/local/directory","error":"performing request: Get \"https://caddy.y8s.casa/acme/local/directory\": x509: certificate signed by unknown authority (possibly because of \"x509: ECDSA verification failure\" while trying to verify candidate authority certificate \"Caddy Local Authority - 2022 ECC Root\")"}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"warn","ts":1645197033.4378345,"logger":"tls.issuance.acme.acme_client","msg":"HTTP request failed; retrying","url":"https://caddy.y8s.casa/acme/local/directory","error":"performing request: Get \"https://caddy.y8s.casa/acme/local/directory\": x509: certificate signed by unknown authority (possibly because of \"x509: ECDSA verification failure\" while trying to verify candidate authority certificate \"Caddy Local Authority - 2022 ECC Root\")"}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"warn","ts":1645197033.7024932,"logger":"tls.issuance.acme.acme_client","msg":"HTTP request failed; retrying","url":"https://caddy.y8s.casa/acme/local/directory","error":"performing request: Get \"https://caddy.y8s.casa/acme/local/directory\": x509: certificate signed by unknown authority (possibly because of \"x509: ECDSA verification failure\" while trying to verify candidate authority certificate \"Caddy Local Authority - 2022 ECC Root\")"}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"error","ts":1645197033.7027595,"logger":"tls.renew","msg":"could not get certificate from issuer","identifier":"cloud.y8s.casa","issuer":"caddy.y8s.casa-acme-local-directory","error":"registering account [] with server: provisioning client: performing request: Get \"https://caddy.y8s.casa/acme/local/directory\": x509: certificate signed by unknown authority (possibly because of \"x509: ECDSA verification failure\" while trying to verify candidate authority certificate \"Caddy Local Authority - 2022 ECC Root\")"}
Feb 18 10:10:33 cloud caddy[5464]: {"level":"error","ts":1645197033.702909,"logger":"tls.renew","msg":"will retry","error":"[cloud.y8s.casa] Renew: registering account [] with server: provisioning client: performing request: Get \"https://caddy.y8s.casa/acme/local/directory\": x509: certificate signed by unknown authority (possibly because of \"x509: ECDSA verification failure\" while trying to verify candidate authority certificate \"Caddy Local Authority - 2022 ECC Root\")","attempt":1,"retrying_in":60,"elapsed":0.556371264,"max_duration":2592000}
Feb 18 10:11:33 cloud caddy[5464]: {"level":"info","ts":1645197093.7066252,"logger":"tls.renew","msg":"renewing certificate","identifier":"cloud.y8s.casa","remaining":-17139.706064293}
Feb 18 10:11:33 cloud caddy[5464]: {"level":"warn","ts":1645197093.726447,"logger":"tls.issuance.acme.acme_client","msg":"HTTP request failed; retrying","url":"https://caddy.y8s.casa/acme/local/directory","error":"performing request: Get \"https://caddy.y8s.casa/acme/local/directory\": x509: certificate signed by unknown authority (possibly because of \"x509: ECDSA verification failure\" while trying to verify candidate authority certificate \"Caddy Local Authority - 2022 ECC Root\")"}
Feb 18 10:11:33 cloud caddy[5464]: {"level":"warn","ts":1645197093.9925673,"logger":"tls.issuance.acme.acme_client","msg":"HTTP request failed; retrying","url":"https://caddy.y8s.casa/acme/local/directory","error":"performing request: Get \"https://caddy.y8s.casa/acme/local/directory\": x509: certificate signed by unknown authority (possibly because of \"x509: ECDSA verification failure\" while trying to verify candidate authority certificate \"Caddy Local Authority - 2022 ECC Root\")"}
Feb 18 10:11:34 cloud caddy[5464]: {"level":"warn","ts":1645197094.2588966,"logger":"tls.issuance.acme.acme_client","msg":"HTTP request failed; retrying","url":"https://caddy.y8s.casa/acme/local/directory","error":"performing request: Get \"https://caddy.y8s.casa/acme/local/directory\": x509: certificate signed by unknown authority (possibly because of \"x509: ECDSA verification failure\" while trying to verify candidate authority certificate \"Caddy Local Authority - 2022 ECC Root\")"}
Feb 18 10:11:34 cloud caddy[5464]: {"level":"error","ts":1645197094.259805,"logger":"tls.renew","msg":"could not get certificate from issuer","identifier":"cloud.y8s.casa","issuer":"caddy.y8s.casa-acme-local-directory","error":"registering account [] with server: provisioning client: performing request: Get \"https://caddy.y8s.casa/acme/local/directory\": x509: certificate signed by unknown authority (possibly because of \"x509: ECDSA verification failure\" while trying to verify candidate authority certificate \"Caddy Local Authority - 2022 ECC Root\")"}
Feb 18 10:11:34 cloud caddy[5464]: {"level":"error","ts":1645197094.2604587,"logger":"tls.renew","msg":"will retry","error":"[cloud.y8s.casa] Renew: registering account [] with server: provisioning client: performing request: Get \"https://caddy.y8s.casa/acme/local/directory\": x509: certificate signed by unknown authority (possibly because of \"x509: ECDSA verification failure\" while trying to verify candidate authority certificate \"Caddy Local Authority - 2022 ECC Root\")","attempt":2,"retrying_in":120,"elapsed":61.113916498,"max_duration":2592000}

Okay, well, it looks like the backend doesn't trust the cert for the frontend's caddy.y8s.casa domain. Maybe you do need to re-copy the root cert from the frontend to the backend at this point, if at some point you caused the frontend to regenerate its CA.

The root certs are identical. I even diffed them to make sure.

-rw------- 1 caddy caddy 627 Feb 17 20:54 /etc/ssl/certs/root.crt

Permissions/ownership/location are correct? Does it matter?
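
In case a plain diff isn't convincing enough, this is the kind of comparison I can run (a sketch):

# backend copy
openssl x509 -in /etc/ssl/certs/root.crt -noout -fingerprint -sha256
# frontend original
sudo openssl x509 -in /var/lib/caddy/.local/share/caddy/pki/authorities/local/root.crt -noout -fingerprint -sha256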

Sorry for chipping in late.

Since the creation of the guide I have had this issue where the certificate won't renew properly. The ONLY way I could make this work was to delete ~/.local/share/caddy/certificates/acme-v02.api.letsencrypt.org-directory on the front-end and ~/.local/share/caddy/certificates on the back-end. After deleting both, reload the front-end and then the back-end, and you're good until it expires again.
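
Roughly, the sequence is (a sketch; these paths assume caddy start under your own user, so storage lives under your home directory):

# front-end: drop the publicly issued certs
rm -rf ~/.local/share/caddy/certificates/acme-v02.api.letsencrypt.org-directory
caddy reload --config Caddyfile    # or however you normally reload it

# back-end: drop everything issued by the front-end
rm -rf ~/.local/share/caddy/certificates
caddy reload --config Caddyfile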

Back when I wrote the guide I used a test environment running caddy start instead of systemd. I guessed that broke the setup, and my intention was to redo it, but you know… time flies. Since I'm stuck at home now anyway, I decided to give it a spin.

The first thing I tried was to drop caddy start and instead set it up with systemd. This worked until the certificate expired. So I deleted the folders again, but now under /var/lib/caddy.... This didn't work!

I thought I had broken something else, but when I use caddy start, it works again with the old data paths.

Then I noticed the same error message in the logs as found by @francislavoie here

I created a new root.crt and imported it to the back-end, but nothing changed.

Still looking…

Umm, make sure the caddy user has access to read the root CA cert – that might make the difference.
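
A quick way to check (a sketch):

# on the backend: confirm the caddy user can actually open the copied root
sudo -u caddy head -c 100 /etc/ssl/certs/root.crt
ls -l /etc/ssl/certs/root.crt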

All the root certs on both machines are owned by caddy:caddy… I will try Robbert's suggestion.

Edit: caddy start didn't really change things (I used it on both machines).