AWS Global Accelerator - Reverse proxy certificates not issuing

1. Caddy version (caddy version):

v2.4.0

2. How I run Caddy:

On EC2 servers behind a load balancer (global accelerator)

a. System environment:

Ubuntu 20.04

b. Command:

systemctl start caddy

d. My complete Caddyfile or JSON config:

# This is tunneled through a local server so I can test things;
# the environment variable is set to: CADDY_PROXY_TARGET=ba07514839c9.ngrok.io
{
    on_demand_tls {
        ask https://{$CADDY_PROXY_TARGET}/user-domain-check
    }

    storage dynamodb caddy_ssl_certificates
}

:80 {
    respond /health "Im healthy!" 200
}

:443 {
    tls jack@amplify.link {
        on_demand
    }

    reverse_proxy https://{$CADDY_PROXY_TARGET} {
        header_up Host {$CADDY_PROXY_TARGET}
        header_up User-Custom-Domain {host}
        header_up X-Forwarded-Port {server_port}
        health_timeout 5s
    }
}
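For context, the `ask` endpoint used by `on_demand_tls` only needs to return HTTP 200 when the requested hostname is allowed; any other status denies issuance. Caddy appends the hostname as a `domain` query parameter. A minimal sketch of such an endpoint (the `/user-domain-check` path comes from the config above; the handler and allow-list are hypothetical stand-ins for a real database lookup):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

# Hypothetical allow-list; in practice this would be a database lookup.
ALLOWED_DOMAINS = {"jstowey.co.uk"}

def is_allowed(query_string: str) -> bool:
    """Return True if the 'domain' query parameter is on the allow-list.

    Caddy calls the ask endpoint as: GET /user-domain-check?domain=<hostname>
    """
    params = parse_qs(query_string)
    domains = params.get("domain", [])
    return bool(domains) and domains[0] in ALLOWED_DOMAINS

class DomainCheckHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        if url.path == "/user-domain-check" and is_allowed(url.query):
            self.send_response(200)  # 200 tells Caddy to issue a certificate
        else:
            self.send_response(403)  # any non-2xx status denies issuance
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), DomainCheckHandler).serve_forever()
```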

3. The problem I’m having:

When I point the domain I want to use with Caddy at the AWS Global Accelerator static IP addresses, certificate generation fails. However, if I point the domain directly at the EC2 instance (bypassing the load balancer), it works correctly. Since I have customers in different parts of the world, I'm trying to use the load balancer so that customers can point at it without having to know the IP addresses of the individual servers.

4. Error messages and/or full log output:

This is the output when I curl https://jstowey.co.uk while the domain is pointed at the load balancer:

Jul 28 22:24:54 ip-172-31-28-236 caddy[19723]: {"level":"info","ts":1627511094.9651167,"logger":"tls.obtain","msg":"acquiring lock","identifier":"jstowey.co.uk"}
Jul 28 22:24:54 ip-172-31-28-236 caddy[19723]: {"level":"info","ts":1627511094.9727457,"logger":"tls.obtain","msg":"lock acquired","identifier":"jstowey.co.uk"}
Jul 28 22:24:54 ip-172-31-28-236 caddy[19723]: {"level":"info","ts":1627511094.9833117,"logger":"tls.issuance.acme","msg":"waiting on internal rate limiter","identifiers":["jstowey.co.uk"]}
Jul 28 22:24:54 ip-172-31-28-236 caddy[19723]: {"level":"info","ts":1627511094.983341,"logger":"tls.issuance.acme","msg":"done waiting on internal rate limiter","identifiers":["jstowey.co.uk"]}
Jul 28 22:24:55 ip-172-31-28-236 caddy[19723]: {"level":"info","ts":1627511095.7641287,"logger":"tls.issuance.acme","msg":"waiting on internal rate limiter","identifiers":["jstowey.co.uk"]}
Jul 28 22:24:55 ip-172-31-28-236 caddy[19723]: {"level":"info","ts":1627511095.764165,"logger":"tls.issuance.acme","msg":"done waiting on internal rate limiter","identifiers":["jstowey.co.uk"]}
Jul 28 22:24:57 ip-172-31-28-236 caddy[19723]: {"level":"info","ts":1627511097.1118493,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"jstowey.co.uk","challenge_type":"http-01","ca":"https://acme.zerossl.com/v2/DV90"}
Jul 28 22:26:24 ip-172-31-28-236 caddy[19723]: {"level":"warn","ts":1627511184.9694316,"logger":"tls.issuance.acme.acme_client","msg":"HTTP request failed; retrying","url":"https://acme.zerossl.com/v2/DV90/authz/WWZNRn-zH9cDUpV2zgZksg","error":"performing request: Post \"https://acme.zerossl.com/v2/DV90/authz/WWZNRn-zH9cDUpV2zgZksg\": context deadline exceeded"}
Jul 28 22:26:24 ip-172-31-28-236 caddy[19723]: {"level":"error","ts":1627511184.9694834,"logger":"tls.issuance.acme.acme_client","msg":"deactivating authorization","identifier":"jstowey.co.uk","authz":"https://acme.zerossl.com/v2/DV90/authz/WWZNRn-zH9cDUpV2zgZksg","error":"request to https://acme.zerossl.com/v2/DV90/authz/WWZNRn-zH9cDUpV2zgZksg failed after 1 attempts: context deadline exceeded"}
Jul 28 22:26:24 ip-172-31-28-236 caddy[19723]: {"level":"error","ts":1627511184.969506,"logger":"tls.obtain","msg":"will retry","error":"[jstowey.co.uk] Obtain: [jstowey.co.uk] solving challenges: [jstowey.co.uk] context deadline exceeded (order=https://acme.zerossl.com/v2/DV90/order/2usq0fU0rF-vN051crVTVA) (ca=https://acme.zerossl.com/v2/DV90)","attempt":1,"retrying_in":60,"elapsed":89.996702151,"max_duration":2592000}
Jul 28 22:26:24 ip-172-31-28-236 caddy[19723]: {"level":"info","ts":1627511184.9695163,"logger":"tls.obtain","msg":"releasing lock","identifier":"jstowey.co.uk"}

5. What I already tried:

I’ve tried a number of different configurations, but I can’t tell much from what the logs show. I’ve also tried updating the hostname of both my EC2 servers to the Global Accelerator DNS name, but that didn’t seem to fix anything.

6. Links to relevant resources:

How to add unlimited custom domains to Laravel Vapor | Laravel News - I followed this guide to get to this point, and it works for the most part; it’s just this final hurdle that’s causing issues.

On Demand SSL on ports 80,443 with health checks? - I reviewed this topic for answers, but it looks like a slightly different issue; my health checks come back fine.

That’s strange. It looks like Caddy’s request to ZeroSSL timed out, but there doesn’t appear to be any service disruption: https://status.zerossl.com/
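A quick way to rule out outbound connectivity problems from the instance itself (a hypothetical check, using the two ACME endpoints from your logs) would be:

```
# Run on the EC2 instance; each should print an HTTP status code
# rather than timing out:
curl -sS -o /dev/null -w '%{http_code}\n' --max-time 10 https://acme.zerossl.com/v2/DV90
curl -sS -o /dev/null -w '%{http_code}\n' --max-time 10 https://acme-v02.api.letsencrypt.org/directory
```

A `context deadline exceeded` error like the one in your logs means Caddy's own outbound HTTPS request to the CA timed out, which is a different failure mode from the CA being unable to reach you for the challenge.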

Caddy should fallback to Let’s Encrypt if there’s a problem with ZeroSSL though. Can you post your full logs?

Also, you might want to upgrade to v2.4.3; there have been some fixes since (none of which should affect issuance here, I think, but it’s still worth upgrading).

Thanks for taking the time to look at this, appreciate it.

Here’s a link to the logs with debug mode enabled (I assume that’s what you mean, apologies if not).

https://pastebin.com/raw/Y4AfhUU4

I haven’t upgraded yet, but will try that if nothing glaringly obvious shows up in the logs.

Thanks

Okay, found the issue. It had nothing to do with Caddy and everything to do with how AWS was configured. I re-read the solution found here: https://caddy.community/t/on-demand-ssl-on-ports-80-443-with-health-checks/8312/17

I checked my configuration again and could see that the load balancer was only listening on port 443, not port 80, so it couldn’t route requests properly. I’ve now added another endpoint group, as described in the topic above, and it works!
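For anyone hitting the same thing: the fix amounts to making sure the accelerator has a TCP listener (and a matching endpoint group) for port 80 as well as 443, so the HTTP-01 challenge can reach Caddy. Roughly, with the AWS CLI (the ARNs, region, and endpoint ID below are placeholders, not values from this thread):

```
aws globalaccelerator create-listener \
    --accelerator-arn arn:aws:globalaccelerator::123456789012:accelerator/example \
    --protocol TCP \
    --port-ranges FromPort=80,ToPort=80

aws globalaccelerator create-endpoint-group \
    --listener-arn <listener-arn-from-previous-command> \
    --endpoint-group-region <region-of-your-ec2-instances> \
    --endpoint-configurations EndpointId=<ec2-instance-id>,Weight=100
```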

Thanks for the help :slight_smile:


Great, thanks for following up!

Looking at your logs, I’m finding it strange that the TLS-ALPN challenge also failed when it was attempted with Let’s Encrypt. That should’ve worked for you regardless of this port 80 issue, because that challenge is done over port 443.

I do see these logs:

Jul 29 08:17:54 aee677f33d81d9e28 caddy[20905]: {"level":"info","ts":1627546674.0869188,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"jstowey.co.uk","challenge_type":"tls-alpn-01","ca":"https://acme-v02.api.letsencrypt.org/directory"}
Jul 29 08:17:54 aee677f33d81d9e28 caddy[20905]: {"level":"debug","ts":1627546674.0914476,"logger":"http.stdlib","msg":"http: TLS handshake error from 127.0.0.1:49666: EOF"}

TLS handshake error: EOF usually means the client closed the connection before the end of the handshake. This could be for any number of reasons, as determined by the client.

It’s confusing because these messages soon after seem to indicate it worked fine. The EOF might be a red herring, from a request that wasn’t from Let’s Encrypt at all.

Jul 29 08:17:54 aee677f33d81d9e28 caddy[20905]: {"level":"debug","ts":1627546674.2602398,"logger":"tls.issuance.acme.acme_client","msg":"challenge accepted","identifier":"jstowey.co.uk","challenge_type":"tls-alpn-01"}
Jul 29 08:17:54 aee677f33d81d9e28 caddy[20905]: {"level":"info","ts":1627546674.395267,"logger":"tls","msg":"served key authentication certificate","server_name":"jstowey.co.uk","challenge":"tls-alpn-01","remote":"18.192.36.99:17908","distributed":false}

Then, Caddy polls a few times for the order status, and gets this message:

Jul 29 08:17:55 aee677f33d81d9e28 caddy[20905]: {"level":"error","ts":1627546675.8485832,"logger":"tls.issuance.acme.acme_client","msg":"challenge failed","identifier":"jstowey.co.uk","challenge_type":"tls-alpn-01","status_code":400,"problem_type":"urn:ietf:params:acme:error:tls","error":"remote error: tls: internal error"}

This is an error coming from Let’s Encrypt, which is unfortunately nondescript.

Anyway, if you could keep an eye on your logs and watch for tls-alpn in them, hopefully you get successful issuances with that challenge from now on. Both the HTTP and TLS-ALPN challenges should work. If they don’t, I’d like to try to understand why.
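If you want to sanity-check the TLS-ALPN path through the accelerator yourself, you can open a connection to port 443 that offers the `acme-tls/1` ALPN protocol, the same value the CA’s validation server sends. A hypothetical probe (not something from this thread) using Python’s `ssl` module:

```python
import socket
import ssl
from typing import Optional

def build_probe_context() -> ssl.SSLContext:
    """Build a client context that offers the ACME TLS-ALPN protocol."""
    ctx = ssl.create_default_context()
    # The challenge certificate is self-signed, so skip normal
    # verification for this diagnostic probe only.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.set_alpn_protocols(["acme-tls/1"])
    return ctx

def probe(host: str, port: int = 443, timeout: float = 5.0) -> Optional[str]:
    """Connect to host:port offering acme-tls/1; return the negotiated
    ALPN protocol, or None if the server declined it."""
    ctx = build_probe_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()

if __name__ == "__main__":
    print(probe("jstowey.co.uk"))
```

One caveat: Caddy only answers the `acme-tls/1` handshake while a challenge for that name is actually in flight, so outside a challenge window you’d expect the handshake to fail or the probe to return None. It mainly confirms that the TCP/TLS path through the accelerator to Caddy is open.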


This topic was automatically closed after 30 days. New replies are no longer allowed.