On-Demand TLS does nothing if the ask endpoint takes too long

1. Caddy version (caddy version):

v2.4.6 h1:HGkGICFGvyrodcqOOclHKfvJC0qTU7vny/7FhYp9hNw=

2. How I run Caddy:

caddy_windows_amd64_custom.exe run

a. System environment:

Windows 10

b. Command:

caddy_windows_amd64_custom.exe run

d. My complete Caddyfile or JSON config:

{
    on_demand_tls {
        ask http://localhost:3000/caddy/allowed-domain
    }
}

https:// {
    tls {
        on_demand
    }

    # Proxy to the main server
    reverse_proxy / http://127.0.0.1:3000 {
        # Add headers any proxy would expect
        header_up X-Real-IP {remote}
        header_up X-Forwarded-For {remote}
        header_up X-Forwarded-Port {server_port}

        # We should never take more than 5s to load
        health_timeout 5s
    }
}

https://*.discordpage.com {
	tls security@obelous.com {
	    dns cloudflare MY_CLOUDFLARE_TOKEN
	}

	reverse_proxy / http://127.0.0.1:3000 {
		# Confirm the request came from our Caddy proxy
		#header_upstream PageHost {host}

		# Add headers any proxy would expect
		header_up X-Real-IP {remote}
		header_up X-Forwarded-For {remote}
		header_up X-Forwarded-Port {server_port}

		# We should never take more than 5s to load
		health_timeout 5s
	}
}

3. The problem I’m having:

I am using on-demand TLS for my customers. However, my “ask” endpoint takes around 10 seconds to complete, which causes Caddy to do absolutely nothing: no messages about trying to issue certs, no failed challenges, nothing.

However, upon changing my endpoint to return 200 instantly, on-demand TLS works just fine.

4. Error messages and/or full log output:

C:\Users\Casper\Pictures\UbisoftConnect>caddy_windows_amd64_custom.exe run
2022/01/10 01:46:04.983 INFO   using adjacent Caddyfile
2022/01/10 01:46:04.985 WARN   input is not formatted with 'caddy fmt' {"adapter": "caddyfile", "file": "Caddyfile", "line": 2}
2022/01/10 01:46:04.994 INFO   admin   admin endpoint started  {"address": "tcp/localhost:2019", "enforce_origin": false, "origins": ["[::1]:2019", "127.0.0.1:2019", "localhost:2019"]}
2022/01/10 01:46:04.994 INFO   tls.cache.maintenance   started background certificate maintenance      {"cache": "0xc00037a4d0"}
2022/01/10 01:46:04.995 INFO   http    enabling automatic HTTP->HTTPS redirects        {"server_name": "srv0"}
2022/01/10 01:46:04.997 INFO   tls     cleaning storage unit   {"description": "FileStorage:C:\\Users\\Casper\\AppData\\Roaming\\Caddy"}
2022/01/10 01:46:04.998 INFO   http    enabling automatic TLS certificate management   {"domains": ["*.discordpage.com"]}
2022/01/10 01:46:05.002 INFO   autosaved config (load with --resume flag)      {"file": "C:\\Users\\Casper\\AppData\\Roaming\\Caddy\\autosave.json"}
2022/01/10 01:46:05.002 INFO   serving initial configuration
2022/01/10 01:46:05.002 INFO   tls     finished cleaning storage units

5. What I already tried:

N/A

I will be unreachable for a couple hours, sorry if I do not reply. It is currently 3 AM where I live.

Thank you for understanding.

If you turn on the debug global option, you’ll probably see logs about this. I expect it’ll show up as a TLS handshake failed error.

Caddy has a timeout for the TLS handshake, so if issuance in total takes longer than 3 minutes, then the issuance attempt will be canceled.

I just dug into the code, and I found that there’s a hard-coded 10 second timeout for the request to the ask endpoint.
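To make the effect of a fixed client timeout concrete, here is a minimal Go sketch. It is not Caddy’s actual code: `askAllowed`, `askAgainst`, `demoTimeout` and `demoSlow` are hypothetical names, and the durations are scaled down so the demo finishes quickly. It only assumes that the ask request is a plain HTTP GET with the hostname in a `domain` query parameter and that any 2xx response means "allowed".

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"time"
)

const (
	demoTimeout = 200 * time.Millisecond // scaled-down stand-in for the 10 second limit
	demoSlow    = 1 * time.Second        // stand-in for an ask handler that is too slow
)

// askAllowed sends a GET to askURL with ?domain=<name> and treats any 2xx
// status as approval. A timeout or any other status is returned as an error.
func askAllowed(client *http.Client, askURL, domain string) error {
	resp, err := client.Get(askURL + "?domain=" + url.QueryEscape(domain))
	if err != nil {
		return fmt.Errorf("ask request failed: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("ask endpoint returned status %d", resp.StatusCode)
	}
	return nil
}

// askAgainst starts a throwaway local endpoint that waits for delay before
// answering 200, then queries it with a client limited to timeout.
func askAgainst(timeout, delay time.Duration) error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(delay)
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	client := &http.Client{Timeout: timeout}
	return askAllowed(client, srv.URL, "customer.example.com")
}

func main() {
	// An instant endpoint is approved; a slow one fails even though it
	// would eventually have answered 200.
	fmt.Println("fast endpoint:", askAgainst(demoTimeout, 0))
	fmt.Println("slow endpoint:", askAgainst(demoTimeout, demoSlow))
}
```

The slow endpoint still returns 200 eventually, but the client has already given up by then, so from the caller’s side the domain was never approved.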

What are you doing in your ask endpoint that would cause it to take that long? Really, it should just be a quick database query to check if the domain is found in some allow-list.

Well, I’m doing a check with letsdebug.net to ensure that the domain meets the requirements before returning 200, since I don’t want to try with Let’s Encrypt when it might fail (because of DNS propagation, etc.).

Unless I don’t need that?

Typically what I’ve seen people do is build that into the process of getting your users signed up on your app, then if successful add it to a database. Then your ask endpoint becomes just a fast lookup.
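A rough Go sketch of that split, in case it helps. Everything here is an assumption for illustration: the `allowList` type is an in-memory stand-in for a real database, and the `/caddy/allowed-domain` path is just the one from the Caddyfile above. The slow checks (letsdebug.net or anything else) would run during sign-up and call `Add` on success, so the ask handler only does a lookup.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"sync"
)

// allowList is a hypothetical in-memory stand-in for a database table of
// customer domains that already passed validation during sign-up.
type allowList struct {
	mu      sync.RWMutex
	domains map[string]bool
}

func newAllowList() *allowList {
	return &allowList{domains: make(map[string]bool)}
}

// Add records a domain after the slow onboarding checks have succeeded.
func (a *allowList) Add(domain string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.domains[domain] = true
}

// Has reports whether the domain was previously approved.
func (a *allowList) Has(domain string) bool {
	a.mu.RLock()
	defer a.mu.RUnlock()
	return a.domains[domain]
}

// askHandler answers the ask request: 200 if the ?domain= value is on the
// allow-list, 403 otherwise. It does no network calls, only a map lookup.
func (a *allowList) askHandler(w http.ResponseWriter, r *http.Request) {
	domain := r.URL.Query().Get("domain")
	if domain != "" && a.Has(domain) {
		w.WriteHeader(http.StatusOK)
		return
	}
	http.Error(w, "domain not allowed", http.StatusForbidden)
}

// askStatus runs the handler in memory and returns the status code it wrote.
func askStatus(a *allowList, domain string) int {
	target := "/caddy/allowed-domain?domain=" + url.QueryEscape(domain)
	req := httptest.NewRequest(http.MethodGet, target, nil)
	rec := httptest.NewRecorder()
	a.askHandler(rec, req)
	return rec.Code
}

func main() {
	list := newAllowList()
	list.Add("customer.example.com") // done once, at sign-up time

	fmt.Println(askStatus(list, "customer.example.com")) // 200
	fmt.Println(askStatus(list, "unknown.example.com"))  // 403
}
```

In a real app the handler would be registered with something like `http.HandleFunc("/caddy/allowed-domain", list.askHandler)` and the map replaced by an indexed database query.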

Ah, okay. I’ll redo it that way then.

Yeah, don’t try and be too clever. Leave the validation to the CA and just check the hostname for acceptability.