Caddy v2.7.0-beta.2 "tls internal" renewal

1. The problem I’m having:

Hello,

Since we switched from 2.4.6 to the 2.7.x beta on some boxes, I have noticed an issue with “tls internal” certs for the IP address endpoints.

The certificate files for these internal TLS targets reside in the “local” certificates folder, which looks correct.

/var/lib/caddy/.local/share/caddy
[root@web23 caddy]# find -name "94*"
./certificates/local/94.103.96.188
./certificates/local/94.103.96.188/94.103.96.188.crt
./certificates/local/94.103.96.188/94.103.96.188.key
./certificates/local/94.103.96.188/94.103.96.188.json
[root@web23 caddy]# cd certificates/local
[root@web23 local]# ls -la
total 0
drwx------ 7 caddy caddy  99 Jun 25 10:00 .
drwx------ 5 caddy caddy  97 Oct 28  2021 ..
drwx------ 2 caddy caddy  52 Oct 14  2021 --1
drwx------ 2 caddy caddy  70 Oct 14  2021 127.0.0.1
drwx------ 2 caddy caddy 100 Oct 14  2021 2a00-a500-0-96--188
drwx------ 2 caddy caddy  82 Oct 14  2021 94.103.96.188
drwx------ 2 caddy caddy  70 Jun 25 10:00 localhost

However, when Caddy wants to renew them (every 12h), it seems to try renewing them through LE / ZeroSSL, which fails because the subjects do not qualify for public certs.
I also see some on_demand log lines about these certificates, even though I’ve explicitly set “tls internal” for them (without the on_demand option).

{"level":"info","ts":1687649361.350319,"logger":"tls.cache.maintenance","msg":"certificate expires soon; queuing for renewal","identifiers":["2a00:a500:0:96::188"],"remaining":13829.649681449}
{"level":"info","ts":1687662561.3478105,"logger":"tls.cache.maintenance","msg":"certificate expires soon; queuing for renewal","identifiers":["94.103.96.188"],"remaining":13941.652192865}
{"level":"info","ts":1687648816.401833,"logger":"tls.on_demand","msg":"obtaining new certificate","remote_ip":"2620:96:e000:b0cc:e:2:2:6","remote_port":"34478","server_name":"2a00:a500:0:96::188"}
{"level":"info","ts":1687648816.4040043,"logger":"tls.obtain","msg":"acquiring lock","identifier":"2a00:a500:0:96::188"}
{"level":"info","ts":1687648816.4086325,"logger":"tls.obtain","msg":"lock acquired","identifier":"2a00:a500:0:96::188"}
{"level":"info","ts":1687648816.408801,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"2a00:a500:0:96::188"}
{"level":"error","ts":1687648816.4125872,"logger":"tls.obtain","msg":"will retry","error":"[2a00:a500:0:96::188] Obtain: subject does not qualify for a public certificate: 2a00:a500:0:96::188","attempt":1,"retrying_in":60,"elapsed":0.003934779,"max_durati$
{"level":"info","ts":1687648876.4133291,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"2a00:a500:0:96::188"}
{"level":"error","ts":1687648876.4137444,"logger":"tls.obtain","msg":"will retry","error":"[2a00:a500:0:96::188] Obtain: subject does not qualify for a public certificate: 2a00:a500:0:96::188","attempt":2,"retrying_in":120,"elapsed":60.005093677,"max_dura$
{"level":"info","ts":1687648996.402827,"logger":"tls.obtain","msg":"releasing lock","identifier":"2a00:a500:0:96::188"}
{"level":"error","ts":1687648996.4030983,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["2a00:a500:0:96::188"],"not_after":1687663191,"error":"context canceled"}
{"level":"error","ts":1687648996.404658,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["2a00:a500:0:96::188"],"not_after":1687663191,"error":"no matching certificate to load for : open /var/lib/caddy/.local/share/caddy/$
{"level":"info","ts":1687648998.1338716,"logger":"tls.on_demand","msg":"obtaining new certificate","remote_ip":"2602:80d:1000:b0cc:e:2:5:6","remote_port":"50130","server_name":"2a00:a500:0:96::188"}
{"level":"info","ts":1687648998.1342003,"logger":"tls.obtain","msg":"acquiring lock","identifier":"2a00:a500:0:96::188"}
{"level":"info","ts":1687648998.135236,"logger":"tls.obtain","msg":"lock acquired","identifier":"2a00:a500:0:96::188"}
{"level":"info","ts":1687648998.1353586,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"2a00:a500:0:96::188"}
{"level":"error","ts":1687648998.135599,"logger":"tls.obtain","msg":"will retry","error":"[2a00:a500:0:96::188] Obtain: subject does not qualify for a public certificate: 2a00:a500:0:96::188","attempt":1,"retrying_in":60,"elapsed":0.00034584,"max_duration$
{"level":"info","ts":1687649058.1365676,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"2a00:a500:0:96::188"}
{"level":"error","ts":1687649058.1368542,"logger":"tls.obtain","msg":"will retry","error":"[2a00:a500:0:96::188] Obtain: subject does not qualify for a public certificate: 2a00:a500:0:96::188","attempt":2,"retrying_in":120,"elapsed":60.001601645,"max_dura$
{"level":"info","ts":1687649178.134662,"logger":"tls.obtain","msg":"releasing lock","identifier":"2a00:a500:0:96::188"}
{"level":"error","ts":1687649178.134821,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["2a00:a500:0:96::188"],"not_after":1687663191,"error":"context canceled"}
{"level":"info","ts":1687662103.291739,"logger":"tls.on_demand","msg":"obtaining new certificate","remote_ip":"94.103.96.130","remote_port":"35378","server_name":"94.103.96.188"}
{"level":"info","ts":1687662103.2955809,"logger":"tls.obtain","msg":"acquiring lock","identifier":"94.103.96.188"}
{"level":"info","ts":1687662103.2965617,"logger":"tls.obtain","msg":"lock acquired","identifier":"94.103.96.188"}
{"level":"info","ts":1687662103.2966921,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"94.103.96.188"}
{"level":"error","ts":1687662103.2970204,"logger":"tls.obtain","msg":"will retry","error":"[94.103.96.188] Obtain: subject does not qualify for a public certificate: 94.103.96.188","attempt":1,"retrying_in":60,"elapsed":0.000441277,"max_duration":2592000}
{"level":"info","ts":1687662163.2976422,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"94.103.96.188"}
{"level":"error","ts":1687662163.2980185,"logger":"tls.obtain","msg":"will retry","error":"[94.103.96.188] Obtain: subject does not qualify for a public certificate: 94.103.96.188","attempt":2,"retrying_in":120,"elapsed":60.001438909,"max_duration":259200$
{"level":"error","ts":1687662262.1381907,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"timed out waiting to obtain certificate for 94.103.96.188"}
{"level":"error","ts":1687662283.037333,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"timed out waiting to obtain certificate for 94.103.96.188"}
{"level":"info","ts":1687662283.2978654,"logger":"tls.obtain","msg":"releasing lock","identifier":"94.103.96.188"}
{"level":"error","ts":1687662283.2980218,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"context canceled"}
{"level":"error","ts":1687662283.2981737,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"no matching certificate to load for : open /var/lib/caddy/.local/share/caddy/certi$
{"level":"error","ts":1687662283.2981794,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"no matching certificate to load for : open /var/lib/caddy/.local/share/caddy/certi$
{"level":"error","ts":1687662283.2982175,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"no matching certificate to load for : open /var/lib/caddy/.local/share/caddy/certi$
{"level":"error","ts":1687662283.2982714,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"no matching certificate to load for : open /var/lib/caddy/.local/share/caddy/certi$
{"level":"error","ts":1687662283.2981806,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"no matching certificate to load for : open /var/lib/caddy/.local/share/caddy/certi$
{"level":"error","ts":1687662283.2984183,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"no matching certificate to load for : open /var/lib/caddy/.local/share/caddy/certi$
{"level":"info","ts":1687662322.1662922,"logger":"tls.on_demand","msg":"obtaining new certificate","remote_ip":"212.243.40.116","remote_port":"58667","server_name":"94.103.96.188"}
{"level":"info","ts":1687662322.166539,"logger":"tls.obtain","msg":"acquiring lock","identifier":"94.103.96.188"}
{"level":"info","ts":1687662322.167656,"logger":"tls.obtain","msg":"lock acquired","identifier":"94.103.96.188"}
{"level":"info","ts":1687662322.1677473,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"94.103.96.188"}
{"level":"error","ts":1687662322.16795,"logger":"tls.obtain","msg":"will retry","error":"[94.103.96.188] Obtain: subject does not qualify for a public certificate: 94.103.96.188","attempt":1,"retrying_in":60,"elapsed":0.000280393,"max_duration":2592000}
{"level":"info","ts":1687662382.169044,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"94.103.96.188"}
{"level":"error","ts":1687662382.1694114,"logger":"tls.obtain","msg":"will retry","error":"[94.103.96.188] Obtain: subject does not qualify for a public certificate: 94.103.96.188","attempt":2,"retrying_in":120,"elapsed":60.001741248,"max_duration":259200$
{"level":"error","ts":1687662463.0567381,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"timed out waiting to obtain certificate for 94.103.96.188"}
{"level":"error","ts":1687662492.8766448,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"timed out waiting to obtain certificate for 94.103.96.188"}
{"level":"error","ts":1687662502.1429038,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"timed out waiting to obtain certificate for 94.103.96.188"}
{"level":"info","ts":1687662502.1664727,"logger":"tls.obtain","msg":"releasing lock","identifier":"94.103.96.188"}
{"level":"error","ts":1687662502.1667027,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"context canceled"}
{"level":"error","ts":1687662502.1668742,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"no matching certificate to load for : open /var/lib/caddy/.local/share/caddy/certi$
{"level":"error","ts":1687662502.166896,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"no matching certificate to load for : open /var/lib/caddy/.local/share/caddy/certif$
{"level":"error","ts":1687662502.1669002,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"no matching certificate to load for : open /var/lib/caddy/.local/share/caddy/certi$
{"level":"error","ts":1687662502.1669648,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"no matching certificate to load for : open /var/lib/caddy/.local/share/caddy/certi$
{"level":"error","ts":1687662502.1670926,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"no matching certificate to load for : open /var/lib/caddy/.local/share/caddy/certi$
{"level":"error","ts":1687662502.1672964,"logger":"tls.on_demand","msg":"renewing certificate on-demand failed","subjects":["94.103.96.188"],"not_after":1687676503,"error":"no matching certificate to load for : open /var/lib/caddy/.local/share/caddy/certi$
{"level":"info","ts":1687662523.0864077,"logger":"tls.on_demand","msg":"obtaining new certificate","remote_ip":"94.103.96.130","remote_port":"38802","server_name":"94.103.96.188"}
{"level":"info","ts":1687662523.086662,"logger":"tls.obtain","msg":"acquiring lock","identifier":"94.103.96.188"}
{"level":"info","ts":1687662523.0877855,"logger":"tls.obtain","msg":"lock acquired","identifier":"94.103.96.188"}
{"level":"info","ts":1687662523.0879154,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"94.103.96.188"}
{"level":"error","ts":1687662523.0881758,"logger":"tls.obtain","msg":"will retry","error":"[94.103.96.188] Obtain: subject does not qualify for a public certificate: 94.103.96.188","attempt":1,"retrying_in":60,"elapsed":0.000364581,"max_duration":2592000}
{"level":"info","ts":1687662561.3478105,"logger":"tls.cache.maintenance","msg":"certificate expires soon; queuing for renewal","identifiers":["94.103.96.188"],"remaining":13941.652192865}
{"level":"info","ts":1687662561.347979,"logger":"tls.cache.maintenance","msg":"attempting certificate renewal","identifiers":["94.103.96.188"],"remaining":13941.652024121}
{"level":"info","ts":1687662561.3482196,"logger":"tls.renew","msg":"acquiring lock","identifier":"94.103.96.188"}

But after a few attempts it finally succeeds in renewing the cert (through the local issuer, I guess) and the situation is back to normal.

{"level":"info","ts":1687649361.3794806,"logger":"tls.renew","msg":"certificate renewed successfully","identifier":"2a00:a500:0:96::188"}
{"level":"info","ts":1687662883.5889027,"logger":"tls.renew","msg":"certificate renewed successfully","identifier":"94.103.96.188"}

The problem is that in the meantime, for a few minutes, SSL is broken for these endpoints. I noticed this because we monitor the endpoints from Zabbix, and every 12h we get alerts about SSL errors on them.
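To put a rough number on that window, the JSON log lines can be paired up per identifier. The sketch below is only an illustration in Python, not something Caddy provides: `outage_windows` is a hypothetical helper name, and it assumes that the first `will retry` error from `tls.obtain` and the later `certificate renewed successfully` line from `tls.renew` bracket the broken period, which is at best an upper bound. The two sample lines are copied from the logs above.

```python
import json
import re

# The "will retry" errors carry the subject in brackets at the start of the
# error string, e.g. "[94.103.96.188] Obtain: ...".
_SUBJECT = re.compile(r"^\[([^\]]+)\]")

# Two lines copied from the logs in this post.
SAMPLE_LOG = [
    '{"level":"error","ts":1687662103.2970204,"logger":"tls.obtain","msg":"will retry","error":"[94.103.96.188] Obtain: subject does not qualify for a public certificate: 94.103.96.188","attempt":1,"retrying_in":60,"elapsed":0.000441277,"max_duration":2592000}',
    '{"level":"info","ts":1687662883.5889027,"logger":"tls.renew","msg":"certificate renewed successfully","identifier":"94.103.96.188"}',
]


def outage_windows(lines):
    """Map identifier -> (first failure ts, renewal ts) for completed cycles."""
    first_failure = {}
    windows = {}
    for raw in lines:
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(entry, dict):
            continue
        logger, msg = entry.get("logger"), entry.get("msg")
        if logger == "tls.obtain" and msg == "will retry":
            match = _SUBJECT.match(entry.get("error", ""))
            if match:
                # Remember only the first failure of the current cycle.
                first_failure.setdefault(match.group(1), entry["ts"])
        elif logger == "tls.renew" and msg == "certificate renewed successfully":
            ident = entry.get("identifier")
            if ident in first_failure:
                windows[ident] = (first_failure.pop(ident), entry["ts"])
    return windows
```

Calling `outage_windows(SAMPLE_LOG)` pairs the two sample lines for 94.103.96.188; subtracting the two timestamps gives the length of that cycle in seconds. Lines cut off by the terminal (ending in `$`) are skipped rather than guessed at.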

Any idea what could be wrong here?

When we were running 2.6.4 with the same configs on these hosts, the issue was not present.

Kind regards

2. Full log output

Can’t post it. It exceeds the number of characters allowed in a post.

3. Caddy version:

v2.7.0-beta.2 h1:jaS1odoRuDR2W8igaKgVGvVjhTNt8xfoz3YPC4bcenA=

4. My complete Caddy config:

{
        admin 127.0.0.1:8888
        default_bind 127.0.0.1 [::1] 94.103.96.188 [2a00:a500:0:96::188]
        grace_period 3s
        log {
                output file /var/log/caddy/caddy.log {
                        roll_size 250MiB
                        roll_keep_for 15d
                }
                level ERROR
        }
        email no@notreally.com
        on_demand_tls {
                ask https://you.dont.want.to.know/caddy/dnslookup
                interval 2m
                burst 10000
        }
        servers {
                trusted_proxies cloudflare {
                        interval 12h
                        timeout 15s
                }
        }
}

# Common options we want to apply to every "virtualhosts"
(common) {
        @sc_server_fqdn {
                path /_sc_get_server_fqdn
        }
        respond @sc_server_fqdn "web23.swisscenter.com" 200 {
                close
        }
        reverse_proxy http://127.0.0.80:80
}

# Host related endpoints
http://web23.swisscenter.com, http://localhost, http://127.0.0.1, http://[::1], http://94.103.96.188, http://[2a00:a500:0:96::188] {
        redir https://web23.swisscenter.com{uri} 308
}
https://localhost, https://127.0.0.1, https://[::1], https://94.103.96.188, https://[2a00:a500:0:96::188] {
        tls internal
        redir https://web23.swisscenter.com{uri} 308
}
https://web23.swisscenter.com {
        import common
}

# LVE Manager endpoint
manager.web23.swisscenter.com {
        @manager_access {
                not remote_ip 192.168.50.0/24
        }
        route @manager_access {
                respond "We're sorry, but this resource is not available for you. If you feed this is an error, please contact your amazing server administrator." 403 {
                        close
                }
        }
        reverse_proxy http://127.0.0.1:9000
}


# Per virtualhost specific configs
# NOTE: This folder is currently EMPTY. It's only there if we need to add  specific tweaks/config for some specific customer
import /etc/caddy/customers/*.conf

# Default catchall endpoints
http:// {
        import common
}
https:// {
        import common
        tls {
                on_demand
                load /etc/caddy/certs
        }
}

Thanks, this seems like a bug.

Perhaps it could be trying LE and ZeroSSL, then falling back to the internal issuer finally?

Maybe the default public ACME CAs are being prepended to the list of issuers when they shouldn’t be…

Will look into this more when I’m back at my desk.

Hello Matt,

Yes, my guess is that it eventually uses the internal issuer, since after some minutes it successfully renews the certs.

Should I enable debug logging? Would it give you more info?

Maybe I should lower the internal certificate lifetime to 30 minutes or so, so I don’t have to leave Caddy running with debug logging and wait the default 12 hours for a renewal to trigger?

Kind regards

Couldn’t hurt, sure. That might be useful.

You could do that, or just delete the cert on disk and reload Caddy.

Thanks for helping see this through!

Well, I tried using your config to reproduce the behavior, but everything including initial issuance and renewal worked for me.

I’m having a little trouble following the broken and truncated logs above. If you could please post a full log (without redactions, ideally, since specifics matter), and find a minimal config to reproduce it, that would be helpful.

Hello Matt,

Okay. I will set up Caddy on a test server with a minimal config that is still enough to reproduce this behavior, then send you the exact config and full logs.

Kind regards

Hello Matt,

I replicated the setup on another server with the same Caddyfile and same caddy version 2.7.x, and I’m not able to reproduce it (yet).

I’m going crazy because there are no differences, except that it constantly happens on the two servers I upgraded to 2.7.x, and there are a lot more requests coming in on the prod servers.
There are around 6000 domains handled by Caddy with on-demand TLS on prod, while on my test server only 5 domains are handled by on-demand TLS.

Also, on the test server there was no existing Caddy data in its folder. Still, I grepped the data folder on the prod server for the IP addresses whose certificate renewals are attempted through external issuers, but found nothing (except in certificates/local, which is normal).

What is strange is that reverting to 2.6.4 on the affected servers resolves the problem.

I’m trying my best to trigger the issue on the test server, and will get back to you as soon as I am able to reproduce it.

Kind regards

OMG, I am now able to replicate it.

I think I’m onto something here. The prod servers are monitored by Zabbix, which regularly performs HTTPS checks by calling https://IP.ADDRESS of these servers.

My test server was not monitored by Zabbix. After I added it, Caddy started generating these suspicious log entries:

# grep public caddy.log 
{"level":"error","ts":1688121916.9126716,"logger":"tls.obtain","msg":"will retry","error":"[94.103.97.97] Obtain: subject does not qualify for a public certificate: 94.103.97.97","attempt":1,"retrying_in":60,"elapsed":0.000565006,"max_duration":2592000}

I suspect that Zabbix is not transmitting any SNI information with the request, whereas browsers and other tools like curl may automatically use the IP address as SNI…

So maybe Caddy is somehow not treating requests without SNI and requests with an IP address as SNI the same way?

Hello Matt,

Here is how to reproduce the issue.

First, the full config (notice I temporarily added a specific IP address, 1.2.3.4, so I can run the tests locally):

{
        admin 127.0.0.1:8888
        default_bind 127.0.0.1 [::1] 94.103.97.97 [2a00:a500:0:97:1487:3dff:fe03:26b1] 1.2.3.4
        grace_period 3s
        log {
                output file /var/log/caddy/caddy.log {
                        roll_size 250MiB
                        roll_keep_for 15d
                }
                level DEBUG
        }
        email bob@microsoft.com
        on_demand_tls {
                # Always return 200 (disabled checks for testing)
                ask http://127.0.0.80/
                interval 2m
                burst 10000
        }
        servers {
                protocols h1 h2 h3
                trusted_proxies cloudflare {
                        interval 12h
                        timeout 15s
                }
        }
}

# Common options we want to apply to every "virtualhosts"
(common) {
        # Return client IP
        @sc_client_ip {
                path /_sc_client_ip
        }
        respond @sc_client_ip "{client_ip}" 200 {
                close
        }
        # Forward to apache
        reverse_proxy http://127.0.0.80:80
}

# Host related endpoints
http://moo.zygounet.ch, http://localhost, http://127.0.0.1, http://[::1], http://94.103.97.97, http://[2a00:a500:0:97:1487:3dff:fe03:26b1], http://1.2.3.4 {
        redir https://moo.zygounet.ch{uri} 308
}
https://localhost, https://127.0.0.1, https://[::1], https://94.103.97.97, https://[2a00:a500:0:97:1487:3dff:fe03:26b1], https://1.2.3.4 {
        tls {
                issuer internal {
                        lifetime 10m
                }
        }
        tls internal
        redir https://moo.zygounet.ch{uri} 308
}
https://moo.zygounet.ch {
        import common
}

# Default catchall endpoints
http:// {
        import common
}
https:// {
        import common
        tls {
                on_demand
        }
}

Add this IP address to the server’s network interface so Caddy can bind to it (it won’t be routed, but that is not a problem for local tests):

ifconfig eth0:0 1.2.3.4 netmask 255.255.255.0

Then I start caddy

systemctl start caddy

Now, make an HTTPS connection without SNI to port 443 on IP 1.2.3.4:

openssl s_client -noservername -connect 1.2.3.4:443 2>/dev/null | openssl x509 -noout -dates
notBefore=Jun 30 13:50:31 2023 GMT
notAfter=Jun 30 14:00:31 2023 GMT

All fine here. The cert was obtained at Caddy startup, since it’s not on-demand, and it was issued by the internal issuer:

{"level":"info","ts":1688133031.7356513,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"1.2.3.4"}
{"level":"debug","ts":1688133031.735658,"logger":"tls.obtain","msg":"trying issuer 1/2","issuer":"local"}
...cut...
{"level":"info","ts":1688133031.7376113,"logger":"tls.obtain","msg":"certificate obtained successfully","identifier":"1.2.3.4"}
{"level":"debug","ts":1688133031.7377222,"logger":"events","msg":"event","name":"cert_obtained","id":"46607647-d1bf-469f-8685-60c1eff3b14c","origin":"tls","data":{"certificate_path":"certificates/local/1.2.3.4/1.2.3.4.crt","identifier":"1.2.3.4","issuer":"local","metadata_path":"certificates/local/1.2.3.4/1.2.3.4.json","private_key_path":"certificates/local/1.2.3.4/1.2.3.4.key","renewal":false,"storage_path":"certificates/local/1.2.3.4"}}

Now wait a few minutes for the cert renewal to trigger.

When the renewal is triggered, Caddy tries to obtain the cert from an external issuer, and while it’s trying and retrying, I can no longer connect to IP:443 over SSL:

 openssl s_client -noservername -connect 1.2.3.4:443                               
CONNECTED(00000003)
(it hangs here)
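The same check can be scripted so it can be repeated while the renewal is in progress. This is a minimal Python sketch using the standard `ssl` module; `handshake_ok` is a hypothetical helper name, not part of any tool used here. Passing no `server_hostname` makes the client omit SNI, similar to `-noservername` above, and the timeout turns the hang into a `False` result. Certificate verification is switched off on purpose, because the point is only whether the handshake completes.

```python
import socket
import ssl


def handshake_ok(host, port=443, sni=None, timeout=5.0):
    """Return True if a TLS handshake with host:port completes within timeout.

    With sni=None no server_name extension is sent, like
    `openssl s_client -noservername`.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Verification is disabled: this only tests whether the handshake completes.
    # check_hostname must be turned off before verify_mode can be CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=sni):
                return True
    except OSError:
        # Covers refused connections, timeouts (the hang) and TLS errors.
        return False
```

For example, `handshake_ok("1.2.3.4")` would be expected to return `False` while the server is stuck retrying, and `True` again once the certificate has been renewed.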

Note that at some point the log says “matched certificate in cache”, but at the same time tls.on_demand says the certificate was not found on disk… It seems to be mixing things up for the same identifier “1.2.3.4”:

{"level":"debug","ts":1688133644.9507334,"logger":"tls.handshake","msg":"matched certificate in cache","remote_ip":"1.2.3.4","remote_port":"39772","subjects":["1.2.3.4"],"managed":true,"expiration":1688133632,"hash":"1def480189abe8d29ca3220288942feab8ea936ba66a215893ff09e93884dff7"}
{"level":"debug","ts":1688133644.95091,"logger":"tls.on_demand","msg":"certificate not found on disk; obtaining new certificate","identifiers":["1.2.3.4"]}
{"level":"info","ts":1688133657.6530557,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"1.2.3.4"}
{"level":"debug","ts":1688133657.653147,"logger":"events","msg":"event","name":"cert_obtaining","id":"f435a67c-c565-4501-bae8-0179e89e560b","origin":"tls","data":{"identifier":"1.2.3.4"}}
{"level":"debug","ts":1688133657.6534843,"logger":"tls.obtain","msg":"trying issuer 1/2","issuer":"acme-v02.api.letsencrypt.org-directory"}
{"level":"debug","ts":1688133657.6535087,"logger":"tls.obtain","msg":"trying issuer 2/2","issuer":"acme.zerossl.com-v2-DV90"}
{"level":"debug","ts":1688133657.653529,"logger":"events","msg":"event","name":"cert_failed","id":"4b2f64a9-2453-480f-b385-e9d50d65b898","origin":"tls","data":{"error":{},"identifier":"1.2.3.4","issuers":["acme-v02.api.letsencrypt.org-directory","acme.zerossl.com-v2-DV90"],"renewal":false}}
{"level":"error","ts":1688133657.6535645,"logger":"tls.obtain","msg":"will retry","error":"[1.2.3.4] Obtain: subject does not qualify for a public certificate: 1.2.3.4","attempt":2,"retrying_in":120,"elapsed":60.001803913,"max_duration":2592000}
{"level":"debug","ts":1688133674.321572,"logger":"events","msg":"event","name":"tls_get_certificate","id":"1c5f4c6e-d661-497e-9f7d-f0585c48cf56","origin":"tls","data":{"client_hello":{"CipherSuites":[52393,52392,49195,49199,49196,49200,49161,49171,49162,49172,156,157,47,53,49170,10,4867,4865,4866],"ServerName":"www.area13.com","SupportedCurves":[29,23,24,25],"SupportedPoints":"AA==","SignatureSchemes":[2052,1027,2055,2053,2054,1025,1281,1537,1283,1539,513,515],"SupportedProtos":null,"SupportedVersions":[772,771,770,769],"Conn":{}}}}

Full log here: log - Pastebin.com
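To compare which issuers were involved at startup and at the failed attempt, the `events` entries in the debug log can be summarized. The following Python sketch is only an illustration: `issuers_by_event` is a hypothetical helper name, and it relies solely on the `cert_obtained` and `cert_failed` event fields visible in the log lines quoted above, two of which are reused as sample input.

```python
import json

# Two "events" lines copied from the debug logs in this post.
SAMPLE_EVENTS = [
    '{"level":"debug","ts":1688133031.7377222,"logger":"events","msg":"event","name":"cert_obtained","id":"46607647-d1bf-469f-8685-60c1eff3b14c","origin":"tls","data":{"certificate_path":"certificates/local/1.2.3.4/1.2.3.4.crt","identifier":"1.2.3.4","issuer":"local","metadata_path":"certificates/local/1.2.3.4/1.2.3.4.json","private_key_path":"certificates/local/1.2.3.4/1.2.3.4.key","renewal":false,"storage_path":"certificates/local/1.2.3.4"}}',
    '{"level":"debug","ts":1688133657.653529,"logger":"events","msg":"event","name":"cert_failed","id":"4b2f64a9-2453-480f-b385-e9d50d65b898","origin":"tls","data":{"error":{},"identifier":"1.2.3.4","issuers":["acme-v02.api.letsencrypt.org-directory","acme.zerossl.com-v2-DV90"],"renewal":false}}',
]


def issuers_by_event(lines):
    """Return (event name, identifier, issuers) for cert_obtained/cert_failed."""
    summary = []
    for raw in lines:
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(entry, dict) or entry.get("logger") != "events":
            continue
        name = entry.get("name")
        data = entry.get("data") or {}
        if name == "cert_obtained":
            # Successful issuance records a single issuer.
            summary.append((name, data.get("identifier"), [data.get("issuer")]))
        elif name == "cert_failed":
            # Failures list every issuer that was tried.
            summary.append((name, data.get("identifier"), list(data.get("issuers", []))))
    return summary
```

On the two sample lines this reports `local` for the startup issuance of 1.2.3.4 and the two public ACME issuers for the later failed attempt, which is the mismatch described in this post.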

Finally it succeeds in obtaining the cert through the local issuer, and it’s answering again.

openssl s_client -noservername -connect 1.2.3.4:443                               
CONNECTED(00000003)
depth=2 CN = Caddy Local Authority - 2023 ECC Root
verify return:1
depth=1 CN = Caddy Local Authority - ECC Intermediate
verify return:1
depth=0 
verify return:1
---
Certificate chain
 0 s:
   i:CN = Caddy Local Authority - ECC Intermediate
   a:PKEY: id-ecPublicKey, 256 (bit); sigalg: ecdsa-with-SHA256
   v:NotBefore: Jun 30 14:10:31 2023 GMT; NotAfter: Jun 30 14:20:31 2023 GMT
 1 s:CN = Caddy Local Authority - ECC Intermediate
   i:CN = Caddy Local Authority - 2023 ECC Root
   a:PKEY: id-ecPublicKey, 256 (bit); sigalg: ecdsa-with-SHA256
   v:NotBefore: Jun 30 09:22:24 2023 GMT; NotAfter: Jul  7 09:22:24 2023 GMT
---

Conclusion:

As long as the clients connecting to the IP pass an SNI (even if it’s the IP address used as SNI), everything is fine. The renewals are done through the internal issuer.
However, the mess begins when some client connects to the IP without passing SNI. Caddy returns the correct certificate and answers the request, but at the next renewal it attempts to renew the cert through external issuers…
Huh?

SNI will never contain an IP address, it’s not valid as per the RFC. If something does send an IP in SNI, it’s not spec compliant.
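As a small illustration of that rule (RFC 6066, section 3, does not permit literal IPv4 or IPv6 addresses in the server name), here is a Python sketch of the decision a compliant client makes. `sni_for` is a hypothetical helper name, and the hosts mentioned in the usage note are taken from the configs in this thread.

```python
import ipaddress


def sni_for(host):
    """Return the name a compliant client would send as SNI, or None.

    IP literals (optionally in URL-style brackets) yield None, because a
    literal address is not a valid server name.
    """
    bare = host.strip("[]")  # accept URL-style IPv6 literals such as [::1]
    try:
        ipaddress.ip_address(bare)
    except ValueError:
        return host  # not an IP literal, so it is sent as the server name
    return None  # IP literal: no SNI is sent
```

So `sni_for("moo.zygounet.ch")` returns the hostname, while `sni_for("94.103.97.97")` and `sni_for("[2a00:a500:0:97:1487:3dff:fe03:26b1]")` return `None`, meaning a connection to the bare IP reaches the server with no server name at all.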

Why are you using HTTPS without a domain, anyway? That’s asking for trouble tbh.

You might be able to use the default_sni option to set a default for clients that don’t support SNI.

But I suggest just using a domain, even if that domain resolves to a LAN IP. You can get a publicly trusted cert with the DNS challenge.

Or, do you really need HTTPS for LAN clients? HTTP is fine if you trust the software in your network to not sniff/tamper the traffic.

Hello Francis,

Yes, it makes perfect sense not to use IPs in SNI.

It’s not that I want to. We’re monitoring the servers with Zabbix, which has a net.tcp.service check for testing whether a service is answering.

By default it uses the IP address configured in the host definition, so I guess it is calling ip:443 without SNI.

That’s something I should try indeed.

We don’t really need it. Our Zabbix monitoring does this. Some random curious people could also try https://our.ip, for example people scanning for services on a given IP range.

My goal here is to understand why the behavior changed between 2.6.4 and the 2.7 beta with the same config, and why, in addition to using tls internal, Caddy is also trying to obtain certs through on_demand with LE/ZeroSSL, even though the config clearly specifies tls internal for the IP endpoints.

Kind regards.

Why not use port 80 instead for HTTP?

You don’t need to worry about that. There will always be bots hitting HTTPS. Those can be ignored.

There definitely were a lot of changes to certmagic that may have broken something. I don’t think you’re mistaken there. @matt will need to dig deeper on that.

We check both port 80 and 443. We historically always did (when we were only using Apache on these servers). Now we have just added Caddy in front of it as an SSL terminator because of its amazing automatic HTTPS and cert handling.

Yes, but now these bots are triggering the issue with Caddy 2.7.x, and it produces downtime.

Yeah, that’s why I’m trying to understand where is the “problem”.

After your suggestion, I tried to use default_sni instead of generating tls internal certificates for non-SNI traffic.
I didn’t have much success with it, so I started from a minimal config and added things to it step by step.

With this config, a non-SNI call to the IP addresses uses the cert for moo.zygounet.ch, which is the expected behavior.

{
        admin 127.0.0.1:8888
        default_bind 127.0.0.1 [::1] 94.103.97.97 [2a00:a500:0:97:1487:3dff:fe03:26b1]
        grace_period 3s
        log {
                output file /var/log/caddy/caddy.log {
                        roll_size 250MiB
                        roll_keep_for 15d
                }
                level DEBUG
        }
        email sr@madjix.ch
        default_sni moo.zygounet.ch
        on_demand_tls {
                ask http://127.0.0.80/
                interval 2m
                burst 10000
        }
}

# Specific entry for main host
moo.zygounet.ch {
        respond "Welcome to moo" {
                close
        }
}

# Catch all routes for the rest of the domains
http:// {
        respond "Catch all HTTP" {
                close
        }
}
https:// {
        tls {
                on_demand
        }
        respond "Catch all HTTPS" {
                close
        }
}

Next step, I’m only enabling on_demand tls for the catchall https route:

{
        admin 127.0.0.1:8888
        default_bind 127.0.0.1 [::1] 94.103.97.97 [2a00:a500:0:97:1487:3dff:fe03:26b1]
        grace_period 3s
        log {
                output file /var/log/caddy/caddy.log {
                        roll_size 250MiB
                        roll_keep_for 15d
                }
                level DEBUG
        }
        email sr@madjix.ch
        default_sni moo.zygounet.ch
        on_demand_tls {
                ask http://127.0.0.80/
                interval 2m
                burst 10000
        }
}

# Specific entry for main host
moo.zygounet.ch {
        respond "Welcome to moo" {
                close
        }
}

# Catch all routes for the rest of the domains
http:// {
        respond "Catch all HTTP" {
                close
        }
}
https:// {
        tls {
                on_demand
        }
        respond "Catch all HTTPS" {
                close
        }
}

And game over… It’s not using the default_sni anymore for non-SNI requests. Full log from startup until the first non-SNI request comes in:

{"level":"info","ts":1688241569.7915084,"logger":"admin","msg":"admin endpoint started","address":"127.0.0.1:8888","enforce_origin":false,"origins":["//localhost:8888","//[::1]:8888","//127.0.0.1:8888"]}
{"level":"info","ts":1688241569.7919698,"logger":"tls.cache.maintenance","msg":"started background certificate maintenance","cache":"0xc0006cae70"}
{"level":"info","ts":1688241569.792083,"logger":"http.auto_https","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv0"}
{"level":"warn","ts":1688241569.7921112,"logger":"http.auto_https","msg":"server is listening only on the HTTP port, so no automatic HTTPS will be applied to this server","server_name":"srv1","http_port":80}
{"level":"debug","ts":1688241569.7921755,"logger":"http.auto_https","msg":"adjusted config","tls":{"automation":{"policies":[{"subjects":["moo.zygounet.ch"]},{"on_demand":true}],"on_demand":{"rate_limit":{"interval":120000000000,"burst":10000},"ask":"http://127.0.0.80/"}}},"http":{"grace_period":3000000000,"servers":{"srv0":{"listen":["127.0.0.1:443","94.103.97.97:443","[2a00:a500:0:97:1487:3dff:fe03:26b1]:443","[::1]:443"],"routes":[{"handle":[{"handler":"subroute","routes":[{"handle":[{"body":"Welcome to moo","close":true,"handler":"static_response"}]}]}],"terminal":true},{"handle":[{"handler":"subroute","routes":[{"handle":[{"body":"Catch all HTTPS","close":true,"handler":"static_response"}]}]}],"terminal":true}],"tls_connection_policies":[{"default_sni":"moo.zygounet.ch"}],"automatic_https":{},"trusted_proxies":{"interval":43200000000000,"source":"cloudflare","timeout":15000000000}},"srv1":{"listen":["127.0.0.1:80","94.103.97.97:80","[2a00:a500:0:97:1487:3dff:fe03:26b1]:80","[::1]:80"],"routes":[{},{"handle":[{"body":"Catch all HTTP","close":true,"handler":"static_response"}]},{}],"automatic_https":{"disable":true},"trusted_proxies":{"interval":43200000000000,"source":"cloudflare","timeout":15000000000}}}}}
{"level":"info","ts":1688241570.053522,"logger":"tls","msg":"cleaning storage unit","description":"FileStorage:/var/lib/caddy/.local/share/caddy"}
{"level":"info","ts":1688241570.0534608,"logger":"http","msg":"enabling HTTP/3 listener","addr":"127.0.0.1:443"}
{"level":"debug","ts":1688241570.0541081,"logger":"http","msg":"starting server loop","address":"127.0.0.1:443","tls":true,"http3":true}
{"level":"info","ts":1688241570.054194,"logger":"http","msg":"enabling HTTP/3 listener","addr":"94.103.97.97:443"}
{"level":"debug","ts":1688241570.056041,"logger":"http","msg":"starting server loop","address":"94.103.97.97:443","tls":true,"http3":true}
{"level":"info","ts":1688241570.0562692,"logger":"http","msg":"enabling HTTP/3 listener","addr":"[2a00:a500:0:97:1487:3dff:fe03:26b1]:443"}
{"level":"debug","ts":1688241570.0568643,"logger":"http","msg":"starting server loop","address":"[2a00:a500:0:97:1487:3dff:fe03:26b1]:443","tls":true,"http3":true}
{"level":"info","ts":1688241570.0569558,"logger":"http","msg":"enabling HTTP/3 listener","addr":"[::1]:443"}
{"level":"debug","ts":1688241570.0570948,"logger":"http","msg":"starting server loop","address":"[::1]:443","tls":true,"http3":true}
{"level":"info","ts":1688241570.0571094,"logger":"http.log","msg":"server running","name":"srv0","protocols":["h1","h2","h3"]}
{"level":"debug","ts":1688241570.0571818,"logger":"http","msg":"starting server loop","address":"127.0.0.1:80","tls":false,"http3":false}
{"level":"debug","ts":1688241570.057231,"logger":"http","msg":"starting server loop","address":"94.103.97.97:80","tls":false,"http3":false}
{"level":"debug","ts":1688241570.0572927,"logger":"http","msg":"starting server loop","address":"[2a00:a500:0:97:1487:3dff:fe03:26b1]:80","tls":false,"http3":false}
{"level":"debug","ts":1688241570.0573504,"logger":"http","msg":"starting server loop","address":"[::1]:80","tls":false,"http3":false}
{"level":"info","ts":1688241570.0573616,"logger":"http.log","msg":"server running","name":"srv1","protocols":["h1","h2","h3"]}
{"level":"info","ts":1688241570.0573688,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["moo.zygounet.ch"]}
{"level":"debug","ts":1688241570.0578368,"logger":"tls","msg":"loading managed certificate","domain":"moo.zygounet.ch","expiration":1695889462,"issuer_key":"acme-v02.api.letsencrypt.org-directory","storage":"FileStorage:/var/lib/caddy/.local/share/caddy"}
{"level":"debug","ts":1688241570.0582128,"logger":"tls.cache","msg":"added certificate to cache","subjects":["moo.zygounet.ch"],"expiration":1695889462,"managed":true,"issuer_key":"acme-v02.api.letsencrypt.org-directory","hash":"d4f410fdb9b8756f193813f9fdd1dca272611929695c22d1fcdf0c09a9bcbfa4","cache_size":1,"cache_capacity":10000}
{"level":"debug","ts":1688241570.0582497,"logger":"events","msg":"event","name":"cached_managed_cert","id":"d80bbfa4-dd11-4294-b3b0-9f5548cf91c1","origin":"tls","data":{"sans":["moo.zygounet.ch"]}}
{"level":"info","ts":1688241570.0585215,"msg":"autosaved config (load with --resume flag)","file":"/var/lib/caddy/.config/caddy/autosave.json"}
{"level":"info","ts":1688241570.0586107,"msg":"serving initial configuration"}
{"level":"info","ts":1688241570.0590901,"logger":"tls","msg":"finished cleaning storage units"}
{"level":"debug","ts":1688241573.8954487,"logger":"events","msg":"event","name":"tls_get_certificate","id":"cdaef607-46a6-4987-9898-e88d3570bdeb","origin":"tls","data":{"client_hello":{"CipherSuites":[4865,4867,4866,49195,49199,52393,52392,49196,49200,49162,49161,49171,49172,156,157,47,53,10],"ServerName":"","SupportedCurves":[29,23,24,25,256,257],"SupportedPoints":"AA==","SignatureSchemes":[1027,1283,1539,2052,2053,2054,1025,1281,1537,515,513],"SupportedProtos":["h2","http/1.1"],"SupportedVersions":[772,771,770,769],"Conn":{}}}}
{"level":"debug","ts":1688241573.895683,"logger":"tls.handshake","msg":"no matching certificates and no custom selection logic","identifier":"94.103.97.97"}
{"level":"debug","ts":1688241573.8956966,"logger":"tls.handshake","msg":"choosing certificate","identifier":"moo.zygounet.ch","num_choices":1}
{"level":"debug","ts":1688241573.895714,"logger":"tls.handshake","msg":"default certificate selection results","identifier":"moo.zygounet.ch","subjects":["moo.zygounet.ch"],"managed":true,"issuer_key":"acme-v02.api.letsencrypt.org-directory","hash":"d4f410fdb9b8756f193813f9fdd1dca272611929695c22d1fcdf0c09a9bcbfa4"}
{"level":"debug","ts":1688241573.8972282,"logger":"tls","msg":"response from ask endpoint","domain":"94.103.97.97","url":"http://127.0.0.80/?domain=94.103.97.97","status":200}
{"level":"debug","ts":1688241573.8972573,"logger":"tls.handshake","msg":"all external certificate managers yielded no certificates and no errors","remote_ip":"194.230.141.19","remote_port":"58135","sni":""}
{"level":"debug","ts":1688241573.897479,"logger":"tls.handshake","msg":"did not load cert from storage","remote_ip":"194.230.141.19","remote_port":"58135","server_name":"","error":"no matching certificate to load for : open /var/lib/caddy/.local/share/caddy/certificates/acme.zerossl.com-v2-dv90/wildcard_/wildcard_.key: no such file or directory"}
{"level":"info","ts":1688241573.8975227,"logger":"tls.on_demand","msg":"obtaining new certificate","remote_ip":"194.230.141.19","remote_port":"58135","server_name":"94.103.97.97"}
{"level":"info","ts":1688241573.8979316,"logger":"tls.obtain","msg":"acquiring lock","identifier":"94.103.97.97"}
{"level":"info","ts":1688241573.8995721,"logger":"tls.obtain","msg":"lock acquired","identifier":"94.103.97.97"}
{"level":"info","ts":1688241573.8997474,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"94.103.97.97"}
{"level":"debug","ts":1688241573.8998492,"logger":"events","msg":"event","name":"cert_obtaining","id":"6c3cba0e-d6e3-4f1b-98b3-0f951b7831bd","origin":"tls","data":{"identifier":"94.103.97.97"}}
{"level":"debug","ts":1688241573.9002137,"logger":"tls.obtain","msg":"trying issuer 1/2","issuer":"acme-v02.api.letsencrypt.org-directory"}
{"level":"debug","ts":1688241573.9002397,"logger":"tls.obtain","msg":"trying issuer 2/2","issuer":"acme.zerossl.com-v2-DV90"}
{"level":"debug","ts":1688241573.9002612,"logger":"events","msg":"event","name":"cert_failed","id":"d2468c2b-b514-4cbd-9fd4-94b0e247a376","origin":"tls","data":{"error":{},"identifier":"94.103.97.97","issuers":["acme-v02.api.letsencrypt.org-directory","acme.zerossl.com-v2-DV90"],"renewal":false}}
{"level":"error","ts":1688241573.9003124,"logger":"tls.obtain","msg":"will retry","error":"[94.103.97.97] Obtain: subject does not qualify for a public certificate: 94.103.97.97","attempt":1,"retrying_in":60,"elapsed":0.00071949,"max_duration":2592000}
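
To pick these attempts out of a larger log file, here is a minimal Python sketch. It assumes the JSON-lines format shown above and uses only the "logger", "msg" and "server_name" fields that appear in these entries; the function name is hypothetical.

```python
import ipaddress
import json


def on_demand_ip_attempts(lines):
    """Yield the server_name of each tls.on_demand issuance attempt
    whose server_name is an IP address rather than a hostname."""
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip anything that is not a JSON log entry
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed entries
        if entry.get("logger") != "tls.on_demand":
            continue
        if entry.get("msg") != "obtaining new certificate":
            continue
        name = entry.get("server_name", "")
        try:
            ipaddress.ip_address(name)
        except ValueError:
            continue  # a normal hostname, not what we are looking for
        yield name
```

It could be run as `on_demand_ip_attempts(open("/var/log/caddy/caddy.log"))` to list every IP address for which on_demand issuance was attempted.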

Maybe I’m misunderstanding, but I would expect it to still use the default_sni for non-SNI requests and not trigger on_demand here?

It looks like the catch-all routes take precedence over default_sni for certificate management.
Maybe I’m doing it wrong and it is behaving as expected…

For this “issue” the result is the same with 2.6.4 or 2.7.0-beta.2

Kind regards

Since the same server is both HTTP and HTTPS, I don’t think there’s any value in checking both. Just do one, and the easiest is HTTP.

How could they trigger a problem? Caddy won’t have a certificate for their request so the handshake will simply fail and that’s all. I don’t understand what you mean. You definitely should reject IP addresses.
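
If that rejection is done in the on_demand_tls ask endpoint, a minimal Python sketch of the decision could look like this. It assumes the endpoint receives the name in the ?domain= query parameter (as the log lines in this thread show) and that any non-2xx status makes Caddy refuse issuance; the function names are hypothetical and the HTTP server around them is left out.

```python
import ipaddress


def is_ip_literal(name: str) -> bool:
    """Return True if name parses as an IPv4 or IPv6 address."""
    candidate = name.strip().strip("[]")  # tolerate bracketed IPv6 such as [::1]
    try:
        ipaddress.ip_address(candidate)
        return True
    except ValueError:
        return False


def ask_status(domain: str) -> int:
    """HTTP status a hypothetical ask endpoint returns for ?domain=<domain>."""
    if not domain or is_ip_literal(domain):
        return 403  # refuse: empty name or IP address
    return 200  # allow: looks like a hostname


if __name__ == "__main__":
    for name in ("moo.zygounet.ch", "94.103.97.97", "2a00:a500:0:96::188"):
        print(name, ask_status(name))
```

A real endpoint would normally also check the hostname against a list of known customer domains instead of allowing every non-IP name.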

I’ll let @matt follow up on the rest though.

We could of course remove the HTTPS check and use only HTTP to ensure Caddy is still alive and responding. But the fact that we were also checking HTTPS and started receiving alerts brought to light that something was wrong when we switched to 2.7.x. So it was useful after all :slight_smile:

Actually, on each non-SNI request, it tries to get a certificate for the IP address through the catch-all route, using on_demand.

That’s why, at first, I created specific blocks for IP addresses in my config and configured them to use tls internal. It almost worked, except that at some point it still tries to obtain certs for the IP addresses using on_demand and LE/ZeroSSL.

I was then quite confident that your suggestion of using default_sni would help, but it seems the HTTPS catch-all route takes over anyway.

What I’m trying to do here is quite basic though:

  • A main domain block, here it’s moo.zygounet.ch, the main server name. Using normal automatic tls (not on_demand)
  • Then for non-SNI, either use tls internal or default_sni
  • And finally a catchall for all other unspecified matches, which should then use on_demand for SSL certs

But unfortunately on_demand is still triggered for non-SNI requests, and I can’t figure out how to avoid this :slight_smile:
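
The three cases in the list above can be written down as a small decision function. This Python sketch only models the behaviour I expect; it is not Caddy’s actual certificate selection logic, and the function name and return labels are made up for illustration.

```python
def expected_cert_source(sni: str, site_names: set) -> str:
    """Return which certificate source I would expect for a handshake.

    sni        -- ServerName from the ClientHello ("" for non-SNI requests)
    site_names -- hostnames that have their own site block
    """
    if not sni:
        # Non-SNI request: default_sni certificate or a tls internal cert.
        return "default_sni_or_internal"
    if sni.lower() in site_names:
        # Named site block: normal automatic TLS, no on_demand involved.
        return "managed"
    # Everything else falls through to the https:// catch-all.
    return "on_demand"


if __name__ == "__main__":
    sites = {"moo.zygounet.ch"}
    for name in ("moo.zygounet.ch", "", "customer.example"):
        print(repr(name), "->", expected_cert_source(name, sites))
```

What the logs show instead is that the empty-SNI case ends up in the last branch.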

Thanks for your help and feedback though !

I will need some time to catch up on this thread – sorry for the wait. It’s a holiday weekend here and I have family visiting from out-of-state. I’ll try to get to this soon. :+1:

Hello francis,

As a side note:

As you said, and per RFC 6066:

Literal IPv4 and IPv6 addresses are not permitted in “HostName”.

But Caddy currently accepts an IP address as SNI and tries to get a certificate for it:

# openssl s_client -servername 1.2.3.4 -connect 94.103.97.97:443
CONNECTED(00000003)
{"level":"debug","ts":1688277366.4167995,"logger":"tls","msg":"response from ask endpoint","domain":"1.2.3.4","url":"http://127.0.0.80/?domain=1.2.3.4","status":200}
{"level":"info","ts":1688277366.4168372,"logger":"tls.on_demand","msg":"obtaining new certificate","remote_ip":"94.103.97.97","remote_port":"48176","server_name":"1.2.3.4"}
{"level":"info","ts":1688277366.4172957,"logger":"tls.obtain","msg":"acquiring lock","identifier":"1.2.3.4"}
{"level":"info","ts":1688277366.4196906,"logger":"tls.obtain","msg":"lock acquired","identifier":"1.2.3.4"}
{"level":"info","ts":1688277366.4199967,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"1.2.3.4"}
{"level":"debug","ts":1688277366.4201689,"logger":"events","msg":"event","name":"cert_obtaining","id":"128d7822-a2b8-4a5a-8103-a3f3987de9c6","origin":"tls","data":{"identifier":"1.2.3.4"}}
{"level":"debug","ts":1688277366.4207788,"logger":"tls.obtain","msg":"trying issuer 1/2","issuer":"acme-v02.api.letsencrypt.org-directory"}
{"level":"debug","ts":1688277366.420822,"logger":"tls.obtain","msg":"trying issuer 2/2","issuer":"acme.zerossl.com-v2-DV90"}
{"level":"debug","ts":1688277366.420853,"logger":"events","msg":"event","name":"cert_failed","id":"e6d590de-8b8f-4436-87a5-df26851bae8b","origin":"tls","data":{"error":{},"identifier":"1.2.3.4","issuers":["acme-v02.api.letsencrypt.org-directory","acme.zerossl.com-v2-DV90"],"renewal":false}}
{"level":"error","ts":1688277366.4209447,"logger":"tls.obtain","msg":"will retry","error":"[1.2.3.4] Obtain: subject does not qualify for a public certificate: 1.2.3.4","attempt":1,"retrying_in":60,"elapsed":0.001231527,"max_duration":2592000}

As this is not allowed, shouldn’t caddy in this case either:

  • Reject the request ?
  • Treat the request as if it was non-SNI ?

Kind regards.

Caddy is not getting a certificate because of an IP address in SNI. It’s getting one because of on-demand TLS configured for https:// and an https request to the IP address. i.e. it has nothing to do with SNI… maybe the lack of it, but not because of it.

It looks like it’s doing what is configured.

Hello Matt,

# openssl s_client -servername 1.2.3.4 -connect 94.103.97.97:443

Here I pass 1.2.3.4 (an IP address) as the SNI, and it does in fact try to get a certificate for it through on_demand. Since IP addresses are forbidden in SNI, shouldn’t it ignore 1.2.3.4 as the SNI and fall back to the real IP address?

Anyway that’s not the issue here. It was just something I noticed and wanted to share.

The issue is that, with the catch-all on_demand in the config, Caddy does the following:

  1. Ignore the default_sni setting, for example with this config:
{
        default_sni moo.zygounet.ch
}
moo.zygounet.ch {
        respond "Welcome to moo" {
                close
        }
}
http:// {
        respond "Catch all HTTP" {
                close
        }
}
https:// {
        tls {
                on_demand
        }
        respond "Catch all HTTPS" {
                close
        }
}

I don’t understand why: if default_sni is specified, I would expect it to be used.
With this config, I would expect the certificate for moo.zygounet.ch to be presented when calling https://IP.ADDRESS, or am I getting it wrong?

  2. Try to renew any internal certificates through on_demand with Let’s Encrypt/ZeroSSL, for example with this config:

94.103.97.97 {
        tls {
                issuer internal {
                        lifetime 15m
                }
        }
        respond "non-SNI HTTPS request" {
                close
        }
}
https:// {
        tls {
                on_demand
        }
        respond "Catch all HTTPS" {
                close
        }
}

Again, I don’t understand why non-SNI requests to 94.103.97.97 correctly use the internal TLS certificate, but when that certificate is about to expire, Caddy tries to renew it through LE/ZeroSSL.

Either way, I’m not able to express the following in my config:

A catch-all with on_demand LE/ZeroSSL for all HTTPS requests except non-SNI ones, which should be handled either by default_sni or by specific https://IP.ADDRESS blocks.

Is there a way to tell the https:// block to ignore non-SNI requests? Or to specifically exclude, for example, 94.103.97.97?

What I’m trying to do seems fairly basic: all HTTPS requests should use tls on_demand, except non-SNI requests, which should use either certs generated with tls internal or an existing cert through default_sni.

I found nothing in the docs on how to do this in a way that works correctly. I feel dumb at this point :slight_smile:

Thanks and kind regards.