TLS handshake errors out of nowhere

So I’ve had Caddy working well for the last few months with zero issues. Out of nowhere today I started getting TLS handshake issues and certificate timeouts.

After 1.0 Upgrade:

2019/04/25 18:31:30 [INFO] Certificate for [ftp.alexsguardian.net] expires in 644h43m46.60933353s; attempting renewal
2019/04/25 18:31:30 [INFO] [ftp.alexsguardian.net] acme: Trying renewal with 644 hours remaining
2019/04/25 18:31:30 [INFO] [ftp.alexsguardian.net] acme: Obtaining bundled SAN certificate
2019/04/25 18:31:31 [INFO] [ftp.alexsguardian.net] AuthURL: https://acme-v02.api.letsencrypt.org/acme/authz/awXE4dadEJyPyHCcJlv35FQ2lNdLtU8zW-p9orium3I
2019/04/25 18:31:31 [INFO] [ftp.alexsguardian.net] acme: use tls-alpn-01 solver
2019/04/25 18:31:31 [INFO] [ftp.alexsguardian.net] acme: Trying to solve TLS-ALPN-01
2019/04/25 18:31:32 [INFO] Unable to deactivated authorizations: https://acme-v02.api.letsencrypt.org/acme/authz/awXE4dadEJyPyHCcJlv35FQ2lNdLtU8zW-p9orium3I
2019/04/25 18:31:32 [ERROR] Renewing [ftp.alexsguardian.net]: acme: Error -> One or more domains had a problem:
[ftp.alexsguardian.net] acme: error: 403 :: urn:ietf:params:acme:error:unauthorized :: Cannot negotiate ALPN protocol "acme-tls/1" for tls-alpn-01 challenge, url:
; trying again in 10s

Before 1.0 Upgrade:

2019/04/25 18:12:55 [INFO] Certificate for [ftp.alexsguardian.net] expires in 645h2m21.92910889s; attempting renewal
2019/04/25 18:12:55 [INFO] [ftp.alexsguardian.net] acme: Trying renewal with 645 hours remaining
2019/04/25 18:12:55 [INFO] [ftp.alexsguardian.net] acme: Obtaining bundled SAN certificate
2019/04/25 18:12:55 [INFO] [ftp.alexsguardian.net] AuthURL: https://acme-v02.api.letsencrypt.org/acme/authz/hKF1DaZ4guthsvbl9Mi3jD1i7320RVbXB4GzvHbawrU
2019/04/25 18:12:55 [INFO] [ftp.alexsguardian.net] acme: use tls-alpn-01 solver
2019/04/25 18:12:55 [INFO] [ftp.alexsguardian.net] acme: Trying to solve TLS-ALPN-01
2019/04/25 18:12:56 http: TLS handshake error from 162.158.79.253:50076: remote error: tls: illegal parameter
2019/04/25 18:12:56 http: TLS handshake error from 172.69.62.210:54590: remote error: tls: illegal parameter
2019/04/25 18:12:57 [INFO] Unable to deactivated authorizations: https://acme-v02.api.letsencrypt.org/acme/authz/hKF1DaZ4guthsvbl9Mi3jD1i7320RVbXB4GzvHbawrU
2019/04/25 18:12:57 [ERROR] Renewing [ftp.alexsguardian.net]: acme: Error -> One or more domains had a problem:
[ftp.alexsguardian.net] acme: error: 403 :: urn:ietf:params:acme:error:unauthorized :: Cannot negotiate ALPN protocol "acme-tls/1" for tls-alpn-01 challenge, url:
; trying again in 10s

My current setup (which hasn’t changed since I originally set up Caddy) is:

Cloudflare (Full (strict)) > Router (443) <443-444> Caddy in Docker.

I updated to Caddy 1.0 to see if it would fix it, and now I’ve apparently hit the Let’s Encrypt rate limit and am still seeing the unauthorized error. I have 8 subdomains plus the root domain.

This hit me out of nowhere and I’m not sure what happened.

I was able to recover my root domain and get Caddy booted and responding to web requests again, though all of my subdomains are still down due to the rate limit.

Does the ALPN challenge not allow me to have Caddy behind orange-cloud Cloudflare? If so, that’s quite annoying, since the DNS challenge constantly fails because it doesn’t wait long enough for DNS to update.

Correct.

TLS-ALPN challenge is solved during TLS negotiation with Caddy.

If you have Cloudflare MITMing your HTTPS (i.e. orange-cloud), Let’s Encrypt can’t negotiate TLS with Caddy directly. It has to negotiate with Cloudflare, which then talks to your server on the client’s behalf.

You might try running Caddy with the -disable-tls-alpn-challenge flag, forcing Caddy (specifically, Caddy’s ACME library, go-acme/lego) to use the HTTP challenge instead. This should work despite the Cloudflare MITM.
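
If you want to confirm that Cloudflare really is the one terminating TLS in front of Caddy, one rough check is whether the peers in your handshake errors are Cloudflare edge addresses. Below is a minimal Python sketch of that idea. The CIDR list is an assumption: it is a snapshot of Cloudflare's published IPv4 ranges and may be out of date, so compare it against the current list before relying on it. The function name is just illustrative.

```python
import ipaddress

# ASSUMPTION: snapshot of Cloudflare's published IPv4 ranges; may be stale.
CLOUDFLARE_V4 = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "173.245.48.0/20", "103.21.244.0/22", "103.22.200.0/22",
        "103.31.4.0/22", "141.101.64.0/18", "108.162.192.0/18",
        "190.93.240.0/20", "188.114.96.0/20", "197.234.240.0/22",
        "198.41.128.0/17", "162.158.0.0/15", "104.16.0.0/13",
        "104.24.0.0/14", "172.64.0.0/13", "131.0.72.0/22",
    )
]


def is_cloudflare(ip: str) -> bool:
    """Return True if ip falls inside one of the ranges listed above."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in CLOUDFLARE_V4)


if __name__ == "__main__":
    # Peer addresses copied from the log excerpts in this thread.
    for peer in ("162.158.79.253", "172.69.62.210", "45.35.198.218"):
        label = "Cloudflare" if is_cloudflare(peer) else "not in the list"
        print(peer, label)
```

With this list, the two peers from the "illegal parameter" handshake errors fall inside Cloudflare ranges, which would be consistent with the proxy sitting between Let’s Encrypt and Caddy.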


Thanks.

Also any idea why I am being flooded with these in stdout?


2019/04/25 22:07:12 http: TLS handshake error from ip:38320: tls: no cipher suite supported by both client and server

First time I’ve seen this.

At a guess - some script kiddie is running vuln scans against your web server for less-secure protocols. Might be trying to fingerprint Caddy. Probably not targeted - likely just random scans. I get bursts of these on some of my servers every now and again, from large groups of IPs so it’s difficult to pin them and rate limit them. It’s not too concerning - happens to everyone with a public IP address at some point or another.

Figured as much. Just wanted to double check.

One more question. Any idea what’s going on here? I started seeing this after updating to 1.0 as well.


2019/04/26 02:29:35 [ERROR] failed to copy buffer:  write tcp 172.22.0.2:443->45.35.198.218:13780: write: broken pipe
2019/04/26 02:29:39 [ERROR] failed to copy buffer:  write tcp 172.22.0.2:443->45.35.198.210:49078: write: broken pipe
2019/04/26 02:29:41 [ERROR] failed to copy buffer:  write tcp 172.22.0.2:443->45.35.198.218:13770: write: broken pipe
2019/04/26 02:29:43 [ERROR] failed to copy buffer:  write tcp 172.22.0.2:443->45.35.198.218:13020: write: broken pipe
2019/04/26 02:29:43 [ERROR] failed to copy buffer:  write tcp 172.22.0.2:443->45.35.205.82:65348: write: broken pipe
2019/04/26 02:29:44 [ERROR] failed to copy buffer:  write tcp 172.22.0.2:443->45.35.198.206:46530: write: broken pipe
2019/04/26 02:29:44 [ERROR] failed to copy buffer:  write tcp 172.22.0.2:443->45.35.198.206:46532: write: broken pipe
2019/04/26 02:29:46 [ERROR] failed to copy buffer:  read tcp 172.22.0.2:443->162.158.79.145:65362: use of closed network connection
2019/04/26 02:29:46 [ERROR] failed to copy buffer:  write tcp 172.22.0.2:443->172.107.22.206:55314: write: broken pipe
2019/04/26 02:29:47 [ERROR] failed to copy buffer:  write tcp 172.22.0.2:443->45.35.205.58:62412: write: broken pipe
2019/04/26 02:29:51 [ERROR] failed to copy buffer:  write tcp 172.22.0.2:443->45.35.205.58:62090: write: broken pipe
2019/04/26 02:29:57 [ERROR] failed to copy buffer:  write tcp 172.22.0.2:443->45.35.205.34:56128: write: broken pipe
2019/04/26 02:30:07 [ERROR] failed to copy buffer:  read tcp 172.22.0.2:443->162.158.79.145:39518: use of closed network connection
2019/04/26 02:30:28 [ERROR] failed to copy buffer:  read tcp 172.22.0.2:443->162.158.79.145:10988: use of closed network connection
2019/04/26 02:30:50 [ERROR] failed to copy buffer:  read tcp 172.22.0.2:443->162.158.79.145:38678: use of closed network connection
2019/04/26 02:31:11 [ERROR] failed to copy buffer:  read tcp 172.22.0.2:443->162.158.79.145:11366: use of closed network connection
2019/04/26 02:31:32 [ERROR] failed to copy buffer:  read tcp 172.22.0.2:443->162.158.79.145:38692: use of closed network connection
2019/04/26 02:31:53 [ERROR] failed to copy buffer:  read tcp 172.22.0.2:443->162.158.79.145:10110: use of closed network connection
2019/04/26 02:32:00 http2: received GOAWAY [FrameHeader GOAWAY len=20], starting graceful shutdown
2019/04/26 02:32:18 http: TLS handshake error from 172.56.29.61:38457: EOF
2019/04/26 02:32:24 http: TLS handshake error from 172.56.29.61:55283: EOF
2019/04/26 02:32:35 http: TLS handshake error from 45.35.198.194:39730: EOF
2019/04/26 02:32:36 http: TLS handshake error from 45.35.198.222:37504: EOF
2019/04/26 02:32:39 [ERROR] failed to copy buffer:  write tcp 172.22.0.2:443->45.35.198.194:39728: write: broken pipe
2019/04/26 02:32:40 http: TLS handshake error from 45.35.198.210:56120: EOF

172.22.0.2 is the IP of the Caddy container. It’s built from Abiosoft’s container, and the only thing different is the plugins arg.

All of those errors seem plausibly attributable to problems on the client end with keeping the connection open.

Which version were you on prior to the 1.0 upgrade?

I was on 0.11.5.

So it looks like the Comcast node for my region is having massive latency issues in their t3 routing. That’s probably contributing to the timeouts etc. in the Caddy log, as clients can’t keep the connection open.


2019/04/25 22:07:12 http: TLS handshake error from ip:38320: tls: no cipher suite supported by both client and server

It is also possible that you are receiving requests from older clients. In 0.11.5, Caddy removed several cipher suites from the default list that are used by older clients, such as Safari on older iPads. These clients can no longer connect to your website and will get an error that makes it look like your site is down.

There is some discussion here https://github.com/mholt/caddy/issues/2512

Keep an eye on your logs. If you believe it is affecting a large number of your connections, add the cipher suites back in through the tls directive in your Caddyfile.

https://caddyserver.com/docs/tls
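
For the "keep an eye on your logs" part, here is a small Python sketch that tallies handshake errors by reason, so you can see how often the "no cipher suite" case shows up compared to everything else. It assumes the log line shape shown in this thread (`http: TLS handshake error from <peer>:<port>: <reason>`). The function name and the way you feed it lines are illustrative, not anything Caddy provides.

```python
import re
import sys
from collections import Counter

# Matches lines shaped like the ones quoted in this thread, e.g.
#   2019/04/26 02:32:18 http: TLS handshake error from 172.56.29.61:38457: EOF
HANDSHAKE_RE = re.compile(
    r"http: TLS handshake error from (?P<peer>\S+?):(?P<port>\d+): (?P<reason>.+)$"
)


def tally_handshake_errors(lines):
    """Count TLS handshake errors per reason string; other lines are ignored."""
    reasons = Counter()
    for line in lines:
        match = HANDSHAKE_RE.search(line.strip())
        if match:
            reasons[match.group("reason")] += 1
    return reasons


if __name__ == "__main__":
    # Reads log lines from stdin and prints the most common reasons first.
    for reason, count in tally_handshake_errors(sys.stdin).most_common():
        print(f"{count:6d}  {reason}")
```

Since Caddy here logs to the container's output, you would pipe that output into the script. If the "no cipher suite supported by both client and server" count stays small relative to your real traffic, it is probably scanners or a handful of old clients rather than something worth changing the cipher list for.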

