Weird protocol limitation with Caddy on gateway

1. Caddy version (caddy version):

The command caddy version returns “(devel)”; the actual FreeBSD pkg version is caddy-2.4.6_2.

2. How I run Caddy:

a. System environment:

FreeBSD 13.0

b. Command:

Caddy is launched by the pkg startup script.

service caddy start

which results in this command line:

/usr/local/bin/caddy run --pingback 127.0.0.1:13071 --config /usr/local/etc/caddy/Caddyfile --adapter caddyfile --pidfile /var/run/caddy/caddy.pid

c. Service/unit/compose file:

d. My complete Caddyfile or JSON config:

# The Caddyfile is an easy way to configure your Caddy web server.
#
# To use your own domain name (with automatic HTTPS), first make
# sure your domain's A/AAAA DNS records are properly pointed to
# this machine's public IP, then replace the line below with your
# domain name.
{
	http_port 80
	https_port 443
	servers {
		protocol {
			allow_h2c
			experimental_http3
		}
	}
}
# Unless the file starts with a global options block, the first
# uncommented line is always the address of your site.
#
boleskine.patpro.net {
	handle_path /images-tftp/* {
		root * /usr/tftpboot/images
		file_server * browse
		@blocked not remote_ip 192.168.0.0/24
		respond @blocked "Nope" 403
	}

	handle /inside/ {
		reverse_proxy 192.168.0.2:80
	}

	handle /munin/* {
		root * /usr/local/www
		file_server
	}

	handle_path /foo/* {
		root * /bar
		file_server * browse
	}

	handle {
		root * /usr/local/www/caddy
		respond "Hello, world!"
	}

	# Set up a reverse proxy:
	# reverse_proxy localhost:8080

	# Serve a PHP site through php-fpm:
	# php_fastcgi localhost:9000

	# Enable logging:
	log {
		output file /var/log/caddy/access.log
		# Caddy's structured log format:
		format json
		# Or, for Common Log Format:
		# format single_field common_log
	}
}

3. The problem I’m having:

The Caddy server is hosted on a FreeBSD internet gateway, the box sitting between my LAN and the Internet. The server is reachable from both the LAN and the Internet.
From the Internet, I can use Firefox and Chrome and get HTTP/3 connections.
From the LAN, I can only get HTTP/2 connections.

4. Error messages and/or full log output:

I’ve compiled a curl version supporting HTTP/3 and tested from my LAN.

http2 is OK:

$ /usr/local/opt/curl/bin/curl -svoI --http2 https://boleskine.patpro.net
*   Trying 78.196.217.20:443...
* Connected to boleskine.patpro.net (78.196.217.20) port 443 (#0)
* ALPN: offers h2
* ALPN: offers http/1.1
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
} [5 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [122 bytes data]
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
} [1 bytes data]
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
{ [15 bytes data]
* TLSv1.3 (IN), TLS handshake, Certificate (11):
{ [3836 bytes data]
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
{ [79 bytes data]
* TLSv1.3 (IN), TLS handshake, Finished (20):
{ [36 bytes data]
* TLSv1.3 (OUT), TLS handshake, Finished (20):
} [36 bytes data]
* SSL connection using TLSv1.3 / TLS_CHACHA20_POLY1305_SHA256
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=boleskine.patpro.net
*  start date: Apr  9 18:11:32 2022 GMT
*  expire date: Jul  8 18:11:31 2022 GMT
*  subjectAltName: host "boleskine.patpro.net" matched cert's "boleskine.patpro.net"
*  issuer: C=US; O=Let's Encrypt; CN=R3
*  SSL certificate verify ok.
* Using HTTP2, server supports multiplexing
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
} [5 bytes data]
* h2h3 [:method: GET]
* h2h3 [:path: /]
* h2h3 [:scheme: https]
* h2h3 [:authority: boleskine.patpro.net]
* h2h3 [user-agent: curl/7.83.0-DEV]
* h2h3 [accept: */*]
* Using Stream ID: 1 (easy handle 0x7faf92010c00)
} [5 bytes data]
> GET / HTTP/2
> Host: boleskine.patpro.net
> user-agent: curl/7.83.0-DEV
> accept: */*
> 
{ [5 bytes data]
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
{ [130 bytes data]
* Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
} [5 bytes data]
< HTTP/2 200 
< alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
< server: Caddy
< content-length: 13
< date: Sat, 16 Apr 2022 19:55:33 GMT
< 
{ [5 bytes data]
* Connection #0 to host boleskine.patpro.net left intact

http3 fails:

$ /usr/local/opt/curl/bin/curl -svoI --http3 https://boleskine.patpro.net
*   Trying 78.196.217.20:443...
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* Connect socket 5 over QUIC to 78.196.217.20:443
* Sent QUIC client Initial, ALPN: h3,h3-29,h3-28,h3-27
* Connection timeout after 300000 ms
* Closing connection 0

5. What I already tried:

I’ve run a large number of tests with tcpdump to pin down the issue. Caddy gets the client request on the WAN interface, which is expected because I request a hostname registered for the public IP of the gateway.
When the request is HTTP/2, the reply seems to be sent from the WAN interface and routed to the client on the LAN (192.168.0.2 is the client, 78.196.217.20 is the WAN IP of the gateway, 192.168.0.1 is the LAN IP of the gateway):

# tcpdump -ni em0 port 443
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em0, link-type EN10MB (Ethernet), capture size 262144 bytes
22:38:00.426618 IP 192.168.0.2.53931 > 78.196.217.20.443: Flags [S], seq 1273791029, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 2454617454 ecr 0,sackOK,eol], length 0
22:38:00.426657 IP 78.196.217.20.443 > 192.168.0.2.53931: Flags [S.], seq 3652463662, ack 1273791030, win 65535, options [mss 1460,nop,wscale 7,sackOK,TS val 2742369152 ecr 2454617454], length 0
22:38:00.426917 IP 192.168.0.2.53931 > 78.196.217.20.443: Flags [.], ack 1, win 2058, options [nop,nop,TS val 2454617454 ecr 2742369152], length 0
22:38:00.434429 IP 192.168.0.2.53931 > 78.196.217.20.443: Flags [P.], seq 1:518, ack 1, win 2058, options [nop,nop,TS val 2454617461 ecr 2742369152], length 517

When the request is HTTP/3, the reply is sent from the LAN interface (192.168.0.1), and the client has no idea what to do with it:

# tcpdump -ni em0 proto UDP and port 443
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em0, link-type EN10MB (Ethernet), capture size 262144 bytes
21:56:12.279789 IP 192.168.0.2.51837 > 78.196.217.20.443: UDP, length 1200
21:56:12.280409 IP 192.168.0.1.443 > 192.168.0.2.51837: UDP, length 153
21:56:13.278765 IP 192.168.0.2.51837 > 78.196.217.20.443: UDP, length 1200
21:56:13.278968 IP 192.168.0.1.443 > 192.168.0.2.51837: UDP, length 153
21:56:15.276805 IP 192.168.0.2.51837 > 78.196.217.20.443: UDP, length 1200
21:56:15.276827 IP 192.168.0.2.51837 > 78.196.217.20.443: UDP, length 1200
21:56:15.277039 IP 192.168.0.1.443 > 192.168.0.2.51837: UDP, length 153
21:56:15.277059 IP 192.168.0.1.443 > 192.168.0.2.51837: UDP, length 153

I think this is an anomaly on Caddy’s side: it should send the reply back through the interface on which the request was received.

So I added “bind 78.196.217.20” to the Caddyfile, and the behavior changed. HTTP/3 requests now produce this tcpdump output:

# tcpdump -ni em0 proto UDP and port 443
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em0, link-type EN10MB (Ethernet), capture size 262144 bytes
22:21:21.464163 IP 192.168.0.2.55013 > 78.196.217.20.443: UDP, length 1200
22:21:21.464384 IP 78.196.217.20.443 > 192.168.0.2.55013: UDP, length 141
22:21:21.465355 IP 192.168.0.2.55013 > 78.196.217.20.443: UDP, length 1200
22:21:21.466285 IP 78.196.217.20.443 > 192.168.0.2.55013: UDP, length 1252
22:21:21.466299 IP 78.196.217.20.443 > 192.168.0.2.55013: UDP, length 1252
22:21:21.466310 IP 78.196.217.20.443 > 192.168.0.2.55013: UDP, length 1252
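The change in source address after adding bind is consistent with how UDP sockets behave in general: a socket bound to the wildcard address has no fixed local IP, so the kernel picks the source address of each outgoing datagram, while a socket bound to one address always sends from that address. Here is a minimal Go sketch of that difference, using loopback only. The helper name localIPIsUnspecified is made up for this illustration, and this is not Caddy's code.

```go
package main

import (
	"fmt"
	"net"
)

// localIPIsUnspecified (hypothetical helper) opens a UDP socket on addr and
// reports whether the socket's local IP is the wildcard address 0.0.0.0.
func localIPIsUnspecified(addr string) (bool, error) {
	conn, err := net.ListenPacket("udp4", addr)
	if err != nil {
		return false, err
	}
	defer conn.Close()

	udpAddr, ok := conn.LocalAddr().(*net.UDPAddr)
	if !ok {
		return false, fmt.Errorf("unexpected address type %T", conn.LocalAddr())
	}
	return udpAddr.IP.IsUnspecified(), nil
}

func main() {
	// Port 0 asks the kernel for any free port, so no privileges are needed.
	for _, addr := range []string{"0.0.0.0:0", "127.0.0.1:0"} {
		wildcard, err := localIPIsUnspecified(addr)
		if err != nil {
			fmt.Println("listen failed:", err)
			continue
		}
		fmt.Printf("%-12s wildcard=%v\n", addr, wildcard)
	}
}
```

The wildcard case corresponds to the default listener, the second case to a listener restricted with bind.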

But the result on the client side is still not good:

$ /usr/local/opt/curl/bin/curl -svoI --http3 https://boleskine.patpro.net/
*   Trying 78.196.217.20:443...
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* Connect socket 5 over QUIC to 78.196.217.20:443
* Sent QUIC client Initial, ALPN: h3,h3-29,h3-28,h3-27
* SSL certificate problem: certificate has expired
* Failed to connect to boleskine.patpro.net port 443 after 32 ms: SSL peer certificate or SSH remote key was not OK
* Closing connection 0

Connecting with a browser from the LAN (Firefox for example) is still limited to http2 but shows no certificate problem at all.

Any help appreciated! Thanks


You can remove the http_port 80 and https_port 443 lines. They’re redundant.

Regarding allow_h2c: are you sure you need this? Only turn it on if you understand what it does and you actually need it. Nothing in your config suggests you need H2C.

Hmm, well Caddy isn’t doing anything special, it’s just listening on port 443 for UDP. This must just be something weird about your network’s handling of UDP packets.

Thank you Francis,
As I wrote, by default Caddy seems to reply to H1 and H2 requests properly (through the same network interface on which it received the request) but seems to reply to H3 requests through the wrong interface. The only workaround is to force Caddy to bind only to the external interface, which is not so great for my use case. I would add that UDP handling in general seems fine: for example, pure UDP traffic (as in “dig +notcp”) works well for requests from the LAN to the external interface of the gateway.
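One possible reason H1 and H2 are unaffected: a TCP connection accepted on a wildcard listener takes, as its local address, the address the client actually dialed, so replies always carry that source address. An unconnected UDP socket has no such per-connection state. The following Go sketch shows the TCP side on loopback only. The function name acceptedLocalIP is invented for this example, and it illustrates general socket behavior rather than anything specific to Caddy.

```go
package main

import (
	"fmt"
	"net"
)

// acceptedLocalIP (hypothetical helper) listens for TCP on the wildcard
// address, dials the listener via dialIP, and returns the local IP of the
// accepted server-side connection.
func acceptedLocalIP(dialIP string) (string, error) {
	ln, err := net.Listen("tcp4", "0.0.0.0:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	port := ln.Addr().(*net.TCPAddr).Port
	client, err := net.Dial("tcp4", net.JoinHostPort(dialIP, fmt.Sprint(port)))
	if err != nil {
		return "", err
	}
	defer client.Close()

	// The handshake is completed by the kernel, so Accept returns at once.
	server, err := ln.Accept()
	if err != nil {
		return "", err
	}
	defer server.Close()

	return server.LocalAddr().(*net.TCPAddr).IP.String(), nil
}

func main() {
	ip, err := acceptedLocalIP("127.0.0.1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Although the listener is bound to 0.0.0.0, the accepted connection
	// is pinned to the address the client dialed.
	fmt.Println("accepted connection local IP:", ip)
}
```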

Also, with this setup (bind to external interface), H1 and H2 requests are OK and the TLS certificate is handled properly, but as I wrote, H3 requests result in the client-side error “SSL certificate problem: certificate has expired”.

To make this traffic easy to pick out on a quite busy network, I changed the HTTPS port to 3284 and ran a new tcpdump capture on the client side.

Caddy not bound to external IP address, H2 request:

09:15:40.028730 IP 192.168.0.2.58267 > 78.196.217.20.3284: Flags [S], seq 4239347093, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 2577998868 ecr 0,sackOK,eol], length 0
09:15:40.029013 IP 78.196.217.20.3284 > 192.168.0.2.58267: Flags [S.], seq 327211749, ack 4239347094, win 65535, options [mss 1460,nop,wscale 7,sackOK,TS val 104116921 ecr 2577998868], length 0
09:15:40.029044 IP 192.168.0.2.58267 > 78.196.217.20.3284: Flags [.], ack 1, win 2058, options [nop,nop,TS val 2577998868 ecr 104116921], length 0
09:15:40.029376 IP 192.168.0.2.58267 > 78.196.217.20.3284: Flags [P.], seq 1:518, ack 1, win 2058, options [nop,nop,TS val 2577998868 ecr 104116921], length 517
09:15:40.030253 IP 78.196.217.20.3284 > 192.168.0.2.58267: Flags [.], seq 1:1449, ack 518, win 514, options [nop,nop,TS val 104116922 ecr 2577998868], length 1448
09:15:40.030387 IP 78.196.217.20.3284 > 192.168.0.2.58267: Flags [.], seq 1449:2897, ack 518, win 514, options [nop,nop,TS val 104116922 ecr 2577998868], length 1448
...

Caddy not bound to external IP address, H3 request:

09:15:51.903346 IP 192.168.0.2.63610 > 78.196.217.20.3284: UDP, length 1200
09:15:51.903876 IP 192.168.0.1.3284 > 192.168.0.2.63610: UDP, length 153
09:15:52.902422 IP 192.168.0.2.63610 > 78.196.217.20.3284: UDP, length 1200
09:15:52.902867 IP 192.168.0.1.3284 > 192.168.0.2.63610: UDP, length 153
09:15:54.900503 IP 192.168.0.2.63610 > 78.196.217.20.3284: UDP, length 1200
...

Caddy bound to external IP address, H2 request:

09:16:29.282950 IP 192.168.0.2.58284 > 78.196.217.20.3284: Flags [S], seq 362072075, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 2578047620 ecr 0,sackOK,eol], length 0
09:16:29.283231 IP 78.196.217.20.3284 > 192.168.0.2.58284: Flags [S.], seq 1083177143, ack 362072076, win 65535, options [mss 1460,nop,wscale 7,sackOK,TS val 2168864449 ecr 2578047620], length 0
09:16:29.283268 IP 192.168.0.2.58284 > 78.196.217.20.3284: Flags [.], ack 1, win 2058, options [nop,nop,TS val 2578047620 ecr 2168864449], length 0
09:16:29.283559 IP 192.168.0.2.58284 > 78.196.217.20.3284: Flags [P.], seq 1:518, ack 1, win 2058, options [nop,nop,TS val 2578047620 ecr 2168864449], length 517
09:16:29.284819 IP 78.196.217.20.3284 > 192.168.0.2.58284: Flags [.], seq 1:1449, ack 518, win 514, options [nop,nop,TS val 2168864450 ecr 2578047620], length 1448
...

Caddy bound to external IP address, H3 request:

09:16:38.710769 IP 192.168.0.2.56155 > 78.196.217.20.3284: UDP, length 1200
09:16:38.711251 IP 78.196.217.20.3284 > 192.168.0.2.56155: UDP, length 141
09:16:38.711398 IP 192.168.0.2.56155 > 78.196.217.20.3284: UDP, length 1200
09:16:38.713158 IP 78.196.217.20.3284 > 192.168.0.2.56155: UDP, length 1252
09:16:38.713292 IP 78.196.217.20.3284 > 192.168.0.2.56155: UDP, length 1252
...

But still, only on H3 connections, the client reports the TLS certificate as expired:

$ /usr/local/opt/curl/bin/curl -svI --http3 https://boleskine.patpro.net:3284
*   Trying 78.196.217.20:3284...
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* Connect socket 5 over QUIC to 78.196.217.20:3284
* Sent QUIC client Initial, ALPN: h3,h3-29,h3-28,h3-27
* SSL certificate problem: certificate has expired
* Failed to connect to boleskine.patpro.net port 3284 after 608 ms: SSL peer certificate or SSH remote key was not OK
* Closing connection 0

I don’t know enough about the QUIC internals to answer this. Maybe @marten-seemann would know if the different interface could affect QUIC/H3?

That sounds like a very old bug we had in quic-go: Incorrect source ip address in outgoing UDP datagrams · Issue #1736 · lucas-clemente/quic-go · GitHub. TL;DR: quic-go didn’t read the local address on which the incoming UDP datagram arrived, and let the kernel decide which interface to use to send out the response packet. This works fine in most cases, but can fail if there are two interfaces that feel responsible for sending out the response packet.

We solved this by reading the UDP control message: quic-go/sys_conn_oob.go at 6d4a694183971a16a168a3bab29de9b7b28900c5 · lucas-clemente/quic-go · GitHub. This requires the underlying net.PacketConn passed to quic-go to be a net.UDPConn, or at least to implement a bunch of additional methods that are defined on net.UDPConn. Specifically, quic.OOBCapablePacketConn is the interface that we assert: quic-go/sys_conn.go at 6d4a694183971a16a168a3bab29de9b7b28900c5 · lucas-clemente/quic-go · GitHub. Due to the way Go interface assertions work, if you wrap a net.UDPConn too many times, it might not be possible to interface-assert these methods, even if the underlying connection is a net.UDPConn.
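To illustrate the wrapping pitfall, here is a small standard-library-only sketch. The oobCapable interface is a local stand-in written for this example, an assumption modeled on the out-of-band methods of net.UDPConn and not a copy of quic-go's definition. The wrappedConn type is likewise hypothetical.

```go
package main

import (
	"fmt"
	"net"
)

// oobCapable is a local stand-in (an assumption, not quic-go's definition)
// for an interface that requires the out-of-band methods of *net.UDPConn.
type oobCapable interface {
	net.PacketConn
	ReadMsgUDP(b, oob []byte) (n, oobn, flags int, addr *net.UDPAddr, err error)
	WriteMsgUDP(b, oob []byte, addr *net.UDPAddr) (n, oobn int, err error)
}

// wrappedConn embeds the net.PacketConn interface, so only the methods of
// net.PacketConn are promoted. ReadMsgUDP and WriteMsgUDP are hidden.
type wrappedConn struct {
	net.PacketConn
}

// supportsOOB reports whether pc can be asserted to oobCapable.
func supportsOOB(pc net.PacketConn) bool {
	_, ok := pc.(oobCapable)
	return ok
}

// directSupportsOOB checks a bare *net.UDPConn. A nil pointer is enough,
// because the assertion inspects only the dynamic type.
func directSupportsOOB() bool {
	var udp *net.UDPConn
	return supportsOOB(udp)
}

// wrappedSupportsOOB checks the same connection hidden behind a wrapper.
func wrappedSupportsOOB() bool {
	var udp *net.UDPConn
	return supportsOOB(wrappedConn{PacketConn: udp})
}

func main() {
	fmt.Println("direct: ", directSupportsOOB())  // direct:  true
	fmt.Println("wrapped:", wrappedSupportsOOB()) // wrapped: false
}
```

The wrapper still holds a *net.UDPConn, yet the assertion fails, because only the wrapper's own method set is considered.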

@mholt, what kind of connection do you pass to quic-go? It would probably make sense to assert in Caddy that the connection actually implements quic.OOBCapablePacketConn.


@patpro can you please try v2.5.0-rc.1? We’ve made some changes to HTTP/3 since v2.4.6 which may or may not change how it behaves for you here.

@marten-seemann as of v2.5.0, we’re calling quic.ListenAddrEarly, which lets quic-go make the conns for us (looks like with net.ListenUDP), we don’t play with it otherwise AFAICT.

In v2.4.6 and older, we used net.ListenPacket("udp", ":443"). Is that the problem?


@francislavoie net.ListenUDP creates a net.UDPConn, so that should be fine, if you don’t wrap it too often. quic.ListenAddrEarly should work in any case.
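As a quick check on the net.ListenPacket question, the sketch below inspects the dynamic type of what the standard library returns for a "udp" network. It uses a loopback address and an ephemeral port instead of ":443", and the helper name listenPacketIsUDPConn is made up for the example. It says nothing about whether the connection is wrapped later, before being handed to quic-go.

```go
package main

import (
	"fmt"
	"net"
)

// listenPacketIsUDPConn (hypothetical helper) reports whether the value
// returned by net.ListenPacket for a UDP network is a *net.UDPConn.
func listenPacketIsUDPConn() (bool, error) {
	pc, err := net.ListenPacket("udp", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	defer pc.Close()

	// pc has static type net.PacketConn; the assertion checks what
	// concrete type is stored inside the interface value.
	_, ok := pc.(*net.UDPConn)
	return ok, nil
}

func main() {
	ok, err := listenPacketIsUDPConn()
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	fmt.Println("net.ListenPacket returned *net.UDPConn:", ok)
}
```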


I’ve just built v2.5.0-rc.1 with xcaddy (nice tool!) but I’ve got the same exact symptoms unfortunately.

# caddy version
v2.5.0-rc.1 h1:d/ivzqaW+ht8J4yD+XI9omgCDIbQCDOD5AzKPTwkwWk=
# tcpdump -ni em0 proto UDP and port 443
Password:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em0, link-type EN10MB (Ethernet), capture size 262144 bytes
23:32:12.211290 IP 192.168.0.2.50237 > 78.196.217.20.443: UDP, length 1200
23:32:12.211591 IP 192.168.0.1.443 > 192.168.0.2.50237: UDP, length 153
23:32:13.210295 IP 192.168.0.2.50237 > 78.196.217.20.443: UDP, length 1200
23:32:13.210531 IP 192.168.0.1.443 > 192.168.0.2.50237: UDP, length 153

Hello, any other idea?

Today I’ve tried something different. I’ve disabled kernel extensions accf_http.ko and accf_data.ko just to make sure they don’t get in the way and mess with UDP packets, but no luck. Behavior is exactly the same after rebooting without those modules loaded…

I’ve also changed my firewall settings (pf) so that it no longer sanitizes packets (disabling the scrub rule): no change. I’ve also set up packet sanitizing with different options: no change.

I’m kind of out of options here.
