Reverse proxy trying to make a TLS connection to an HTTP-only backend

1. Caddy version (caddy version):

v2.4.5

Built with:

./xcaddy build \
        --with github.com/caddyserver/caddy/v2=./caddy-fork \
        --with github.com/caddyserver/ntlm-transport \
        --with github.com/caddy-dns/cloudflare \
        --with github.com/caddyserver/replace-response

2. How I run Caddy:

a. System environment:

Ubuntu 18.04 LTS
Caddy running as systemd service (custom build using xcaddy, see above)
go version go1.16.4 linux/amd64

b. Command:

systemctl start caddy2

c. Service/unit/compose file:

[Unit]
Description=Caddy2
Documentation=https://caddyserver.com/docs/
After=network.target network-online.target
Requires=network-online.target

[Service]
Type=notify
User=caddy
Group=caddy
ExecStart=/opt/caddy2/caddy run --environ --config /opt/caddy2/Caddyfile
ExecReload=/opt/caddy2/caddy reload --config /opt/caddy2/Caddyfile
TimeoutStopSec=5s
LimitNOFILE=1048576
LimitNPROC=512
PrivateTmp=true
ProtectSystem=full
AmbientCapabilities=CAP_NET_BIND_SERVICE

[Install]
WantedBy=multi-user.target

d. My complete Caddyfile or JSON config:

{
        # enables debug logging
        debug

        # general
        admin 127.0.0.1:2019
        grace_period 5s
        order replace after encode

        # tls
        email myemail@auburnobriens.org

        storage file_system {
                root /opt/caddy2/data
        }

        log {
                output file /opt/caddy2/log/service.log {
                        roll_size 1gb
                        roll_keep 30
                        roll_keep_for 730d
                }
        }
}


https://service.auburnobriens.org:21370 {
        tls {
            issuer acme {
                email myemail@auburnobriens.org
                dns cloudflare somekey # will move to env var after testing
                preferred_chains {
                    root_common_name "ISRG Root X1"
                }
            }
        }

        log {
            format json
            output file /opt/caddy2/log/access.log {
                roll_size 1gb
                roll_keep 30
                roll_keep_for 730d
            }
        }

        reverse_proxy http://10.1.10.144:9002
}

3. The problem I’m having:

A request to this endpoint from a specific client (across the internet, using some backend software they have) results in the following network flow:

  1. Request comes in
  2. Proxy host opens a TCP connection to 10.1.10.144 on port 9002 (SYN, SYN/ACK, ACK)
  3. Proxy host sends a TLS Client Hello to 10.1.10.144 on port 9002
  4. 10.1.10.144 sends a TCP ACK back, but doesn’t respond to the TLS Client Hello (as it’s an HTTP-only endpoint)
  5. ~120 seconds later, Caddy decides to stop waiting and errors out (see #4)
  • Postman mocking up the same request to the reverse proxy works fine
  • wget / curl to the http://10.1.10.144:9002 address from the reverse proxy host both work fine
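
Steps 3 and 4 can be reproduced locally with a minimal Go sketch that involves no Caddy code at all. A plain TCP listener stands in for the HTTP-only backend and a TLS client stands in for the proxy. The helper name firstByteFromTLSClient is made up for illustration. The first byte the listener receives is 0x16, the TLS handshake record type, which is what a packet capture shows as the start of the Client Hello.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"io"
	"net"
	"time"
)

// firstByteFromTLSClient (hypothetical helper) starts a plain TCP listener that
// never speaks TLS, points a TLS client at it, and returns the first byte the
// listener receives.
func firstByteFromTLSClient() (byte, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer ln.Close()

	type result struct {
		b   byte
		err error
	}
	got := make(chan result, 1)

	// Stand-in for the HTTP-only backend: accept one connection, read one byte.
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			got <- result{err: err}
			return
		}
		defer conn.Close()
		conn.SetReadDeadline(time.Now().Add(2 * time.Second))
		buf := make([]byte, 1)
		_, err = io.ReadFull(conn, buf)
		got <- result{b: buf[0], err: err}
	}()

	// Stand-in for the proxy: attempt a TLS handshake against the plaintext
	// listener. The handshake is expected to fail, so its error is ignored.
	dialer := &net.Dialer{Timeout: 2 * time.Second}
	conn, err := tls.DialWithDialer(dialer, "tcp", ln.Addr().String(),
		&tls.Config{InsecureSkipVerify: true}) // acceptable only in a local test
	if err == nil {
		conn.Close()
	}

	select {
	case r := <-got:
		return r.b, r.err
	case <-time.After(3 * time.Second):
		return 0, fmt.Errorf("timed out waiting for the listener")
	}
}

func main() {
	b, err := firstByteFromTLSClient()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("first byte seen by the plaintext listener: 0x%02x\n", b)
}
```

In this sketch the listener closes the connection right after reading, so the client fails quickly. The backend in the flow above kept the connection open instead, which is consistent with the long wait before the error.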

I added a few fmt.Println calls in modules/caddyhttp/reverseproxy/httptransport.go, as follows:

func (h *HTTPTransport) RoundTrip(req *http.Request) (*http.Response, error) {
        h.SetScheme(req)

        fmt.Println("req") // Prints fine
        fmt.Println(req) // Prints fine
        fmt.Println("h") // Prints fine
        fmt.Println(h) // Prints fine

        // if H2C ("HTTP/2 over cleartext") is enabled and the upstream request is
        // HTTP/2 without TLS, use the alternate H2C-capable transport instead
        if req.ProtoMajor == 2 && req.URL.Scheme == "http" && h.h2cTransport != nil {
                fmt.Println("ohhhh")
                return h.h2cTransport.RoundTrip(req)
        }

        resp, err := h.Transport.RoundTrip(req)
        // We _never_ get beyond here - I don't know golang well, but I assume that 
        // it's spinning on something, and something upstream in code
        // times it out after ~120 seconds, hence the EOF error message.
        if err != nil {
                fmt.Println("err")
                fmt.Println(err)
        } else {
                fmt.Println("noerr")
                fmt.Println(resp)
        }

        return resp, err
}

4. Error messages and/or full log output:

Every time this occurs, the following two logs show up in debug logging:

{
    "level": "debug",
    "ts": 1634334685.012545,
    "logger": "http.handlers.reverse_proxy",
    "msg": "upstream roundtrip",
    "upstream": "10.1.10.144:9002",
    "duration": 127.217628844,
    "request": {
        "remote_addr": "69.18.71.143:7183",
        "proto": "HTTP/1.1",
        "method": "POST",
        "host": "service.auburnobriens.org:21370",
        "uri": "https://service.auburnobriens.org:21370",
        "headers": {
            "X-Forwarded-For": [
                "69.18.71.143"
            ],
            "User-Agent": [
                "gSOAP/2.7"
            ],
            "Content-Length": [
                "449103"
            ],
            "X-Forwarded-Proto": [
                "https"
            ],
            "Accept-Encoding": [
                "gzip, deflate"
            ],
            "Content-Type": [
                "application/soap+xml;charset=UTF-8;action=\"setPlan\""
            ]
        },
        "tls": {
            "resumed": false,
            "version": 771,
            "cipher_suite": 49195,
            "proto": "",
            "proto_mutual": true,
            "server_name": "service.auburnobriens.org"
        }
    },
    "error": "EOF"
}
{
    "level": "error",
    "ts": 1634334685.0126512,
    "logger": "http.log.error.log13",
    "msg": "EOF",
    "request": {
        "remote_addr": "69.18.71.143:7183",
        "proto": "HTTP/1.1",
        "method": "POST",
        "host": "service.auburnobriens.org:21370",
        "uri": "https://service.auburnobriens.org:21370",
        "headers": {
            "User-Agent": [
                "gSOAP/2.7"
            ],
            "Content-Length": [
                "449103"
            ],
            "Connection": [
                "close"
            ],
            "Accept-Encoding": [
                "gzip, deflate"
            ],
            "Content-Type": [
                "application/soap+xml;charset=UTF-8;action=\"setPlan\""
            ]
        },
        "tls": {
            "resumed": false,
            "version": 771,
            "cipher_suite": 49195,
            "proto": "",
            "proto_mutual": true,
            "server_name": "service.auburnobriens.org"
        }
    },
    "duration": 127.217750345,
    "status": 502,
    "err_id": "agknrjcyp",
    "err_trace": "reverseproxy.statusError (reverseproxy.go:857)"
}

5. What I already tried:

  • Using raw 10.1.10.144:9002 and http://10.1.10.144 as the reverse proxy “to”
  • Using the following in the reverse proxy config:
    • flush_interval -1
    • buffer_requests
    • buffer_responses
  • Tried the following in http transport config:
    • compression off
    • keepalive off
    • versions 1.1
    • read_buffer 0
    • write_buffer 0

6. Links to relevant resources:

Nothing useful I’ve found.

What does the actual request look like? I have a hunch, depending on what the incoming request looks like.

There’s a known issue where if the HTTP request line looks like this:

GET https://service.auburnobriens.org/foo

Then the scheme/host are set, and it overrides how reverse_proxy determines the scheme/host.
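
To illustrate why the request line matters, here is a minimal sketch using the request parser from Go's standard net/http package, which Caddy builds on. parseTarget is a hypothetical helper, not Caddy code. It shows that an absolute-form request line populates URL.Scheme and URL.Host, while the usual origin-form leaves both empty.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

// parseTarget (hypothetical helper) parses a raw HTTP/1.1 request and returns
// the scheme, host and path that end up on the parsed request URL.
func parseTarget(raw string) (scheme, host, path string, err error) {
	req, err := http.ReadRequest(bufio.NewReader(strings.NewReader(raw)))
	if err != nil {
		return "", "", "", err
	}
	return req.URL.Scheme, req.URL.Host, req.URL.Path, nil
}

func main() {
	requests := []string{
		// Origin-form: what most clients send.
		"GET /foo HTTP/1.1\r\nHost: service.auburnobriens.org\r\n\r\n",
		// Absolute-form: scheme and host are part of the request line.
		"GET https://service.auburnobriens.org/foo HTTP/1.1\r\nHost: service.auburnobriens.org\r\n\r\n",
	}
	for _, raw := range requests {
		scheme, host, path, err := parseTarget(raw)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("scheme=%q host=%q path=%q\n", scheme, host, path)
	}
}
```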

I have a PR to fix it, but we’re a bit concerned about the “correctness” and “safety” of the fix, so we haven’t merged it yet.

Edit: oh actually, the clue is in the logs:

"uri": "https://service.auburnobriens.org:21370",

That means that it is the issue I described above. The URL in the incoming request should not have the scheme and host in it. The client is kinda making a bad request, and Caddy is handling that kind of bad request in a non-ideal way.

So yeah, that PR above would fix it for you, but we haven’t figured out the exact correct way to do it. You could build Caddy with that branch/commit to test it out.
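
The general idea behind that kind of fix can be sketched as follows. This is not the code from the PR. It assumes a transport that only fills in the scheme when the request does not already carry one, and the names setScheme, sanitizeTarget and upstreamScheme are invented for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// setScheme (hypothetical) picks the upstream scheme only when the request
// does not already carry one.
func setScheme(req *http.Request, upstreamTLS bool) {
	if req.URL.Scheme != "" {
		return
	}
	if upstreamTLS {
		req.URL.Scheme = "https"
	} else {
		req.URL.Scheme = "http"
	}
}

// sanitizeTarget (hypothetical) drops the client-supplied scheme and host so
// that the proxy configuration alone decides how to reach the upstream.
func sanitizeTarget(req *http.Request) {
	req.URL.Scheme = ""
	req.URL.Host = ""
}

// upstreamScheme reports which scheme would be used for a plain-HTTP upstream,
// with or without sanitizing the incoming request target first.
func upstreamScheme(requestTarget string, sanitize bool) (string, error) {
	u, err := url.ParseRequestURI(requestTarget)
	if err != nil {
		return "", err
	}
	req := &http.Request{URL: u}
	if sanitize {
		sanitizeTarget(req)
	}
	setScheme(req, false) // the upstream in this thread is HTTP-only
	return req.URL.Scheme, nil
}

func main() {
	target := "https://service.auburnobriens.org:21370/"
	for _, sanitize := range []bool{false, true} {
		scheme, err := upstreamScheme(target, sanitize)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("sanitize=%v upstream scheme=%s\n", sanitize, scheme)
	}
}
```

Without sanitizing, the client-supplied https scheme survives and the proxy would attempt TLS against the HTTP-only upstream. With sanitizing, the scheme falls back to http.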

Ah, this is fantastic - thank you! I’ll test against the branch.

Confirmed - building on that branch resolved the problem. Thanks for the pointer to that issue, don’t know how I missed it!

I’ve updated the PR to adjust the fix to be localized to the reverseproxy package. Do you mind trying again with a build from the latest version of the branch? Thanks!

Yep, just built it and tested again - works great, and feels far less scary!
