How can I fall back to a different upstream pool when the first fails?

1. The problem I’m having:

I have two pools of nodes that I'm reverse proxying to: one in the US and one in the EU. If the EU pool fails, I'd like to fall back to the US pool.

I followed the handle_response instructions in the Caddy documentation for reverse_proxy (Caddyfile directive).

However, the handle_response block is never reached, regardless of what is in it. I can verify that handle_response isn't hit because I can't output {rp.status_code}, respond 200, etc. from within it.

The behavior is consistent if I remove the load balancing block. I have also attempted to use the precise error code, 503, rather than allowing for all error codes.

2. Error messages and/or full log output:

{"level":"info","ts":1715280177.0506995,"logger":"http.handlers.reverse_proxy.health_checker.active","msg":"HTTP request failed","host":"5.0.6.1:10157","error":"Get \"http://5.0.6.1:10157/status\": dial tcp 5.0.6.1:10157: connect: connection refused"}
{"level":"info","ts":1715280177.0507023,"logger":"http.handlers.reverse_proxy.health_checker.active","msg":"HTTP request failed","host":"5.0.6.1:10057","error":"Get \"http://5.0.6.1:10057/status\": dial tcp 5.0.6.1:10057: connect: connection refused"}
{"level":"info","ts":1715280177.0506918,"logger":"http.handlers.reverse_proxy.health_checker.active","msg":"HTTP request failed","host":"5.0.6.1:17157","error":"Get \"http://5.0.6.1:17157/status\": dial tcp 5.0.6.1:17157: connect: connection refused"}
{"level":"error","ts":1715280186.8973837,"logger":"http.log.error","msg":"no upstreams available","request":{"remote_ip":"162.158.238.189","remote_port":"20702","client_ip":"ip_here","proto":"HTTP/2.0","method":"GET","host":"shade-rpc.lavenderfive.com","uri":"/","headers":{"X-Forwarded-Proto":["https"],"Origin":["https://shadeprotocol.io"],"Cdn-Loop":["cloudflare"],"Cf-Ipcountry":["FI"],"X-Forwarded-For":["ip_here"],"Cf-Ray":["8813d54f9c188d7c-HEL"],"Cf-Connecting-Ip":["ip_here"],"Accept-Encoding":["gzip, br"],"Cf-Visitor":["{\"scheme\":\"https\"}"],"Accept":["text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"],"User-Agent":["Uptime-Kuma/1.23.13"]},"tls":{"resumed":false,"version":772,"cipher_suite":4865,"proto":"h2","server_name":"shade-rpc.lavenderfive.com"}},"duration":0.00014329,"status":503,"err_id":"emzqpkhpj","err_trace":"reverseproxy.(*Handler).proxyLoopIteration (reverseproxy.go:490)"}
{"level":"error","ts":1715280186.8974426,"logger":"http.log.access","msg":"handled request","request":{"remote_ip":"162.158.238.189","remote_port":"20702","client_ip":"ip_here","proto":"HTTP/2.0","method":"GET","host":"shade-rpc.lavenderfive.com","uri":"/","headers":{"Cdn-Loop":["cloudflare"],"Cf-Ipcountry":["FI"],"X-Forwarded-For":["ip_here"],"Cf-Ray":["8813d54f9c188d7c-HEL"],"X-Forwarded-Proto":["https"],"Accept-Encoding":["gzip, br"],"Cf-Visitor":["{\"scheme\":\"https\"}"],"Accept":["text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"],"User-Agent":["Uptime-Kuma/1.23.13"],"Cf-Connecting-Ip":["ip_here"]},"tls":{"resumed":false,"version":772,"cipher_suite":4865,"proto":"h2","server_name":"shade-rpc.lavenderfive.com"}},"bytes_read":0,"user_id":"","duration":0.00014329,"size":0,"status":503,"resp_headers":{"Server":["Caddy"],"Alt-Svc":["h3=\":443\"; ma=2592000"]}}

3. Caddy version:

v2.7.6 h1:w0NymbG2m9PcvKWsrXO6EEkY9Ru4FJK8uQbYcev1p3A=

4. How I installed and ran Caddy:

Installed via web.

a. System environment:

Ubuntu 22.04, systemd

b. Command:

sudo systemctl start caddy.service

c. Service/unit/compose file:

[Unit]
Description=Caddy
Documentation=https://caddyserver.com/docs/
After=network.target network-online.target
Requires=network-online.target

[Service]
Type=notify
User=caddy
Group=caddy
ExecStart=/usr/bin/caddy run --environ --config /etc/caddy/Caddyfile
ExecReload=/usr/bin/caddy reload --config /etc/caddy/Caddyfile --force
TimeoutStopSec=5s
Restart=always
RuntimeMaxSec=43200
LimitNOFILE=1048576
LimitNPROC=512
PrivateTmp=true
ProtectSystem=full
AmbientCapabilities=CAP_NET_ADMIN CAP_NET_BIND_SERVICE

[Install]
WantedBy=multi-user.target

d. My complete Caddy config:

secretnetwork-rpc.lavenderfive.com {
	reverse_proxy 5.0.6.1:17157 5.0.6.1:10057 5.0.6.1:10157 {
		lb_policy round_robin
		lb_retries 3
		health_uri /status?
		health_interval 30s
		health_timeout 5s
		health_status 2xx
		health_body catching_up"\s*:\s*false

		@error status 5xx
		handle_response @error {
			reverse_proxy 5.0.6.2:17157 5.0.6.2:10057 5.0.6.2:10157 {
				lb_policy round_robin
				lb_retries 3
				health_uri /status?
				health_interval 30s
				health_timeout 5s
				health_status 2xx
				health_body catching_up"\s*:\s*false
			}
		}
	}
}

5. Links to relevant resources:

I can use handle_errors to emulate the intended behavior, like the following:

shade-rpc.lavenderfive.com {
	reverse_proxy 5.0.6.1:17157 5.0.6.1:10057 5.0.6.1:10157
	handle_errors {
		@5xx `{err.status_code} >= 500 && {err.status_code} < 600`
		handle @5xx {
			reverse_proxy 5.0.6.2:17157 5.0.6.2:10057 5.0.6.2:10157
		}
	}
}

However, the handle_errors page suggests you should use handle_response with reverse_proxy.

handle_response only gets run if an actual response was received from the upstream. In other words, it only applies to successfully proxied requests (even if the status code is an error one like 4xx or 5xx).

If the request never made it upstream (i.e. connection error, or no upstreams are available), then yes, handle_errors is what you need to use.
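
To make the distinction concrete, here is a minimal, untested sketch that covers both cases in one site block. It reuses the hostname and upstream addresses from the question and assumes Caddy v2.7.x; treat it as an illustration of where each handler applies rather than a drop-in config. Load balancing and health check options are left out for brevity.

```
shade-rpc.lavenderfive.com {
	reverse_proxy 5.0.6.1:17157 5.0.6.1:10057 5.0.6.1:10157 {
		# Case 1: a primary node answered, but with a 5xx status.
		# handle_response only runs when a response was actually received.
		@error status 5xx
		handle_response @error {
			reverse_proxy 5.0.6.2:17157 5.0.6.2:10057 5.0.6.2:10157
		}
	}

	# Case 2: nothing came back from the primary pool at all
	# (e.g. connection refused, or "no upstreams available").
	# The proxy raises an error, so it is handled here instead.
	handle_errors {
		@5xx `{err.status_code} >= 500 && {err.status_code} < 600`
		handle @5xx {
			reverse_proxy 5.0.6.2:17157 5.0.6.2:10057 5.0.6.2:10157
		}
	}
}
```

In the logs from the question, every primary node refuses the connection and the request fails with "no upstreams available" and status 503, which is case 2. That is why the handle_response block was never reached.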


That makes sense. I ended up playing around with it more afterward, using nodes that were online but responding incorrectly, and the behavior worked as expected… which fits perfectly in line with what you’re saying.
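
For anyone who wants to reproduce that test locally, the following is a rough sketch, assuming a single Caddy instance and three free local ports. All ports and response bodies here are hypothetical placeholders and not part of the original setup.

```
# Stand-in for a primary node that is online but responding incorrectly.
:9001 {
	respond "primary is up but failing" 503
}

# Stand-in for the fallback pool.
:9002 {
	respond "served by fallback" 200
}

# Front-end proxy under test.
:8080 {
	reverse_proxy localhost:9001 {
		@error status 5xx
		handle_response @error {
			reverse_proxy localhost:9002
		}
	}
}
```

With this running, a request to localhost:8080 should be answered by the fallback, because the primary returned a real 503 response. Pointing the first reverse_proxy at a port where nothing is listening should instead skip handle_response entirely, since no response is ever received.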

Thanks @francislavoie! Appreciate the response.
