Caddy does not close connections when server is marked as Unhealthy

1. The problem I’m having:

I am trying to set up a primary/secondary failover system using Caddy, where the servers and clients connect using SignalR. Caddy polls the /status health_uri and checks the response body to determine whether each server is active.
The problem:
If I have the primary and secondary servers running and active, and then connect a client to the Caddy server, the connection is routed to the primary server, as it should be.
But if I now change the primary server to be inactive, the client stays connected to the inactive primary server through Caddy, even though Caddy has marked the primary server as unhealthy and all new connections get routed to the secondary server.

Is this how it is supposed to work? And is there a way to work around this, like some way to close connections to an unhealthy server?

2. Error messages and/or full log output:

This is what gets printed in the console when Caddy checks the health of the unhealthy primary server:

2023/09/06 14:15:38.753 INFO   http.handlers.reverse_proxy.health_checker.active       response body failed expectations    {"host": "localhost:55167"}

3. Caddy version:

2.7.4

4. How I installed and ran Caddy:

a. System environment:

I am running everything on Windows 10, and I downloaded caddy_2.7.4_windows_arm64.zip from the release on GitHub.

b. Command:

I used cd to get into the folder where I unzipped the archive, then ran

caddy run --config Caddyfile

c. Service/unit/compose file:

d. My complete Caddy config:

:2015

reverse_proxy * {
	to localhost:55167
	to localhost:5500

	lb_policy first
	lb_try_duration 5s
	lb_try_interval 250ms
	health_uri /status

	health_interval 10s
	health_timeout 2s
	health_status 200
	health_body "Active"
}

5. Links to relevant resources:

What’s your proof of that? I don’t understand.

I am using SignalR to constantly push updates from the server to connected clients, and I can very clearly see that the updates are coming from the inactive primary server.

Even if I swap out the secondary server with a mock server that only has the “/status” endpoint (so no client can actually connect to it), Caddy will still mark it as healthy and route traffic to it. When I then make the primary server “inactive”, the already-connected client never loses its connection, but every new client that tries to connect fails, because it gets routed to the mock server.

Ah, it looks like SignalR uses WebSocket connections.

Yes, connections are not severed when an upstream is marked unhealthy, and that’s working as intended. Being marked unhealthy only affects routing of new requests to an upstream; it doesn’t affect existing connections.
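
To make the distinction concrete, here is a toy model in Python of how a `first`-style policy behaves. It is only a sketch, not Caddy's actual code: the `Upstream` class and the `select_first` function are hypothetical names, and the addresses are the ones from the config above. The point is that the health flag is consulted only when a request is being routed, so a connection established earlier keeps its upstream.

```python
from dataclasses import dataclass


@dataclass
class Upstream:
    """Hypothetical stand-in for one backend listed with `to`."""
    address: str
    healthy: bool = True


def select_first(upstreams):
    """Return the first healthy upstream in listed order, or None."""
    for upstream in upstreams:
        if upstream.healthy:
            return upstream
    return None


primary = Upstream("localhost:55167")
secondary = Upstream("localhost:5500")
pool = [primary, secondary]

# A client connects while both upstreams are healthy.
existing_connection = select_first(pool)

# Later, the active health check fails for the primary.
primary.healthy = False

# Only requests routed from now on see the change.
new_connection = select_first(pool)

print(existing_connection.address)  # still the primary
print(new_connection.address)       # the secondary
```

Nothing in this model goes back and re-evaluates `existing_connection`, which mirrors the behavior described above for a long-lived WebSocket.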

The upstream server should close the connections if it can no longer handle them, forcing the clients to reconnect; each will then get routed to a different (healthy) upstream.
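
As a rough illustration of that idea, the sketch below shows one way an upstream could track its open connections and close them when it goes inactive. It is written in Python with plain sockets rather than SignalR, and `ConnectionRegistry` and its methods are made-up names for illustration, so treat it as a pattern rather than a drop-in solution.

```python
import socket
import threading


class ConnectionRegistry:
    """Tracks open client connections so they can be closed on demand."""

    def __init__(self):
        self._lock = threading.Lock()
        self._connections = set()
        self.active = True

    def register(self, conn):
        """Track a connection; refuse it if the server is already inactive."""
        with self._lock:
            if not self.active:
                conn.close()
                return False
            self._connections.add(conn)
            return True

    def unregister(self, conn):
        """Forget a connection that the client closed on its own."""
        with self._lock:
            self._connections.discard(conn)

    def deactivate(self):
        """Mark the server inactive and close every tracked connection.

        Returns the number of connections that were closed.
        """
        with self._lock:
            self.active = False
            to_close = list(self._connections)
            self._connections.clear()
        for conn in to_close:
            conn.close()
        return len(to_close)


if __name__ == "__main__":
    registry = ConnectionRegistry()
    # socketpair() stands in for an accepted connection and its client.
    server_side, client_side = socket.socketpair()
    registry.register(server_side)
    registry.deactivate()
    # An empty read tells the client that the peer closed the connection.
    print(client_side.recv(1))
    client_side.close()
```

In a SignalR server the equivalent would be to close the tracked connections at the moment /status stops reporting “Active”, so that clients which reconnect go through Caddy again and land on the healthy upstream.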
