This doesn’t make sense. health_uri is supposed to be a request path+query to request on the current upstream. You could change the health_port to use a different port on the same upstream’s IP, but not a totally different upstream.
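For context, a minimal sketch of how those two options combine (the address and values here are placeholders): health_uri sets the request path+query sent to each upstream, and health_port only overrides the port on that same upstream's address, not the host.

    reverse_proxy 10.0.0.1:8080 {
        health_uri /healthz?ready=1
        health_port 9090
    }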
Active health checks are meant to allow upstreams to self-report their health. If it can’t be reached and confirmed healthy, then it’s marked unhealthy.
You could write your own dynamic upstreams provider which can perform its own health checks.
Maybe we should add support for sending a POST + request body for health checking.
In this setup, the new port runs the logic I wanted to achieve with a different upstream. This isn’t ideal, since the health logic then has to be maintained across 200+ different upstreams that all need the same logic, but it works.
This is why being able to change the upstream would be helpful. Right now the same script is duplicated 200+ times to act as a health-check-proxy.
If you could change the upstream, that single script could be passed the upstream & port and do the same logic for all 200 duplications.
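To illustrate the idea, here is a rough sketch of what that single shared script could look like. Everything here (the names check_upstream and /check, the /healthz path on the upstreams) is illustrative, not from the thread; the point is that one process takes the upstream host and port as parameters instead of being duplicated 200+ times:

```python
"""Sketch of a single generic health-check proxy.

One process answers GET /check?host=H&port=P with 200 if the given
upstream looks healthy, 503 otherwise, so the same logic serves
every upstream instead of being copied per upstream.
"""
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs
from urllib.request import urlopen
from urllib.error import URLError


def check_upstream(host: str, port: int, path: str = "/healthz",
                   timeout: float = 3.0) -> bool:
    """Run the shared health logic against one upstream host:port."""
    try:
        with urlopen(f"http://{host}:{port}{path}", timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, or non-2xx: treat as unhealthy.
        return False


class CheckHandler(BaseHTTPRequestHandler):
    """GET /check?host=H&port=P -> 200 if that upstream is healthy, else 503."""

    def do_GET(self):
        qs = parse_qs(urlparse(self.path).query)
        host = qs.get("host", [""])[0]
        port = int(qs.get("port", ["0"])[0])
        self.send_response(200 if check_upstream(host, port) else 503)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the sketch quiet


# To run it as the one shared health-check proxy (22390 matches the
# port used later in the thread):
#   HTTPServer(("", 22390), CheckHandler).serve_forever()
```

Caddy's health_uri on each site block would then point at this one proxy with the right host/port query, rather than at a per-upstream copy of the script.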
Yeah, you’d implement your own active health checks in your dynamic upstreams plugins (only handing Caddy the upstreams that are healthy). Gives you full control over how you want to do health checking.
Holy moly, 200 upstreams???
Probably something we will need to do. PRs welcome if you need it sooner rather than later. Shouldn’t be too complicated.
Ahhh I didn’t realize that’s how it’d work. I’ll look deeper into it.
Yeah, trust me, it’s a doozy of a directory of Caddyfiles. Ansible to the rescue.
Just catching up on this thread; interesting use case. We could probably expand the capabilities of the active health checker. I’ll talk to Francis and others and see what’s the best way to go about this.
Turns out this solution of redirecting to a different port where a health checker sits is only partially effective.
grpc.website.com {
    reverse_proxy {
        transport http {
            versions h2c 2
            dial_timeout 3s
        }
        to IP:22390
        import lb-config 22317
    }
}
Because the connection is upgraded to h2c:// for gRPC, passing in a different port doesn’t work, as there’s no way to change the transport back to plain http:// for the health check.
Mild to moderate. Right now the health check uses the same transport as the reverse_proxy handler. So if it’s configured for h2c, we’d have to make a separate transport configurable for active health checks (which I originally considered, but it’s not really a great idea, because then your health check uses a different protocol entirely from what you’re actually proxying).
So, difficult? Not really… Good idea? Also not sure
Then from there the message is built to make requests. As part of the gRPC standard there should be a health endpoint to hit (the health checking protocol’s grpc.health.v1.Health/Check), but again, it requires building the request via the method outlined above.
If we circle back around to the caddy active health checks, in theory it would look like:
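A rough sketch of that (the health_uri value assumes the standard gRPC health service path; whether the active health checker can actually speak h2c to it is exactly the open question):

    grpc.website.com {
        reverse_proxy IP:22317 {
            transport http {
                versions h2c 2
            }
            health_uri /grpc.health.v1.Health/Check
            health_interval 10s
        }
    }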
But again that’d miss all the proto methods that have to be tied in.
To be clear, my question about gRPC health checks wasn’t intended to be snide; I’m just covering bases so I know whether I need to expend additional effort figuring out if Caddy supports gRPC health checks. It sounds like it doesn’t.
Circling back around, this is why I think being able to swap out the scheme may be the easiest method for supporting grpc health checks.
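Purely for illustration, a hypothetical syntax for that (this option doesn’t exist in the Caddy being discussed here; the name is made up to show the idea):

    reverse_proxy IP:22390 {
        transport http {
            versions h2c 2
        }
        # hypothetical: send active health checks over plain HTTP
        # to a different address, independent of the proxy transport
        health_upstream http://IP:22317
    }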