As you can see, I’m just trying to set up a reverse proxy for 2 different nodes with the same service installed. I would like Caddy to route users to the first available node, so I can stop the second one and install a new version without interrupting the service for users. The problem, after testing some config options, is that Caddy doesn’t immediately realize that a node is down, and tries to serve with the stopped service, so I get a 502. How can I achieve this simple use case?
I think you want to play with the `lb_try_duration` option and passive health checks to see if that does what you need.
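Something like this, as a minimal sketch (the `node1`/`node2` hostnames, the port, and the exact durations are placeholders for your setup; tune them to taste):

```
example.com {
	reverse_proxy node1:8080 node2:8080 {
		lb_policy first          # prefer the first listed backend
		lb_try_duration 5s       # keep retrying other backends for up to 5s
		lb_try_interval 250ms    # wait between retries
		fail_duration 10s        # passive checks: remember a failure for 10s
		max_fails 1              # mark a backend down after a single failure
	}
}
```

With `lb_try_duration` set, a request that hits a dead backend gets retried against the other one instead of immediately returning a 502.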
Active health checks happen in the background, so there’s up to 1 second (the interval you set) during which Caddy won’t realize that a backend is down.
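For reference, active checks are enabled with the `health_*` options on `reverse_proxy`; a minimal sketch (again, hostnames, port, and path are placeholders):

```
example.com {
	reverse_proxy node1:8080 node2:8080 {
		# Caddy polls this path on each backend in the background
		health_uri /health
		health_interval 1s
		health_timeout 2s
	}
}
```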
Another approach you could take is to make a /health endpoint on your app, and set it up such that when you want to take a node down, you make your /health endpoint return as unhealthy while still serving requests. Keep serving requests as normal for something like 2x your health interval (you can tweak that number, it’s just an idea). After that interval, when you know Caddy will have realized the backend is down, you can shut down that instance.
Also FYI, put ``` on the lines before and after your config to get code formatting on these forums.
Very good, I will try your solution immediately. The problem I had was that the `health_interval` seemed not to be respected, meaning I got 502s for more than just one (or 2) seconds before Caddy realized the node was down. I don’t know if that is normal; I expected it to be faster in understanding the situation. What do you think?
When running as a systemd service, the stdout logs are written to the system journal. You can see them by running `journalctl --no-page -u caddy | less` (hit Shift+G to jump to the bottom).