Fault tolerance, Docker, clustering

I understood it as: it could validate each service independently, then only apply the ones that are valid and log errors for the ones that aren't. I.e., run the validate function on every single service before stitching them all together into a Caddyfile.

That’s definitely an option and yeah I could build tooling around that.

Well, I mentioned that servers can go down, and I've also had the Docker daemon itself go down. I actually thought that Swarm would move containers to another node when one ran out of memory, provided you used memory reservations. I see now that's not the case.

I don’t think there’s any validation logic in the actual Docker Caddyfile loader. My understanding was that it simply generates the config and then feeds it to the reload function, a la SIGUSR1 - which would mean Caddy itself does the validating and reverts if the entire new config is bad. That said…

Adding some validation and selectively discarding entire services if they produce a syntactically invalid Caddyfile seems plausible, although I don't know whether checking them "out of context" of the rest of the generated Caddyfile might cause some weirdness, considering there's some merging going on. I'd wager it could gain some traction as a feature request over at https://github.com/lucaslorentz/caddy-docker-proxy.

Aye, but this is an issue of general unreliability, not an issue specific to a container. Same as if KVM failed for some reason or someone tripped over a power cable.

I think I'll make that feature request and see how that goes. I should learn Go at some point…

Yeah. Being sure that it'll come back up in the same state it was in before would be really neat, I think.

I made an issue on the caddy-docker-proxy project.

Thank you for all the help. I didn’t initially understand USR1 and the path forward. You’ve all been very helpful.


The plugin author (lucaslorentz) has now made it so that each service/container gets validated and is excluded from the final Caddyfile if there are any errors :tada:
