Multi-homed Caddy setup

As @Whitestrake pointed out, if you’re obtaining certs for the same names on different machines, you may run into rate limits depending on how many machines you have. That’s why the TLS storage plugin type could be useful…

I haven’t read that RFC to be honest; I’m not sure how it works! :flushed:

No problem I’ll check it out.

Somehow this feels like a problem Let’s Encrypt should solve, but things like rate limiting sure make this interesting. Having two (independent) Caddy instances talk to each other via some mechanism to exchange TLS information sounds iffy to me. It also creates a dependency between the two (or more) Caddy servers. Granted, they already share a dependency on Let’s Encrypt, but still.

Caddy would “work” if you just mounted a network drive for certs. Problem is, you need some kind of signal that certs have changed and the other instances need to pick up the changes.

What if each instance watched the file system? If anything in the cert dir changes (or the Caddyfile), it could wait some short random time and then issue a Caddy restart?


To get fancy, maybe something like redis or etcd to implement some simple locking around cert changes so only one instance at a time tries to update things.


Can we do this without restarting? Would a reload pick up a cert change on disk? I’m interested in a solution but not keen on Caddy randomly going down for ~10 sec while it grabs 3-4 git sites via middleware, as it does on start or restart.

But yeah, sounds like you could do that right now with a bash script and inotify-tools.

I have an extra requirement: my Caddy instances don’t share an internal network. They are just two VPSs running Caddy, linked via DNS round-robin into one site.

I got autocert working using redis as my cache. It stores all the certs in there, and also the challenges, so whichever server receives the callback can fulfill the challenge. You don’t even need to rely on the DNS challenge with that setup.

Perhaps if Caddy had fully pluggable cert and temporary challenge storage… a setup with an external db or whatever could work.
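Something along these lines, shown as a hypothetical interface (this is not the actual caddytls API, just an illustration of the shape): both issued certs and in-flight challenge tokens go through one backend, so any instance can answer a challenge callback regardless of which instance started the ACME exchange.

```go
package main

import "sync"

// CertStorage is a hypothetical pluggable backend holding both issued
// certificates and pending ACME challenge answers.
type CertStorage interface {
	PutCert(domain string, pemBundle []byte) error
	GetCert(domain string) ([]byte, bool)
	PutChallenge(token string, keyAuth []byte) error
	GetChallenge(token string) ([]byte, bool)
}

// memStorage is an in-memory reference implementation; a real deployment
// would back this with redis, etcd, or a shared database.
type memStorage struct {
	mu         sync.RWMutex
	certs      map[string][]byte
	challenges map[string][]byte
}

func newMemStorage() *memStorage {
	return &memStorage{
		certs:      map[string][]byte{},
		challenges: map[string][]byte{},
	}
}

func (m *memStorage) PutCert(domain string, pem []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.certs[domain] = pem
	return nil
}

func (m *memStorage) GetCert(domain string) ([]byte, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	b, ok := m.certs[domain]
	return b, ok
}

func (m *memStorage) PutChallenge(token string, keyAuth []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.challenges[token] = keyAuth
	return nil
}

func (m *memStorage) GetChallenge(token string) ([]byte, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	b, ok := m.challenges[token]
	return b, ok
}
```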

Why can’t we have one Caddy act as a master and the rest as slaves (with some fallbacks if the master is down), and have that master do all the Let’s Encrypt interaction and tunnel it back to the slaves?

What info actually needs to be transferred on an ongoing basis? Isn’t this just the first cert?

IOW: why can’t Caddy be an ACME proxy?

A restart with USR1 will not incur downtime; the restart may just take several seconds to complete.

Remember, Caddy isn’t designed to be a general-purpose certificate manager. Its ACME functions are designed to maintain certificates for the sites it is serving…

But Caddy won’t respond to queries, no? That’s downtime.

I think it makes (a lot of) sense to think about multi-homed Caddy setups, especially with regard to ACME communication. Solving this in a good way probably also means Caddy can run better in a container and allows for super easy autoscaling in cloud environments.


No, it will. The restarts are graceful, zero-downtime.

Ah Ok. I stand corrected.

@matt are you willing to take code that would allow Caddy instances to “work together” to get a cert, or is that a no-go?

Would a storage plugin not do that? I think that’s why it was designed.

Is that this code? https://github.com/mholt/caddy/blob/master/caddytls/storage.go

Yep, that’s the interface!

Yep. But would USR1 notice cert changes on disk? Or just config changes?

It’ll reload certificates too.


2 posts were split to a new topic: V1: Certs not reloading on USR1