lucaslorentz/caddy-docker-proxy High Availability

1. Caddy version:

v2.6.2 h1:wKoFIxpmOJLGl3QXoo6PNbYvGW4xLEgo32GPBEjWL8o=

2. How I installed, and run Caddy:

I am using lucaslorentz/caddy-docker-proxy (Caddy as a reverse proxy for Docker).

a. System environment:

Ubuntu host running Docker Swarm

b. Command:

Docker Swarm starts the caddy service.

c. Service/unit/compose file:

  caddy:
    image: lucaslorentz/caddy-docker-proxy:ci-alpine
    ports:
      - 80:80
    networks:
      - app_network
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - caddy_data:/data
    deploy:
      placement:
        constraints:
          - node.role == manager
      replicas: 1
      restart_policy:
        condition: any

d. My complete Caddy config:

*.phppointofsale.com:80, *.phppos.com:80 {
	reverse_proxy 10.0.1.30 10.0.1.29 {
		trusted_proxies private_ranges
	}
}
*.phppointofsalestaging.com:80 {
	reverse_proxy 10.0.1.18 10.0.1.17 {
		trusted_proxies private_ranges
	}
}
custom.phppointofsale.com:80, custom.phppos.com:80 {
	reverse_proxy 10.0.1.25 10.0.1.24 {
		trusted_proxies private_ranges
	}
}
feedback.phppointofsale.com:80, feedback.phppos.com:80 {
	reverse_proxy 10.0.1.51 10.0.1.52 {
		trusted_proxies private_ranges
	}
}
help.phppointofsale.com:80, help.phppos.com:80 {
	reverse_proxy 10.0.1.39 10.0.1.40 {
		trusted_proxies private_ranges
	}
}
mysql.phppointofsale.com:80, mysql.phppos.com:80 {
	reverse_proxy 10.0.1.15 10.0.1.14 {
		trusted_proxies private_ranges
	}
}
www.phppointofsale.com:80, www.phppos.com:80, phppointofsale.com:80, phppos.com:80 {
	reverse_proxy 10.0.1.45 10.0.1.46 {
		trusted_proxies private_ranges
	}
}
www.phppointofsalestaging.com:80, phppointofsalestaging.com:80 {
	reverse_proxy 10.0.1.4 10.0.1.3 {
		trusted_proxies private_ranges
	}
}
zatca.phppointofsale.com:80, zatca.phppos.com:80 {
	reverse_proxy 10.0.1.57 10.0.1.54 {
		trusted_proxies private_ranges
	}
}

3. The problem I’m having:

I am trying to find a way to run Caddy on Docker Swarm in a highly available way. Right now I have 4 nodes (1 manager and 3 workers). It appears Caddy can only run on 1 manager at a time, so if the manager fails, everything goes down. This also routes all traffic through that one manager.

  1. I want Caddy to run on managers and workers.
  2. If a manager or worker fails, the system should continue to work.

4. Error messages and/or full log output:

When the manager node goes down, the applications go down.

5. What I already tried:

I tried setting it up like the distributed.yaml example from the caddy-docker-proxy repository, but couldn’t get the example to work. It seems the Caddy controller didn’t start.

6. Links to relevant resources:

Then your Docker Swarm needs to be highly available too.

A swarm cluster with only 1 manager can’t be highly available.
Please refer to the Docker Swarm documentation.

If I increase the swarm size to 3 managers and the manager with Caddy on it goes down, will Caddy pop up on another manager?

How many managers and workers are ideal?

Note that in the distributed.yaml you shared, there are “servers” and “controllers”.

A controller runs on a swarm manager and sends the Caddyfile to the actual servers, which serve traffic on swarm workers.
If the swarm manager currently running the controller goes down, the servers will continue to serve traffic just fine.
You will just miss out on config changes until a controller is redeployed somewhere.
It doesn’t matter whether that is the same swarm manager coming back up or another one.
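
A minimal sketch of that split, assuming the CADDY_DOCKER_MODE and CADDY_CONTROLLER_NETWORK environment variables described in the caddy-docker-proxy README. The service names, the caddy_controller network name, and the replica count are illustrative and may differ from the distributed.yaml in the repository, so treat this as an outline rather than the project's reference file. The top-level networks and volumes sections are omitted here.

```yaml
services:
  # Serves traffic; receives its config from the controller.
  caddy_server:
    image: lucaslorentz/caddy-docker-proxy:ci-alpine
    ports:
      - 80:80
    networks:
      - caddy_controller   # hypothetical name for the controller-to-server network
      - app_network
    environment:
      - CADDY_DOCKER_MODE=server
      - CADDY_CONTROLLER_NETWORK=10.200.200.0/24
    volumes:
      - caddy_data:/data
    deploy:
      replicas: 3
      labels:
        # Label marking this service as a controlled server, as recalled
        # from the repository example; verify against the README.
        caddy_controlled_server: ""

  # Watches Docker and pushes the generated Caddyfile to the servers.
  caddy_controller:
    image: lucaslorentz/caddy-docker-proxy:ci-alpine
    networks:
      - caddy_controller
      - app_network
    environment:
      - CADDY_DOCKER_MODE=controller
      - CADDY_CONTROLLER_NETWORK=10.200.200.0/24
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
```

Only the controller mounts the Docker socket in this sketch, which is why the servers can keep serving with their last received config while the controller is away.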

See “Add manager nodes for fault tolerance” in the Docker documentation page “Administer and maintain a swarm of Docker Engines”.

I don’t see a deploy section in the caddy controller service. What should it be?

deploy:
  placement:
    constraints:
      - node.role == manager

should suffice
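
In context, and assuming the controller service is named caddy_controller as in the distributed example, the block sits at the service level next to image and volumes. The constraint matters because swarm service information is only available from a manager's Docker socket. A sketch, with the image tag taken from the compose file earlier in this thread:

```yaml
services:
  caddy_controller:
    image: lucaslorentz/caddy-docker-proxy:ci-alpine
    environment:
      - CADDY_DOCKER_MODE=controller
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      placement:
        constraints:
          - node.role == manager   # schedule the controller on manager nodes only
```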

Thanks! I now get, at a high level, how these 2 components interact. Can you explain the networking (CADDY_CONTROLLER_NETWORK)?

Also can I make caddy server global? (Instead of 3 replicas?)

It’s the “Network allowed to configure Caddy server in CIDR notation. Ex: 10.200.200.0/24”, used by controllers to send their config to the servers.

See distributed.yaml and README.md in the lucaslorentz/caddy-docker-proxy repository at commit 172f39f06f2972eec60c0d38514c5fd0b50af9ca.

Just copy the default from the distributed.yaml and you should be good.
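
A sketch of how the pieces line up, assuming the 10.200.200.0/24 example value from the README and a hypothetical overlay network named caddy_controller. The idea is that the overlay network's subnet and the CADDY_CONTROLLER_NETWORK value on both services describe the same range:

```yaml
networks:
  caddy_controller:                     # hypothetical network name
    driver: overlay
    ipam:
      driver: default
      config:
        - subnet: "10.200.200.0/24"     # same CIDR as CADDY_CONTROLLER_NETWORK

services:
  caddy_server:
    networks:
      - caddy_controller
    environment:
      - CADDY_CONTROLLER_NETWORK=10.200.200.0/24
  caddy_controller:
    networks:
      - caddy_controller
    environment:
      - CADDY_CONTROLLER_NETWORK=10.200.200.0/24
```

If that subnet collides with an existing network in your environment, pick another range and change it in all three places.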


Yes, the Caddy server can run in global mode instead of with 3 replicas.
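
A minimal sketch of the server's deploy section in global mode, assuming the service is named caddy_server. In global mode Swarm runs one task on every eligible node, so replicas is not set:

```yaml
services:
  caddy_server:
    deploy:
      mode: global   # one task per eligible node; used instead of "replicas: 3"
```

A placement constraint can still be added under deploy if only some nodes should run a server.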


Just be aware that the usual considerations for running Caddy in a distributed setup apply.
Most importantly, the certificate storage must be shared.

From the “Automatic HTTPS” page of the Caddy documentation:

Caddy will store public certificates, private keys, and other assets in its configured storage facility (or the default one, if not configured – see link for details).
[…]
Any Caddy instances that are configured to use the same storage will automatically share those resources and coordinate certificate management as a cluster.
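
One possible way to give every server the same storage, sketched under assumptions: the caddy_data volume is backed by an NFS export instead of node-local disk. The server address 10.0.0.10 and the export path are hypothetical placeholders, and NFS is only one option; nothing in this thread prescribes a particular storage backend.

```yaml
volumes:
  caddy_data:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=10.0.0.10,rw,nfsvers=4"      # hypothetical NFS server address
      device: ":/exports/caddy_data"        # hypothetical export path

services:
  caddy_server:
    volumes:
      - caddy_data:/data   # every server task mounts the same shared /data
```

With a plain named volume and no driver options, each swarm node gets its own separate caddy_data, so the servers would not share certificates.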

So I did this setup and it kind of works.

I tested with 1 manager and 1 worker.

I then shut down the manager. The load balancer marked it as unhealthy, so no requests were being routed to it.

However, every 1-2 requests I would get a bad gateway error, which suggests that something was still trying to route to it through the mesh.

Do I still need multiple managers to prevent this? How many managers and workers do you recommend?

Also, the placement constraint on caddy_controller is manager. Should I also limit it to 1 replica, or is it OK if it runs on a lot of managers?

I recommend reading “Administer and maintain a swarm of Docker Engines” and other parts of Docker Swarm’s documentation.

Honestly, I don’t really feel like explaining Docker Swarm in great detail right now, especially since there are already lots of resources out there doing just that.

This is also bordering on slightly off-topic, and with the current number of volunteers, it’s only really feasible to focus on Caddy questions.
