1. The problem I’m having:
I’m exploring using Caddy with the rate limit module as a rate-limiting reverse proxy. Our API is built around tasks (what those tasks are isn’t relevant to this discussion; if you’re curious, it’s the IETF’s Distributed Aggregation Protocol), and its request paths generally look like `tasks/{task-id}/resource`, where `{task-id}` is a URL-safe Base64 blob unique to each task. We want to apply different rate limits to different tasks. So far so good: I can craft a Caddyfile containing rate limits with a `path` or `path_regexp` matcher that matches a particular `{task-id}` value.
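For concreteness, here’s roughly the shape of Caddyfile I have in mind. The task IDs, zone names, and limits are made up, and the option names are from my reading of the rate limit plugin’s docs, so treat this as a sketch rather than something I’ve validated:

```
:8080 {
	rate_limit {
		# Hypothetical task ID and limits; one zone per task we care about.
		zone task_a {
			match {
				path /tasks/AAAAAAAAAAAAAAAAAAAAAA/*
			}
			# A fixed key, so all requests for this task share one bucket.
			key    static
			events 100
			window 1m
		}
		zone task_b {
			match {
				path /tasks/BBBBBBBBBBBBBBBBBBBBBB/*
			}
			key    static
			events 10
			window 1m
		}
	}

	reverse_proxy backend:8443
}
```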
I also want to run multiple instances of the rate-limiter and have them do what I’ve seen called “global rate-limiting”. No problem: the rate limiting plugin supports what it calls “distributed” rate limits using a few different storage backends like Redis.
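My understanding is that making those zones global is then a matter of adding the `distributed` option and pointing every replica at the same shared storage. Something like the following, where the Redis storage module and its (omitted) connection options are assumptions on my part and depend on which storage plugin is compiled in:

```
rate_limit {
	# ... same zone definitions as above ...

	# Ask the plugin to coordinate counts across all instances.
	distributed

	# Shared storage backend; assumes a Redis storage plugin is built into
	# the binary, with its connection options configured here or globally.
	storage redis
}
```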
We also have the perhaps unusual goal of being able to dynamically set per-task rate limits. Caddy’s administration API seems like a good fit for this: I can do `PATCH /config/[path-to-rate-limits]` to add or modify rate limit entries.
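Concretely, I picture something like the following. The JSON path and zone name are placeholders; the real path depends on how the running config is actually laid out, which I’d confirm by reading back `/config/` first:

```
# Inspect the running config to find where the rate limit zones live.
curl -s http://localhost:2019/config/

# Hypothetical JSON path and zone name: bump max_events for one task's zone.
curl -X PATCH \
  -H "Content-Type: application/json" \
  -d '50' \
  "http://localhost:2019/config/apps/http/servers/srv0/routes/0/handle/0/rate_limits/task_a/max_events"
```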
Where I get stuck is combining dynamic rate limits with distributed rate limits. It seems like the admin API governs a single instance of Caddy, so if I wanted to update rate limits, I’d have to hit each Caddy replica’s admin API and update the rate limits one by one. That seems risky, since the `rate_limit` module’s docs caution that “[i]n order for [distributed rate limits] to work, all instances in the cluster must have the exact same RL zone configurations.”
One way forward I can see would be to have the Caddy replicas get their RL zone configurations from a common config store. In my case I’m running all this in Kubernetes, so they would all get their Caddyfile from the same ConfigMap. Updating rate limits would then mean updating that ConfigMap and restarting/reloading the Caddy replicas so they pick up the new configuration. I think this approach still has the same problem, though: replicas can have different configs loaded while a config change is being rolled out.
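In Kubernetes terms I imagine the update procedure looking roughly like this (resource names are hypothetical):

```
# Update the shared Caddyfile stored in a ConfigMap.
kubectl create configmap caddy-config --from-file=Caddyfile \
  --dry-run=client -o yaml | kubectl apply -f -

# A mounted ConfigMap update doesn't make Caddy reload on its own, so force a
# rolling restart; while the rollout is in progress, replicas run mixed configs.
kubectl rollout restart deployment/caddy
```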
I think ultimately my question is about managing configuration across replicas of Caddy and not specifically about rate limiting. Is there some prior art or a plugin I can use for this?
On the other hand, am I reading too much into the `rate_limit` module’s caution about consistent rate limit zone configurations? Is it OK if replicas briefly have an inconsistent view of the RL config during updates?
2. Error messages and/or full log output:
No error messages (yet); this is an architecture question.
3. Caddy version:
n/a
4. How I installed and ran Caddy:
n/a, I haven’t tried this yet
a. System environment:
b. Command:
c. Service/unit/compose file:
d. My complete Caddy config: