I’m not sure how you would define or determine “become available” (returning a 200 OK status?) or “hold requests”.
As a very simple starting point, setting lb_policy to round_robin would seem useful. You might also experiment with lb_try_duration or lb_retries on top of that, and perhaps lb_try_interval depending on the nature of synchronization.
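For example, a minimal Caddyfile along these lines (the addresses and durations are just placeholders to adjust for your setup, and lb_retries needs a fairly recent Caddy release if I recall correctly):

```
example.com {
	reverse_proxy localhost:8081 localhost:8082 localhost:8083 {
		lb_policy round_robin
		lb_try_duration 5s
		lb_try_interval 250ms
		lb_retries 3
	}
}
```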
I don’t think Caddy has any built-in way to force requests to be processed serially, so if the COM wrapper doesn’t have at least some synchronization logic this is not likely to work well.
It’s been quite a while since I’ve done anything with COM, so I’m not sure what kind of threading issues you’re dealing with there. But presumably there’s some way to detect that a particular worker process is already handling a request, and you’d need to make sure it returns an appropriate HTTP error that Caddy’s load balancer will recognize and use to try another backend. You’d also need to set fail_duration in the Caddyfile so the passive health check logic is turned on, and perhaps unhealthy_status. In principle, this should give you the best utilization and fairness, because requests are shuffled around with small delays until a currently-unused backend can handle them immediately.
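Roughly something like this, assuming the wrapper can be made to return 503 when its worker is already busy (the status code and durations are guesses you’d need to tune):

```
reverse_proxy localhost:8081 localhost:8082 {
	lb_policy round_robin
	lb_try_duration 10s
	lb_try_interval 250ms
	fail_duration 2s
	unhealthy_status 503
}
```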
Alternatively, just setting long timeouts, round-robin backend selection, and a mutex or similar within the wrapper to force serialization would get you almost as far: some proportion of requests get delayed until the previous one finishes, but the delays are spread out fairly evenly. (It’s possible the COM component already does this.) This does have the potential to leave some backends under-utilized even under heavy load, though. See the sketch below for the Caddy side of that.
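On the Caddy side, the “long timeouts” part would mostly be about giving the backend plenty of time to respond, e.g. via the HTTP transport’s response_header_timeout (the value is just illustrative; the mutex itself would live in the wrapper):

```
reverse_proxy localhost:8081 localhost:8082 {
	lb_policy round_robin
	transport http {
		response_header_timeout 2m
	}
}
```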
In either case, obviously, request latencies and even error rates can get really bad in the long tail if you’re close to your sustained throughput limits.
unhealthy_request_count marks an upstream “unhealthy” (meaning “don’t pick me for new requests”) if it has N or more requests already in flight. Using lb_try_duration has Caddy hold requests for up to N seconds if all upstreams are already handling requests.
Note that lb_try_duration is not a queue, though; it’s a polling retry, so there’s a chance a new request can skip the line and get handled before one that’s been waiting through lb_try_duration gets its turn. How much that matters depends on how much throughput you’re trying to handle. Obviously you could scale out with more upstreams to increase the number of requests you can serve concurrently.
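Combining the two would look something like this (a limit of 1 concurrent request per upstream matches the one-request-at-a-time scenario; addresses and durations are placeholders):

```
reverse_proxy localhost:8081 localhost:8082 localhost:8083 {
	lb_policy round_robin
	lb_try_duration 10s
	lb_try_interval 250ms
	unhealthy_request_count 1
}
```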
And the reverse_proxy module’s JSON Config Structure page in the Caddy documentation describes try_duration and try_interval, which seem identical in purpose to lb_try_duration and lb_try_interval.
I’m referencing Caddyfile config. Caddyfile does map to JSON config (Caddyfile is an adapter, its job is to produce JSON config). It’s named differently in JSON because it’s structured within a load_balancing object, but it’s flat in the Caddyfile with lb_ as a prefix.
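For instance, a Caddyfile block with lb_policy round_robin and lb_try_duration 5s adapts to JSON roughly like this (simplified to the relevant part of the reverse_proxy handler; the dial addresses are placeholders):

```
{
	"handler": "reverse_proxy",
	"load_balancing": {
		"selection_policy": { "policy": "round_robin" },
		"try_duration": "5s",
		"try_interval": "250ms"
	},
	"upstreams": [
		{ "dial": "localhost:8081" },
		{ "dial": "localhost:8082" }
	]
}
```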