Caddy v2 reverse proxy performance tuning options?

jmn · October 21, 2019, 10:38am

I am wondering if there are any ways to configure Caddy v2 to improve the performance of a reverse proxy.
This is my current configuration as a Caddyfile:

www.postya.net {
    reverse_proxy {
        to localhost:4000
    }
}

Simple wrk test with Nginx:

wrk -c 100 -d 60 -t 2  http://www.postya.net/posts
Running 1m test @ http://www.postya.net/posts
  2 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.36ms    1.05ms  38.54ms   85.31%
    Req/Sec    11.55k     1.55k   14.76k    75.00%
  1379288 requests in 1.00m, 489.33MB read
Requests/sec:  22978.14
Transfer/sec:      8.15MB

With Caddy v2:

wrk -c 100 -d 60 -t 2  http://www.postya.net/posts
Running 1m test @ http://www.postya.net/posts
  2 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    18.81ms    4.25ms  69.01ms   85.66%
    Req/Sec     2.63k   300.67     3.19k    71.99%
  314647 requests in 1.00m, 49.51MB read
Requests/sec:   5239.63
Transfer/sec:    844.28KB

matt · October 21, 2019, 2:02pm

We haven’t spent much time optimizing the v2 proxy yet.

Which version (commit) are you using?

jmn · October 21, 2019, 2:37pm

Hi matt,
I’m using commit 208f2ff93c1bd2c009e4b96f664c1808ede79f3a

To be frank, I don’t need more performance right now; I’m just curious. Caddy solved a problem for me where an Nginx reverse proxy caused my application to error silently, so I’m a happy camper.

matt · October 21, 2019, 3:54pm

Do you know what it was that caused it? Interested in knowing the difference here.

As for performance, it’s definitely something we can improve on here – just want to get everything in a “working” state first. Do you have any idea what is causing the latency? (Maybe a profile?)

jmn · October 21, 2019, 4:21pm

Unfortunately it is a complete mystery to me, and I suspect figuring it out would require deep debugging of nginx.

No, I don’t know what is causing the latency, and I’m not proficient in Go debugging, but perhaps I can learn a bit and gather some data at a later point.

Here’s a wrk run with --latency:

wrk -c 100 -d 60 -t 2  --latency http://www.postya.net/posts
Running 1m test @ http://www.postya.net/posts
  2 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    18.65ms    4.45ms  67.18ms   84.02%
    Req/Sec     2.65k   332.66     4.92k    71.55%
  Latency Distribution
     50%   18.05ms
     75%   19.92ms
     90%   23.37ms
     99%   33.86ms
  315973 requests in 1.00m, 49.72MB read
Requests/sec:   5259.72
Transfer/sec:    847.51KB

jmn · October 21, 2019, 4:32pm

With 50 concurrent connections the latency is looking much better:

wrk -c 50 -d 60 -t 1  --latency http://www.postya.net/posts
Running 1m test @ http://www.postya.net/posts
  1 threads and 50 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.44ms    2.72ms  31.21ms   78.98%
    Req/Sec     3.98k   425.33     5.03k    76.87%
  Latency Distribution
     50%    3.56ms
     75%    5.27ms
     90%    8.59ms
     99%   13.11ms
  237425 requests in 1.00m, 37.36MB read
Requests/sec:   3956.06
Transfer/sec:    637.45KB

matt · October 21, 2019, 6:07pm

Okay, thanks for experimenting.

What if you try configuring keepalive_idle_conns? (Do a “find in page” for it here: Home · caddyserver/caddy Wiki · GitHub)

For perf testing, it’d be best to use the JSON config directly (you can use the caddy adapt command to convert what you have over to JSON) so you have complete control over things. In JSON, you’d want keep_alive.max_idle_conns and keep_alive.max_idle_conns_per_host.
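
For reference, here is a rough sketch of where those two settings would sit in the JSON for the reverse_proxy handler. The upstream address comes from the Caddyfile above and the 100000 values are only illustrative. The surrounding field names ("transport", "protocol", "upstreams", "dial") are an assumption based on the Caddy 2 JSON layout and may differ in the commit being tested, so compare against what caddy adapt actually produces before relying on it.

```json
{
  "handler": "reverse_proxy",
  "upstreams": [
    { "dial": "localhost:4000" }
  ],
  "transport": {
    "protocol": "http",
    "keep_alive": {
      "max_idle_conns": 100000,
      "max_idle_conns_per_host": 100000
    }
  }
}
```

This object would go in the "handle" list of the route that matches www.postya.net; the rest of the adapted config stays as generated.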

jmn · October 21, 2019, 6:36pm

I tried what you said with a JSON config and set keep_alive.max_idle_conns and keep_alive.max_idle_conns_per_host to a very high number (100k), but saw no real difference in latency with 100 connections.

matt · October 21, 2019, 7:06pm

Okay, good to know. If anyone wants to keep investigating, that would be great! (I have my hands full for a while.)

matt · October 23, 2019, 4:12am

@jmn On second look, these tests are kind of meaningless since we don’t know what they’re really comparing. (Even getting the configs won’t fully answer that question, because there are many more dimensions involved in benchmarks like these, but at least it’s a start for things we can look into.)

Can you share your full, unchanged nginx config please?