Nginx vs Caddy benchmark video

The video: https://youtu.be/N5PAU-vYrN8

I know there is a common comparison between Caddy and Nginx. I’m a bit surprised that the difference was as large as it was in the video (in terms of requests per second that can be handled given the same system resources).

The guy who runs this YouTube channel tends to be open to feedback on tweaking the test if something seems unfair out of the box. If anyone has feedback for him, I’d recommend leaving comments on his YouTube video, or opening pull requests against the GitHub repo linked in the video description.

Thanks for the discussion – I also was pinged on Twitter and saw this, I had been meaning to comment.

I might actually post my comment here, and direct people to it, since it’s a better medium than most social media sites… so my comments aren’t directed at you in particular, they’re just my overall thoughts and response.

Anyway… that guy doesn’t know how to benchmark servers.

I don’t either. Nobody really does! They just know how to use tricks to get clicks. And it works: he’s got 18,000 views and has successfully fooled that many people into thinking this is how rigorous testing is done. Truly tragic.


First, in general, I think high-level “benchmarks” such as this aren’t really benchmarks. There are too many factors involved for the results to be reliable. This isn’t a “change one thing and repeat” kind of test like you’d expect from rigorous science, because there are hundreds… no, thousands, of dimensions at play.

  • Network stack
  • Kernel config
  • OS scheduler
  • Other processes
  • CPU temperature
  • … so many other factors (esp. depending on config)

It’s actually ridiculous… trying to measure precise results across process boundaries subject to arbitrary I/O performance of a complex machine you have nearly no control over.

Not to mention that web servers never encounter real traffic that looks anything like what load testers generate. Compare this to other common benchmarks you frequently see in videos or read about in reviews, where people test the write performance of hard drives, or the compute of a GPU, using authentic data or workloads. Web servers are optimized for real traffic, not load tests. You wouldn’t want to run a web server in production that does fabulously on load tests, because the tunings will be totally different, and you’ll easily ruin your production performance.

Since server benchmarks are fake, you might as well disable garbage collection. You won’t need it for a short-duration test. Just for fun I did a quick “benchmark” of Caddy and got almost 400,000 req/sec with GC off (1.3ms latency). With it on, I maxed out a little under 300,000 req/sec (3.3ms latency). Didn’t even run out of memory. :slight_smile:

Some of the points in the video don’t even make sense, for example:

You might wonder why there aren’t any errors for Caddy. Nginx tries to process all incoming requests, while Caddy simply doesn’t accept new requests when overloaded. That’s why there are no timeouts on Caddy’s graph.

Now, I’m not familiar with the testing tool he’s using (there are hundreds of them and I haven’t used them all), but if it’s not counting rejected connections as errors, it’s broken. Very broken. If Caddy is not accepting connections, the tool should be showing errors! Otherwise, you could terminate the server entirely and the tool would still report a 100% success rate.

Caddy will actually accept all the connections (hence, no errors, maybe?) and service all of them even if it takes longer, whereas nginx will just drop connections on the floor.

So the explanation behind why Caddy has no errors is misinformed at best, or backward at worst. Caddy has no errors either because the testing tool is broken or Caddy isn’t.

I don’t even know where to start when it comes to the Caddy config.

  • Enabling gzip? What is he even benchmarking? A web server or a compression algorithm? (Gzip has been around for decades so the C implementation is surely much more optimized than any Go implementation.)
  • The transform encoder?? Really? That’s literally going out of your way to compile a custom Caddy binary to disable optimizations we have in place by default. Of course allocations are going to go through the roof. By doing this, he’s not comparing the two servers’ performance (can’t rigorously be done btw). He’s trying to see how well Caddy does at being nginx… outputting the same log format, etc. Well, guess what: Caddy wasn’t designed to be nginx. It was designed to replace it.
  • Writing logs to a file? Why get the file system more involved than you have to? At least rip that out of both servers’ configs. Is he trying to test the disk or the web server?
  • The Caddy reverse proxy has no tuning to optimize performance for the load test, but the nginx config has dozens of tunings.
  • Caddy’s default gzip setting is 5, I think; but nginx is tuned to 3.
  • Benchmarking TLS performance is also futile while pulling from the source of entropy. To measure TLS performance properly you need to disable randomness for the duration of testing.
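To level the playing field, the fairest baseline is to strip the non-default extras from both configs. A sketch of what a minimal Caddyfile for the proxy case could look like (the site name and upstream address here are made up, not taken from the video’s repo):

```
example.com {
	# No gzip, no file logging, no custom log encoder:
	# let both servers run with their defaults.
	reverse_proxy 127.0.0.1:8080
}
```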

How about instead of testing to see how well Caddy does at being nginx, testing nginx to see how well it does at being Caddy? See how well it scales, then see how early it falls over. You’ll be surprised how bad nginx is.

(Thank you to @Mohammed90 for PR’ing a few corrections to the Caddy config.)

But none of this matters since it’s not real. :upside_down_face: Go programs are fast enough for Google, Netflix, Stripe, and many other large Internet companies… it’s probably fast enough for you (and still getting faster).


I’m sorry you feel that way; I’m not biased at all. It’s hard to read, but I’ll go over each point you made about the video and release an updated benchmark.
