Any performance overhead as you add more headers under HTTP/2?

I haven’t benchmarked or looked into this yet with Caddy, but has anyone else benchmarked the performance overhead (if any) of adding more HTTP headers to served responses under HTTP/2 load?

I did test H2O vs OpenLiteSpeed vs Nginx for HTTP/2 loads via the nghttp2 client’s h2load, and H2O had very noticeable performance overhead when adding more HTTP headers to served responses. It was somewhat better when serving all the additional HTTP headers on a single line (see the H2O issue “h2load test h2o - overhead with additional header.set etc settings?” · Issue #240 · h2o/h2o · GitHub).

Guess it’s on my to do list to check out :slight_smile:

Guess I found some older Dec 2015 Caddy benchmarks for header middleware: “Performance analysis of Caddy server header middleware” by karthic Rao on Medium.

@matt, have there been any improvements since then? :slight_smile:

Quoting the article (the screenshot it refers to shows the output of the two benchmark runs):

“Wow! The performance degradation with more headers is so clear. BenchmarkHeadersWithMoreRules clearly consumes many times more time per operation, and this could increase further with more rules and headers. Now let’s dig deeper to see the reason for such a steep slowdown.”
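
The article’s benchmark names suggest standard Go benchmarks that exercise the header middleware with different numbers of header rules and compare ns/op. A minimal sketch of that kind of comparison (not the article’s actual code; the middleware, package and benchmark names here are illustrative) could look like:

// headers_bench_test.go, run with: go test -bench=.
package headers

import (
	"net/http"
	"net/http/httptest"
	"testing"
)

// setHeaders is a stand-in for a header middleware: it writes a fixed set of
// response headers before handing the request to the next handler.
func setHeaders(headers map[string]string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for k, v := range headers {
			w.Header().Set(k, v)
		}
		next.ServeHTTP(w, r)
	})
}

func benchmarkHeaders(b *testing.B, headers map[string]string) {
	h := setHeaders(headers, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	req := httptest.NewRequest("GET", "http://example.com/", nil)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		h.ServeHTTP(httptest.NewRecorder(), req)
	}
}

// One header rule vs. several: the difference in ns/op is the middleware overhead.
func BenchmarkFewHeaders(b *testing.B) {
	benchmarkHeaders(b, map[string]string{"X-Frame-Options": "SAMEORIGIN"})
}

func BenchmarkMoreHeaders(b *testing.B) {
	benchmarkHeaders(b, map[string]string{
		"Cache-Control":          "max-age=86400",
		"X-Content-Type-Options": "nosniff",
		"X-Frame-Options":        "SAMEORIGIN",
		"X-XSS-Protection":       "1; mode=block",
	})
}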

I haven’t been working on optimizations myself; I’m trying to make Caddy feature-complete, and I don’t think anyone else has optimized headers specifically…


I see… totally understand that perfection takes time… lots of time :slight_smile:

Hoping someone does look into this, as I can imagine that with all the security and HTTPS-specific headers (HSTS, HPKP, etc.) you’d want to be using, this performance overhead will be high!

@matt

Did some quick benchmark tests with the nghttp2 h2load HTTP/2 load tester, comparing Caddy 0.9 with headers and without (just the Server header).

The test compared these two header configurations:

With headers:

header / {
    #Strict-Transport-Security "max-age=31536000"
    Cache-Control "max-age=86400"
    X-Content-Type-Options "nosniff"
    X-Frame-Options "SAMEORIGIN"
    X-XSS-Protection "1; mode=block"
    X-Powered-By "Caddy via CentminMod"
    #-Server
}

Without headers:

header / {
    #Strict-Transport-Security "max-age=31536000"
    #Cache-Control "max-age=86400"
    #X-Content-Type-Options "nosniff"
    #X-Frame-Options "SAMEORIGIN"
    #X-XSS-Protection "1; mode=block"
    #X-Powered-By "Caddy via CentminMod"
    #-Server
}
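
For reference, the h2load runs below were of roughly the following form (a sketch; the URL is a placeholder and any extra TLS/thread flags actually used are not shown):

# low concurrency: 100 requests total from 10 concurrent clients
h2load -n 100 -c 10 https://caddy.domain.example/
# higher concurrency: 1000 requests total from 100 concurrent clients
h2load -n 1000 -c 100 https://caddy.domain.example/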

Low concurrency tests at 10 concurrent users and 100 requests

  • Caddy 0.9 HTTP/2 HTTPS with headers = finished in 55.63ms, 1797.46 req/s, 2.43MB/s
  • Caddy 0.9 HTTP/2 HTTPS without headers = finished in 48.52ms, 2060.92 req/s, 2.75MB/s

Higher concurrency tests at 100 concurrent users and 1000 requests

  • Caddy 0.9 HTTP/2 HTTPS with headers = finished in 324.30ms, 3083.56 req/s, 4.17MB/s
  • Caddy 0.9 HTTP/2 HTTPS without headers = finished in 303.17ms, 3298.46 req/s, 4.41MB/s

Then I compared against the Nginx in my Centmin Mod LEMP stack (which I plan to integrate Caddy into), with headers enabled.

Low concurrency tests at 10 concurrent users and 100 requests

  • Caddy 0.9 HTTP/2 HTTPS = finished in 55.63ms, 1797.46 req/s, 2.43MB/s
  • Centmin Mod Nginx 1.11.3 HTTP/2 HTTPS = finished in 25.92ms, 3857.58 req/s, 5.99MB/s

Higher concurrency tests at 100 concurrent users and 1000 requests

  • Caddy 0.9 HTTP/2 HTTPS = finished in 324.30ms, 3083.56 req/s, 4.17MB/s
  • Centmin Mod Nginx 1.11.3 HTTP/2 HTTPS = finished in 228.77ms, 4371.15 req/s, 6.78MB/s

Note that Caddy HTTP/2 used an ECDSA 256-bit certificate with an AES-128 cipher, while Nginx HTTP/2 used an RSA 2048-bit certificate with an AES-256 cipher, so the TLS configurations aren’t identical.
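
(If anyone wants to reproduce the comparison, the negotiated cipher and server key size can be checked per server with something like the following; the hostname is a placeholder:)

# prints the server key size and the negotiated cipher for an ALPN h2 connection
echo | openssl s_client -connect caddy.domain.example:443 -alpn h2 2>/dev/null | grep -E 'Server public key|Cipher'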

Full setup and raw h2load results at https://community.centminmod.com/posts/34346/

Edit: Note that Nginx is set up out of the box to run 2 worker_processes on the 4-CPU VirtualBox CentOS 7.2 guest server, while I believe Caddy is set up to utilise all 4 CPUs.

user              nginx nginx;
worker_processes 2;
worker_priority -10;

Edit: Retested with Centmin Mod Nginx set to 4 worker_processes (config change shown below). Seeing as it’s 4 CPU threads, not 4 real CPU cores, the bump from 2 to 4 only meant a marginal increase.
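
The retest only changes the worker count in nginx.conf; on recent Nginx versions, worker_processes auto; would likewise match it to the detected CPU count:

worker_processes 4;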

Low concurrency tests at 10 concurrent users and 100 requests

  • Caddy 0.9 HTTP/2 HTTPS with headers = finished in 55.63ms, 1797.46 req/s, 2.43MB/s
  • Caddy 0.9 HTTP/2 HTTPS without headers = finished in 48.52ms, 2060.92 req/s, 2.75MB/s
  • Centmin Mod Nginx 1.11.3 HTTP/2 HTTPS (2 cpus) = finished in 25.92ms, 3857.58 req/s, 5.99MB/s
  • Centmin Mod Nginx 1.11.3 HTTP/2 HTTPS (4 cpus) = finished in 22.39ms, 4465.68 req/s, 6.93MB/s

Higher concurrency tests at 100 concurrent users and 1000 requests

  • Caddy 0.9 HTTP/2 HTTPS with headers = finished in 324.30ms, 3083.56 req/s, 4.17MB/s
  • Caddy 0.9 HTTP/2 HTTPS without headers = finished in 303.17ms, 3298.46 req/s, 4.41MB/s
  • Centmin Mod Nginx 1.11.3 HTTP/2 HTTPS (2 cpus) = finished in 228.77ms, 4371.15 req/s, 6.78MB/s
  • Centmin Mod Nginx 1.11.3 HTTP/2 HTTPS (4 cpus) = finished in 195.44ms, 5116.69 req/s, 7.94MB/s

Even higher concurrency tests at 2000 concurrent users and 25000 requests

  • Caddy 0.9 HTTP/2 HTTPS with headers = finished in 6.16s, 4058.36 req/s, 5.47MB/s
  • Centmin Mod Nginx 1.11.3 HTTP/2 HTTPS (4 cpus) = finished in 3.21s, 7795.35 req/s, 12.09MB/s
