Announce: New http cache plugin

After a lot of work I have finally finished my cache plugin and it is now published: https://caddyserver.com/docs/http.cache

There are still a lot of things to add, but it handles the most important ones:

  • It supports concurrency: when you have many concurrent requests for the same resource, your upstream server will only see one request
  • It supports the Vary header, so you won’t have problems caching gzipped responses
  • It handles cache headers correctly: it respects your Cache-Control and Expires headers

I hope it’s useful for everybody; any feedback is appreciated. If you have any problem, don’t hesitate to report it at Issues · nicolasazrak/caddy-cache · GitHub
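For anyone who wants to try it quickly, here is a minimal Caddyfile sketch (the hostname and root path are placeholders, and the bare cache directive is assumed to fall back to the plugin’s defaults; see the docs linked above for details):

example.com {
    root /var/www/example.com
    # Cache responses using the plugin's defaults; tune it with a block
    # (max age, storage path, etc.) if you need more control.
    cache
}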

This is really cool and I can’t wait to try it out.

Forgive me if it’s a stupid question, but what does this cache do exactly? I assume it has nothing to do with static caches like those for WordPress or even Nginx? What does it cache exactly?

I guess my question really is: why should I use it?

It is a middleware that caches responses according to RFC 7234 - Hypertext Transfer Protocol (HTTP/1.1): Caching. It stores every cacheable successful response so that the next time the same request hits Caddy, it doesn’t have to be fetched again. It’s really similar to proxy_cache in nginx, except that caddy-cache is not limited to proxied responses.

You should use it for slow public responses. For example, gzip is a CPU-intensive task: with the cache you only compress your content once and save the result to disk, and the next time another user requests the same page it doesn’t have to be gzipped again. So the response is sent very quickly, freeing the CPU for other work.
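As a sketch of what that could look like in a Caddyfile (the hostname and paths are made up for illustration):

example.com {
    root /var/www/example.com
    # Compress responses on the fly...
    gzip
    # ...and cache them; thanks to the Vary support, the gzipped and
    # plain variants of a page are stored and served separately.
    cache {
        path /tmp/caddy-cache
        status_header X-Cache-Status
    }
}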

Another common use case is a slow application behind the proxy. Imagine you have a rails/php/node app serving a page that is the same for every user, but generating it requires a lot of slow database queries. You can generate it once and then serve the same page to every user from the cache, so you don’t have to run those queries again and slow everything else down.
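A rough sketch of that setup (assuming the app listens on localhost:3000; the lifetime below is just an example value):

example.com {
    # Forward requests to the slow application server.
    proxy / localhost:3000
    cache {
        # Fallback lifetime for cached responses; explicit
        # Cache-Control/Expires headers from the app are respected.
        default_max_age 5m
        path /tmp/caddy-cache
    }
}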

To make it short: it will help speed up a lot of things. I hope I have answered your question; I’m not sure if it was exactly what you were asking.

I think so; it seems I underestimated the value of the plugin.

I’m using WordPress and a static cache right now. I will try adding your plugin and see if it’s faster afterwards.

Thanks for your answer anyway! :+1:

It’s worth noting that you should be cautious about what you cache when it comes to apps with confidential or user-specific data.

For example, caching something like /profile/edit or /wp-admin could be disastrous. Try to stick to caching static assets.
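One way to keep the cache limited to static assets might be to match on content type, along these lines (a sketch; the exact list of types is up to you):

cache {
    # Only cache responses whose Content-Type looks like a static asset;
    # HTML pages such as /wp-admin or /profile/edit don't match these
    # patterns and so are never stored.
    match_header Content-Type image/* text/css application/javascript
    path /tmp/caddy-cache
}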

Can’t emphasize this enough. Caching is hard, even if Caddy and a plugin make it look easy: always be sure your caching policies account for protected pages!

To clarify further when it comes to WordPress specifically, and for anyone else reading… I recommend sticking to static assets because, for example, every single page changes when an admin is logged in - a toolbar gets added to the top. Caddy might see this and cache that toolbar even on otherwise static pages.

For those individual pages, I can’t recommend enough using a plugin for WordPress itself - WP Super Cache is a standout example. A smarter cache will know what it should and shouldn’t save. You can cache the static assets at the Caddy layer, as well, for a possible boost there - but let the people who built wordpress.com figure out the more complicated caching policies for you.

Ok thanks for your answers. Too bad, I thought I had a new toy to play with. :slight_smile:

Thanks! http.cache definitely improves Caddy’s performance. I did a quick non-HTTPS test on a VirtualBox CentOS 7.3 VM, comparing against my Centmin Mod Nginx web server with siege 4.0.2 benchmarks, and there is definitely a boost over plain Caddy. Caddy with http.cache had a ~30.3% performance boost (2,458 r/s) compared to plain Caddy (1,887 r/s), but it’s still behind Nginx for a plain non-cached static HTML file request (between 3,373 and 3,605 r/s).

I noticed that in http.cache mode there are duplicate Server: Caddy headers, though?

  • CentOS 7.3 64bit Virtualbox 4 cpu threads
  • 15.6" Samsung ATIV Book 8 Laptop
  • Intel Core i7 3635QM Quad core (4C/8T)
  • 16GB RAM
  • 960GB Crucial M500 SSD

caddy 0.10.6 default no cache

siege -b -c50 -t30s http://localhost:8888 -m caddy-c50-t30s

caddy 0.10.6 with http.cache enabled, proxying port 8889 to the 8888 backend

siege -b -c50 -t30s http://localhost:8889 -m caddy-http.cache-c50-t30s

centmin mod nginx 1.13.3 with worker_processes 2; default

siege -b -c50 -t30s http://localhost -m cmm-nginx-c50-t30s

centmin mod nginx 1.13.3 with worker_processes auto;

siege -b -c50 -t30s http://localhost -m cmm-nginx-auto-c50-t30s
head -n1 /usr/local/var/log/siege.log; tail -8 /usr/local/var/log/siege.log     
      Date & Time,  Trans,  Elap Time,  Data Trans,  Resp Time,  Trans Rate,  Throughput,  Concurrent,    OKAY,   Failed
**** caddy-c50-t30s ****
2017-08-01 08:51:48,  56611,      30.00,         283,       0.03,     1887.03,        9.43,       49.60,   56611,       0
**** cmm-nginx-c50-t30s ****
2017-08-01 08:52:18, 108095,      29.98,         656,       0.01,     3605.57,       21.88,       49.03,  108095,       0
**** cmm-nginx-auto-c50-t30s ****
2017-08-01 10:57:54, 100516,      29.80,         610,       0.01,     3373.02,       20.47,       49.11,  100516,       0
**** caddy-http.cache-c50-t30s ****
2017-08-01 11:37:11,  71534,      29.10,         358,       0.02,     2458.21,       12.30,       49.42,   71534,       0
curl -I http://localhost:80
HTTP/1.1 200 OK
Date: Tue, 01 Aug 2017 10:51:36 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 3801
Last-Modified: Sun, 30 Jul 2017 04:13:50 GMT
Connection: keep-alive
Vary: Accept-Encoding
ETag: "597d5cfe-ed9"
Server: nginx centminmod
X-Powered-By: centminmod
Accept-Ranges: bytes
curl -I http://localhost:8888
HTTP/1.1 200 OK
Accept-Ranges: bytes
Cache-Control: max-age=86400
Content-Length: 3801
Content-Type: text/html; charset=utf-8
Etag: "otvyf22xl"
Last-Modified: Sun, 30 Jul 2017 04:13:50 GMT
Server: Caddy
X-Powered-By: Caddy via CentminMod
Date: Tue, 01 Aug 2017 10:51:39 GMT
curl -I http://localhost:8889
HTTP/1.1 200 OK
Accept-Ranges: bytes
Cache-Control: max-age=86400
Content-Length: 3801
Content-Type: text/html; charset=utf-8
Date: Tue, 01 Aug 2017 11:34:46 GMT
Etag: "otvyf22xl"
Last-Modified: Sun, 30 Jul 2017 04:13:50 GMT
Server: Caddy
Server: Caddy
X-Cache-Status: miss
X-Powered-By: Caddy via CentminMod
curl -I http://localhost:8889
HTTP/1.1 200 OK
Accept-Ranges: bytes
Cache-Control: max-age=86400
Content-Length: 3801
Content-Type: text/html; charset=utf-8
Date: Tue, 01 Aug 2017 11:34:46 GMT
Etag: "otvyf22xl"
Last-Modified: Sun, 30 Jul 2017 04:13:50 GMT
Server: Caddy
Server: Caddy
X-Cache-Status: hit
X-Powered-By: Caddy via CentminMod
caddy -version
Caddy 0.10.6
caddy -plugins
Server types:
  net
  http

Caddyfile loaders:
  short
  flag
  default

Other plugins:
  http.authz
  http.awslambda
  http.basicauth
  http.bind
  http.browse
  http.cache
  http.cgi
  http.cors
  http.errors
  http.expires
  http.expvar
  http.ext
  http.fastcgi
  http.filemanager
  http.filter
  http.git
  http.gopkg
  http.gzip
  http.header
  http.hugo
  http.index
  http.internal
  http.ipfilter
  http.jwt
  http.limits
  http.log
  http.login
  http.mailout
  http.markdown
  http.mime
  http.minify
  http.nobots
  http.pprof
  http.prometheus
  http.proxy
  http.proxyprotocol
  http.push
  http.ratelimit
  http.realip
  http.reauth
  http.redir
  http.request_id
  http.restic
  http.rewrite
  http.root
  http.status
  http.templates
  http.timeouts
  http.upload
  http.webdav
  http.websocket
  net.host
  shutdown
  startup
  tls
  tls.dns.cloudflare
  tls.dns.digitalocean
  tls.dns.googlecloud
  tls.dns.linode
  tls.dns.namecheap
  tls.dns.ovh
  tls.dns.route53
  tls.dns.vultr
  tls.storage.file

nginx -V
nginx version: nginx/1.13.3
built by clang 3.4.2 (tags/RELEASE_34/dot2-final)
built with LibreSSL 2.5.5
TLS SNI support enabled
configure arguments: --with-ld-opt=‘-lrt -ljemalloc -Wl,-z,relro -Wl,-rpath,/usr/local/lib’ --with-cc-opt=‘-m64 -mtune=native -DTCP_FASTOPEN=23 -g -O3 -fstack-protector -fuse-ld=gold --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wno-sign-compare -Wno-string-plus-int -Wno-deprecated-declarations -Wno-unused-parameter -Wno-unused-const-variable -Wno-conditional-uninitialized -Wno-mismatched-tags -Wno-sometimes-uninitialized -Wno-parentheses-equality -Wno-tautological-compare -Wno-self-assign -Wno-deprecated-register -Wno-deprecated -Wno-invalid-source-encoding -Wno-pointer-sign -Wno-parentheses -Wno-enum-conversion -Wno-c++11-compat-deprecated-writable-strings -Wno-write-strings -gsplit-dwarf’ --sbin-path=/usr/local/sbin/nginx --conf-path=/usr/local/nginx/conf/nginx.conf --with-compat --with-http_stub_status_module --with-http_secure_link_module --with-libatomic --with-http_gzip_static_module --with-http_sub_module --with-http_addition_module --with-http_image_filter_module=dynamic --with-http_geoip_module --with-stream_geoip_module --with-stream_realip_module --with-stream_ssl_preread_module --with-threads --with-stream=dynamic --with-stream_ssl_module --with-http_realip_module --add-dynamic-module=…/ngx-fancyindex-0.4.0 --add-module=…/ngx_cache_purge-2.3 --add-module=…/ngx_devel_kit-0.3.0 --add-module=…/set-misc-nginx-module-0.31 --add-module=…/echo-nginx-module-0.60 --add-module=…/redis2-nginx-module-0.14 --add-module=…/ngx_http_redis-0.3.7 --add-module=…/memc-nginx-module-0.18 --add-module=…/srcache-nginx-module-0.31 --add-module=…/headers-more-nginx-module-0.32 --with-pcre=…/pcre-8.41 --with-pcre-jit --with-zlib=…/zlib-1.2.11 --with-http_ssl_module --with-http_v2_module --with-openssl=…/libressl-2.5.5

While I was developing the cache, I ran the same benchmarks, and I found nginx is faster because it uses the sendfile syscall, which avoids copying data to userland and saves a lot of time. Caddy is based on middlewares and cannot use it, because middlewares would no longer be able to intercept and modify responses.
Anyway, as soon as you need to modify a response for every request (when using HTTPS, for example, where the response is encrypted), sendfile becomes useless because the data has to be copied to userland anyway. A fairer comparison would disable sendfile in nginx; with that, the results should be closer. For raw HTTP nginx will probably always win, but if you use HTTPS (which is Caddy’s selling point) the difference shouldn’t be that large.

Another difference I found was worker_processes: increasing it to the number of cores gives nginx better results. Nginx is event based and each worker only uses one CPU, while Caddy uses Go’s goroutines, so a single process can use 100% of the available CPU.

Cheers. I also did much older HTTP/2 benchmarks with the h2load HTTP/2 load tester a while back at https://community.centminmod.com/threads/caddy-http-2-server-benchmarks.5170/, and again Nginx won with about 3x the performance, while Caddy used about 2.6x more CPU and 2.3x more memory in the h2load HTTP/2 tests.

I will revisit them with Caddy 0.10.6 and http.cache, as well as how Caddy scales as more HTTP headers are added (see: Any performance overhead as you add more headers under HTTP/2?).

Did you also think about using an in-memory cache? I would actually prefer that over static file caching because the latter could also be done by the application.

@eva2000 I didn’t expect that result, but it is completely logical that nginx is faster: it is written in C, it was designed with performance as a first concern, and it has had years of optimization. Caddy is still young and has a lot of room for optimization.

Maybe the results were because of the different stream ciphers used (in your benchmark nginx used ChaCha20 while Caddy used AES; ChaCha20 is supposed to be way faster), or TLS session tickets (enabled in nginx but not in Caddy, which might force a new TLS session to be created each time; I’m not sure if h2load uses them). As for the latest versions, they will probably help: Go got faster in recent releases, especially after 1.7 with the new compiler optimizations.

@kekub I have thought about an in-memory store. I wrote the storage layer independently, so new implementations can be plugged in. I have even written one that uses mmap, but it had the problem that with a lot of cached pages it could fill up your memory and start swapping the cached content. So I prefer not to release it until there is a setting to limit the cache size (using an LRU algorithm or something similar).

Indeed, nginx has a few years’ head start as well :slight_smile:

Indeed, I will retest when I have more free time. Looking forward to seeing what Go 1.8+ brings to Caddy :slight_smile:

Already seeing if this can help with an Apache/AJP proxy for a backend app. Apache caching was helpful, but slow. This already seems faster in initial tests. Thank you!

Edit: regarding what @kekub was asking about: when I was looking into the setup I’m working on now, the Caching Guide - Apache HTTP Server Version 2.4 was somewhat helpful. I guess they use three different mechanisms? One of them appeared to be specifically in-memory caching for TLS/SSL and credentials.

I think you can still use this: the key is to set it to cache images or similar. It matches by MIME type, which I wish the http.gzip module did as well, because Apache supports doing DEFLATE (gzip) based on type.

Working example for me (thus far)…

cache {
    # Default cache lifetime: 1440 minutes = 1 day (see the note below)
    default_max_age 1440m
    # Only cache responses whose Content-Type matches these patterns
    match_header Content-Type image/* text/css
    # Where cached responses are stored on disk
    path /tmp/caddy
    # Add a header reporting hit/miss for each response
    status_header X-Cache-Status
}
Have you seen any real-use progress with this configuration? What can I expect?

This is being used in concert with http.proxy to secure an older backend app. I believe my boss’s words were “f-ing hit it out of the park”. :grinning:

Edit: I should say I’m still experimenting with the default_max_age; 10080m = 7 days; 1440m = 1 day.

Excellent!