Suggestions for simplifying my Caddyfile

asb · September 11, 2022, 8:18am

1. Output of `caddy version`:

v2.5.2

2. How I run Caddy:

systemctl start caddy

a. System environment:

Arch Linux

b. Command:

systemctl start caddy

c. Service/unit/compose file:

https://github.com/archlinux/svntogit-community/blob/377137893476949e35f0a2f66df52b01acc97b32/trunk/caddy.service

d. My complete Caddy config:

{
	servers {
		protocol {
			experimental_http3
		}
	}
	email asb@asbradbury.org
}

(muxup_file_server) {
	file_server {
		index ""
		precompressed br
		disable_canonical_uris
	}
}
www.muxup.com {
	redir https://muxup.com{uri} 308
	header Cache-Control "max-age=2592000, stale-while-revalidate=2592000"
}
muxup.com {
	root * /var/www/muxup.com/htdocs
	encode gzip
	log {
		output file /var/log/caddy/muxup.com.access.log
	}
	header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"

	vars short_cache_control "max-age=3600"
	vars long_cache_control "max-age=2592000, stale-while-revalidate=2592000"

	@method_isnt_GET_or_HEAD not method GET HEAD
	@path_is_suffixed_with_html_or_br path *.html *.html/ *.br *.br/
	@path_or_html_suffixed_path_exists file {path}.html {path}
	@html_suffixed_path_exists file {path}.html
	@path_or_html_suffixed_path_doesnt_exist not file {path}.html {path}
	@path_is_root path /
	@path_has_trailing_slash path_regexp ^/(.*)/$

	handle @method_isnt_GET_or_HEAD {
		error 405
	}
	handle @path_is_suffixed_with_html_or_br {
		error 404
	}
	handle @path_has_trailing_slash {
		route {
			uri strip_suffix /
			header @path_or_html_suffixed_path_exists Cache-Control "{vars.long_cache_control}"
			redir @path_or_html_suffixed_path_exists {path} 308
			error @path_or_html_suffixed_path_doesnt_exist 404
		}
	}
	handle @path_is_root {
		rewrite index.html
		header Cache-Control "{vars.short_cache_control}"
		import muxup_file_server
	}
	handle @html_suffixed_path_exists {
		rewrite {path}.html
		header Cache-Control "{vars.short_cache_control}"
		import muxup_file_server
	}
	handle * {
		header Cache-Control "{vars.long_cache_control}"
		import muxup_file_server
	}
	handle_errors {
		header -Cache-Control
		respond "{err.status_code} {err.status_text}"
	}
}

3. The problem I’m having:

I’ve recently started a new blog and serve it using Caddy. I’ve documented my Caddy setup here. As noted, I had some fairly specific goals in terms of things like redirect behaviour. As the config ended up being fairly involved, I wondered if anyone saw opportunities to simplify it (while still meeting my stated requirements). See also my test script.

I’m pasting the list of goals from the blog post here for convenience:

Enable new and shiny things like HTTP3 and serving Brotli compressed content.
Set appropriate Cache-Control headers in order to avoid unnecessary re-fetching content. Set shorter lifetimes for served .html and 308 redirects vs other assets. Leave 404 responses with no Cache-Control header.
Avoid serving the same content at multiple URLs (unless explicitly asked for) and don’t expose the internal filenames of content served via a different canonical URL. Also, prefer URLs without a trailing slash, but ensure not to issue a redirect if the target file doesn’t exist. This means (for example):
- muxup.com/about/ should redirect to muxup.com/about
- muxup.com/2022q3////muxup-implementation-notes should 404 or redirect.
- muxup.com/about/./././ should 404 or redirect.
- muxup.com/index.html should 404.
- muxup.com/index.html.br (referring to the precompressed brotli file) should 404.
- muxup.com/non-existing-path/ should 404.
- If there is a directory foo and a foo.html at the same level, serve foo.html for GET /foo (and redirect to it for GET /foo/).
Never try to serve */index.html or similar (except in the special case of GET /).
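
One way to pin these goals down is as a pure function from request path to expected response. The Go sketch below is a hypothetical model, not Caddy code: `decide` and the `files` set are names invented for illustration, and it assumes a 404 (rather than a redirect) for the `//` and `/./` cases, which the list above leaves open.

```go
package main

import (
	"fmt"
	"strings"
)

// decide models the stated goals for a GET request. It is a hypothetical
// helper for illustration, not part of Caddy. files holds the regular files
// under the site root (directories are deliberately not listed).
// It returns the status code and, for redirects, the target path.
func decide(path string, files map[string]bool) (int, string) {
	// Assumption: repeated slashes and "/./" segments get a 404
	// (the goals allow either a 404 or a redirect here).
	if strings.Contains(path, "//") || strings.Contains(path, "/./") {
		return 404, ""
	}
	// The root is the only place an index file is served.
	if path == "/" {
		if files["/index.html"] {
			return 200, ""
		}
		return 404, ""
	}
	trimmed := strings.TrimSuffix(path, "/")
	// Never expose internal filenames such as /about.html or /about.html.br.
	if strings.HasSuffix(trimmed, ".html") || strings.HasSuffix(trimmed, ".br") {
		return 404, ""
	}
	// foo.html takes priority over a directory named foo.
	exists := files[trimmed+".html"] || files[trimmed]
	switch {
	case !exists:
		return 404, ""
	case trimmed != path:
		// Trailing slash: redirect only because the target exists.
		return 308, trimmed
	default:
		return 200, ""
	}
}

func main() {
	files := map[string]bool{"/index.html": true, "/about.html": true, "/style.css": true}
	for _, p := range []string{"/", "/about", "/about/", "/index.html", "/non-existing-path/"} {
		status, target := decide(p, files)
		fmt.Println(p, status, target)
	}
}
```

A table of paths and expected results like this could double as the input to a test script run against the real server.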

I believe the rules regarding // and /./ are unimplementable until v2.6.0 (which includes #4948). EDIT: I also can’t see a straightforward way of modifying the above Caddyfile to give 404s for www.muxup.com/non-existent requests but redirects if the target exists.

4. Error messages and/or full log output:

N/A.

5. What I already tried:

See Caddyfile above.

6. Links to relevant resources:

N/A.

asb · September 11, 2022, 4:13pm

I’ve also tried to add logic to reject any requests that don’t use the HTTP methods GET or HEAD and am a bit baffled as the results seem to indicate that the handle directives aren’t being executed in the order I expect (as documented “the first matching handle block will be evaluated”).

I’m getting the following:

$ curl -X POST -s -o /dev/null -w'%{http_code}\n' https://muxup.com
200
$ curl -X POST -s -o /dev/null -w'%{http_code}\n' https://muxup.com/index.html
404
$ curl -X POST -s -o /dev/null -w'%{http_code}\n' https://muxup.com/about
405

I would have expected 405 for all three URLs, since the first matching handle block is surely handle @method_isnt_GET_or_HEAD?

matt · September 12, 2022, 3:44am

These are really fascinating requirements!

And your test script is very similar in concept to what I want to someday add to Caddy natively: a way to run automated tests before applying new configurations, to make sure your site will work the way you expect.

Use Caddy 2.6 beta; HTTP/3 is on by default. The brotli content is already supported, as you discovered.

Keep in mind this will break relative links on index files, as the browser won’t know it has actually accessed a directory. There’s a good reason that index files / directories are served with a trailing slash!

Anyway, I am not sure exactly what your question is; you’ve stated your goals, but is there specifically something not working? (It seems you’re wanting suggestions to simplify your Caddyfile, but I’m not sure I have any yet.)

I’m also not super confident that 2.6 will do everything you are hoping for, at least with the standard path matcher. All of what you’re wanting can definitely be done with a custom matcher or handler (or combination of both), but right now I think only some of them can be accomplished natively, even with 2.6. You should download the beta (grab the latest from master, actually) and try!

Looks like you’ve got that working?

In 2.6, try a path matcher with *//* – let me know if that doesn’t work.

Mmm, not sure about this one; I don’t believe I’ve treated . components like repeated slashes (//) where, if you specify it in your pattern, it doesn’t normalize that part of the URI. (The complexity starts getting really high.) Could maybe open an issue to request this feature.

Looks like you have these working?

I often do this with try_files {path}.html, at least for the first part. The redir is extra.

I guess what you have, index "" is the way to do that. Never really heard of that; most people just don’t have index.html files if they don’t want them to be served.

Not entirely sure I understand this part: if it’s non-existent, how could it ever exist, and thus redirect?

I’ll have to try reproducing this when I get into the office tomorrow.

asb · September 13, 2022, 8:21am

Supporting this kind of testing natively would be a huge help I think - great idea!

matt:

asb:

Also, prefer URLs without a trailing slash

Keep in mind this will break relative links on index files, as the browser won’t know it has actually accessed a directory. There’s a good reason that index files / directories are served with a trailing slash!

Anyway, I am not sure exactly what your question is; you’ve stated your goals, but is there specifically something not working? (It seems you’re wanting suggestions to simplify your Caddyfile, but I’m not sure I have any yet.)

I’m also not super confident that 2.6 will do everything you are hoping for, at least with the standard path matcher. All of what you’re wanting can definitely be done with a custom matcher or handler (or combination of both), but right now I think only some of them can be accomplished natively, even with 2.6. You should download the beta (grab the latest from master, actually) and try!

I wasn’t very clear on this one and it’s perhaps more a requirement on the site content than the server setup. I’d started out by serving foo/index.html as /foo but for the reasons you suggested moved to generating such files as /foo.html. I guess the point is that outside of the root /index.html I’m never intending to serve an index.html or expose a /foo/ URL.

Yes, sorry if I wasn’t clear about what is/isn’t working.

Did you consider exposing a non-normalised form of the request path as a variable (the original non-rewritten one would work for my case, not sure if it would work for everyone)? Then I could just use a CEL expression on it.
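
The kind of check I have in mind, written in Go rather than CEL purely for illustration, might look like the sketch below. `hasNonCanonicalSegments` is a hypothetical helper, and it assumes the raw, un-normalised request path is available as a string, which is exactly the part that would need exposing.

```go
package main

import (
	"fmt"
	"strings"
)

// hasNonCanonicalSegments reports whether a raw (un-normalised) request path
// contains repeated slashes or "." / ".." segments. Hypothetical helper for
// illustration; it is not a Caddy API.
func hasNonCanonicalSegments(rawPath string) bool {
	segs := strings.Split(rawPath, "/")
	for i, seg := range segs {
		if seg == "." || seg == ".." {
			return true
		}
		// The first segment is empty because of the leading slash, and the
		// last one is empty when there is a single trailing slash. An empty
		// segment anywhere else means repeated slashes.
		if seg == "" && i != 0 && i != len(segs)-1 {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{"/about", "/about/", "/2022q3////muxup-implementation-notes", "/about/./././"} {
		fmt.Println(p, hasNonCanonicalSegments(p))
	}
}
```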

Yep.

It may be a bit overkill, but I was imagining www.muxup.com/non-existent just gives a 404 while www.muxup.com/about (which does exist) redirects to muxup.com/about and www.muxup.com/about/ redirects directly to muxup.com/about. Though I’m now thinking that always redirecting from non-www may be preferable, even if it’s just to a 404.

Thanks! I wasn’t really sure whether to post this as a help thread or a showcase. As you say, I’ve got most of what I wanted working. The reasons I wanted to post here were:

- The behaviour I noted around the POST handling made me wonder if I’d misunderstood the docs about the semantics of handle (or if it’s possible I’m running into a bug here).
- Although what I’ve got works, I wanted to check if it’s the “right” or normal way to go about it, to the extent there is such a thing.

matt · September 13, 2022, 5:43pm

If you run caddy adapt you can see what is going on under the hood. I’m still assessing whether that behaviour is intended or not.

matt · September 13, 2022, 6:14pm

So, the reason for the different ordering comes from sortRoutes():

github.com

caddyserver/caddy/blob/20d487be573424e7647b5a157754f6e284554e23/caddyconfig/httpcaddyfile/directives.go#L389-L450


      
	sort.SliceStable(routes, func(i, j int) bool {
		// if the directives are different, just use the established directive order
		iDir, jDir := routes[i].directive, routes[j].directive
		if iDir != jDir {
			return dirPositions[iDir] < dirPositions[jDir]
		}

		// directives are the same; sub-sort by path matcher length if there's
		// only one matcher set and one path (this is a very common case and
		// usually -- but not always -- helpful/expected, oh well; user can
		// always take manual control of order using handler or route blocks)
		iRoute, ok := routes[i].Value.(caddyhttp.Route)
		if !ok {
			return false
		}
		jRoute, ok := routes[j].Value.(caddyhttp.Route)
		if !ok {
			return false
		}

This file has been truncated.

It’s important that we sort directives properly by their matchers, otherwise some directives that are mutually-exclusive would have no effect ever. Maybe there’s a bug here, maybe not. Still determining.

matt · September 13, 2022, 6:34pm

Your situation is complex. This behavior is partially expected and partially a bug.

Changing these lines:

github.com

caddyserver/caddy/blob/20d487be573424e7647b5a157754f6e284554e23/caddyconfig/httpcaddyfile/directives.go#L421-L426


      
		if len(iPM) > 0 {
			iPathLen = len(iPM[0])
		}
		if len(jPM) > 0 {
			jPathLen = len(jPM[0])
		}

-		if len(iPM) > 0 {
+		if len(iPM) == 1 {
			iPathLen = len(iPM[0])
		}
-		if len(jPM) > 0 {
+		if len(jPM) == 1 {
			jPathLen = len(jPM[0])
		}

This change produces results closer to what you’d expect, but not completely.
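
To see what the changed condition does to the matchers in your config, here is a small Go sketch. It is a simplified model and not the actual Caddy code: `pathLen` is a made-up helper, and it assumes iPM/jPM hold the list of paths from a block's path matcher.

```go
package main

import "fmt"

// pathLen mirrors the length calculation in the snippet above for one
// handle block. paths stands in for iPM/jPM (assumed here to be the list
// of paths in the block's path matcher). onlySingle selects the patched
// condition (len == 1) instead of the original one (len > 0).
func pathLen(paths []string, onlySingle bool) int {
	if onlySingle {
		if len(paths) == 1 {
			return len(paths[0])
		}
		return 0
	}
	if len(paths) > 0 {
		return len(paths[0])
	}
	return 0
}

func main() {
	matchers := []struct {
		name  string
		paths []string
	}{
		{"@method_isnt_GET_or_HEAD", nil}, // no path matcher at all
		{"@path_is_suffixed_with_html_or_br", []string{"*.html", "*.html/", "*.br", "*.br/"}},
		{"@path_is_root", []string{"/"}},
	}
	for _, m := range matchers {
		fmt.Printf("%s: before=%d after=%d\n", m.name, pathLen(m.paths, false), pathLen(m.paths, true))
	}
}
```

In this model the four-path matcher no longer counts as having a path length after the change, while the single-path `/` matcher still does.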

One of your handle blocks (@path_is_root) matches exactly the path /, and so it is the first subroute in the adapted JSON, because a directive with any path matcher is preferred over one that doesn’t have a path matcher. Maybe this is counter-intuitive, and it could be a bug. I dunno yet. The reason it moves your @path_is_root handle block to be the first route is because its path matcher / (length 1) is longer than the other blocks without a path matcher (length 0).

For routes with just 1 path to match, we sort by longest path first, otherwise matching /foo* first would shadow matching /foo/bar which is more specific; i.e. /foo/bar matcher would never have a chance to match. So in the case of / versus no matcher, that’s a bit unexpected.
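
A stripped-down Go sketch of that longest-path-first rule is below. The `block` type and `sortLongestPathFirst` are invented for illustration and leave out everything else the real sortRoutes() does; an empty path stands for a block with no path matcher.

```go
package main

import (
	"fmt"
	"sort"
)

// block is a hypothetical stand-in for a handle block with at most one path.
type block struct {
	name string
	path string // single path matcher; "" means no path matcher
}

// sortLongestPathFirst stably sorts blocks so that longer paths come first.
// Blocks with equal path lengths keep their original relative order.
func sortLongestPathFirst(blocks []block) {
	sort.SliceStable(blocks, func(i, j int) bool {
		return len(blocks[i].path) > len(blocks[j].path)
	})
}

// names returns the block names in their current order.
func names(blocks []block) []string {
	out := make([]string, len(blocks))
	for i, b := range blocks {
		out[i] = b.name
	}
	return out
}

func main() {
	// The helpful case: the more specific path moves ahead of the prefix.
	a := []block{{"prefix", "/foo*"}, {"specific", "/foo/bar"}}
	sortLongestPathFirst(a)
	fmt.Println(names(a))

	// The surprising case: "/" (length 1) moves ahead of blocks with no
	// path matcher (length 0), even though they were written first.
	b := []block{{"method check", ""}, {"root", "/"}, {"catch-all", ""}}
	sortLongestPathFirst(b)
	fmt.Println(names(b))
}
```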

I’m trying to decide if we should simply not reorder the handle blocks at all, or if we only sort blocks with path matchers.

matt · September 13, 2022, 7:44pm

I opened an issue here: Route ordering is weird · Issue #5037 · caddyserver/caddy · GitHub

And fixed the problem here: httpcaddyfile: Fix sorting of repeated directives · caddyserver/caddy@754fe4f · GitHub

The sorting works as expected for me now. You should try it and let me know! (Again, use caddy adapt to see the difference.)

system · October 11, 2022, 8:19am

This topic was automatically closed after 30 days. New replies are no longer allowed.