Handle .html files on Spaces after rewrite

1. Caddy version:

Caddy version v2.6.2

2. How I installed, and run Caddy:

Downloaded the binary from the v2.6.2 release of caddyserver/caddy on GitHub.

a. System environment:

Ubuntu 22.10, systemd

b. Command:

/var/lib/caddy/caddy run --environ --config /var/lib/caddy/Caddyfile

c. Service/unit/compose file:

[Unit]
Description=Caddy
Documentation=https://caddyserver.com/docs/
After=network.target network-online.target
Requires=network-online.target

[Service]
Type=notify
User=caddy
Group=caddy
ExecStart=/var/lib/caddy/caddy run --environ --config /var/lib/caddy/Caddyfile
ExecReload=/var/lib/caddy/caddy reload --config /var/lib/caddy/Caddyfile --force
TimeoutStopSec=5s
LimitNOFILE=1048576
LimitNPROC=512
PrivateDevices=yes
PrivateTmp=true
ProtectSystem=full
AmbientCapabilities=CAP_NET_BIND_SERVICE

[Install]
WantedBy=multi-user.target

d. My complete Caddy config:

{
	on_demand_tls {
		ask https://google.com
		interval 2m
		burst 5
	}
}

https:// {
	tls {
		on_demand
	}

	encode gzip zstd

	rewrite * /pages{uri}

	try_files /pages{path} /pages{path}/ /pages{path}/index.html /pages{path}.html

	file_server

	reverse_proxy * https://build.nyc3.digitaloceanspaces.com {
		header_up Host build.nyc3.digitaloceanspaces.com
	}

	log {
		output file /var/lib/caddy/web_access.log
	}
}

3. The problem I’m having:

I’m trying to serve HTML files from a specific folder on object storage (build.nyc3.digitaloceanspaces.com/pages) and eventually there will be other folders (like build.nyc3.digitaloceanspaces.com/somethingelse) for different sites with different domains pointed at them. I can navigate directly to a file by adding index.html (like /contact/index.html) but the page won’t load if I do a hard refresh without targeting the file specifically (e.g. /contact doesn’t work). Sometimes the files could actually be in a different format like /about.html instead of /about/index.html. It seems like try_files would help resolve these different filename possibilities but I can’t seem to get it to work while still doing a rewrite to remove the folder name so visitors see theirdomain.com/about instead of theirdomain.com/pages/about.

4. Error messages and/or full log output:

No specific error messages. DigitalOcean Spaces returns <Error><Code>NoSuchKey</Code></Error> in the browser if I don’t navigate to the file specifically.

5. What I already tried:

handle_path to strip prefix:

rewrite subfolder: https://www.reddit.com/r/selfhosted/comments/o7e580/caddy_reverse_proxy_with_directory_at_the/

Expanded try_files: Try_files Rewrite Request exclude path help plz

6. Links to relevant resources:

Don’t point ask at https://google.com. You must configure this to be an endpoint you control, which can decide whether the domain is one you want to allow. Otherwise, you’re at risk of DDoS. An attacker can force your server to continually issue certs until you run out of storage space.
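As a rough sketch of what that could look like, the global options below send the ask check to a local endpoint instead. The address localhost:5555, the path /check, and the domain customer-site.example are placeholders that are not part of the original config. The second site block is only a toy allowlist written in Caddy itself; it relies on Caddy calling the ask URL with a domain query parameter and treating a 200 response as approval.

```caddyfile
{
	on_demand_tls {
		# Hypothetical endpoint under your own control (placeholder address)
		ask http://localhost:5555/check
		interval 2m
		burst 5
	}
}

# Toy allowlist: approve one known domain, refuse everything else
http://localhost:5555 {
	@allowed query domain=customer-site.example
	handle @allowed {
		respond 200
	}
	handle {
		respond 403
	}
}
```

In practice the allowlist would more likely be a small service that looks the domain up somewhere, such as the .domains file idea mentioned later in this thread.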

try_files depends on having files on disk that Caddy can check to see if they exist. You don’t have a configured root, so this won’t work the way you expect.

It’s not possible to use try_files-style rewrites with a proxy upstream, because there’s no way for Caddy to efficiently check if the file exists.
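For contrast, here is a minimal sketch of the case where try_files does work, assuming the site files live on the local disk. The domain example.com and the directory /srv/pages are made up for illustration.

```caddyfile
example.com {
	# Hypothetical local directory holding the built site
	root * /srv/pages

	# Try the path as-is, then as a directory index, then with .html appended
	try_files {path} {path}/index.html {path}.html

	file_server
}
```

With a root configured, Caddy can check each candidate on disk, so /contact can resolve to /contact/index.html and /about to /about.html.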

Directives are sorted according to a predetermined directive order, which is listed on the Caddyfile Directives page of the Caddy documentation.

This means that reverse_proxy will always run before file_server, so your file_server here will never do anything.

If you do want to serve static files, you need to use request matchers to tell Caddy when to proxy and when to serve files, based on a condition on the incoming request.
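A minimal sketch of that split, assuming some files live on the local disk under a made-up root of /srv/pages and everything else goes to the bucket. The matcher name @onDisk and the domain example.com are placeholders.

```caddyfile
example.com {
	# Hypothetical local directory
	root * /srv/pages

	# Matches only when the requested path exists as a file under root
	@onDisk file
	handle @onDisk {
		file_server
	}

	# Everything else is sent upstream
	handle {
		reverse_proxy https://build.nyc3.digitaloceanspaces.com {
			header_up Host build.nyc3.digitaloceanspaces.com
		}
	}
}
```

The handle blocks are mutually exclusive, so a request is either served from disk or proxied, never both.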

Hi @francislavoie, thanks for the reply!

For sure, this is just a work in progress (I wanted to copy the whole config to avoid redacting important info). Ultimately I want this to be an endpoint that will check for a .domains file in the subfolder for the given project (pages in this case) and return a 200 status if the requested domain is listed.

So there’s no way to resolve different paths like /contact/index.html or /about.html that are stored on object storage like S3 or Spaces? I figured it would be pretty common to serve static websites from these services.

Thanks for the directives order reference, I’d been rearranging these to no effect. I had read about wrapping these in route to enforce order, but I couldn’t quite get it right.

I guess this is what I’m struggling with: I want to serve files from the proxy for every request. For now, I’m just hardcoding all requests to the pages subfolder where a static website sits. Eventually, I’d require additional context, like a CNAME or TXT record, for the custom domain to point requests to the correct subfolder for their website (could be something other than pages). All the website content would be static files that live in the bucket in object storage (no files local to Caddy), but I can’t seem to resolve the static file paths for the reverse_proxy. Is it possible to do what I’m attempting? Thank you for the help!

No, because the file matcher uses OS system calls to check for the existence of files on disk.

Later, it will be possible to configure Caddy to use a virtual filesystem instead, which would make it possible to use S3 with the file matcher and file_server. There is the sagikazarmark/caddy-fs-s3 plugin on GitHub (a Caddy FS module for AWS S3), which you can use right now with file_server, but it doesn’t work with the file matcher yet. That’s still a TODO.

But it’s not possible with reverse_proxy, because it can only send requests upstream and get a response back, it has no way to check if a file exists ahead of time on the upstream. You could potentially hack something together with handle_response in the proxy options to retry with a different path, but that’s really complicated and tedious.
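For illustration only, a handle_response retry might look roughly like the untested sketch below. It assumes the bucket answers a missing key with a 404 status, and the matcher name @missing is a placeholder. It retries once with /index.html appended; covering the .html variant too would need another nested level, and a request path that already ends in / would produce a double slash, which is part of why this approach gets tedious.

```caddyfile
rewrite * /pages{uri}

reverse_proxy https://build.nyc3.digitaloceanspaces.com {
	header_up Host build.nyc3.digitaloceanspaces.com

	# Assumption: a missing object comes back as a 404 from the bucket
	@missing status 404
	handle_response @missing {
		# Retry the same path as a directory index
		rewrite * {path}/index.html
		reverse_proxy https://build.nyc3.digitaloceanspaces.com {
			header_up Host build.nyc3.digitaloceanspaces.com
		}
	}
}
```

Every miss costs an extra round trip to the upstream, since the only way to learn that a key does not exist is to request it.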

Very cool. Is this the issue to watch for progress: “Define a virtual FS to use for all subsequent routes” (caddyserver/caddy issue #5057)?

I downloaded the newest version of Caddy with caddy-fs-s3 enabled. I think you’re saying that try_files matching is still a TODO for that module, but I’m unsure whether file_server allows visiting paths without explicitly adding /index.html to the end. It doesn’t seem to, but maybe I’ve misconfigured something:

rewrite * /pages{uri}
reverse_proxy * https://build.nyc3.digitaloceanspaces.com {
        header_up Host build.nyc3.digitaloceanspaces.com
}
file_server {
        fs s3 {
                bucket build
                region nyc3
                endpoint https://build.nyc3.digitaloceanspaces.com
        }
}

It is cool that caddy-fs-s3 lets you authenticate to private buckets: “Auth support?” (sagikazarmark/caddy-fs-s3 issue #7).

Thanks again for the help!

Yep. I forgot I opened that issue, but that’s the one.

I’m not sure. I’ve never tried using that plugin (I don’t use any AWS services).

If the request is to a directory (i.e. ends with /) I think it should look for index.html, but it won’t otherwise. A try_files rewrite would be needed for that, to test for that file existing on disk before doing the rewrite. But yeah, no virtual-fs support for try_files (and the file matcher which it uses).
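One thing that may be worth trying, given the directive order mentioned earlier (reverse_proxy runs before file_server), is dropping reverse_proxy so that file_server actually handles the request. This is an untested sketch; the fs s3 options are copied unchanged from the snippet above and have not been verified against the plugin.

```caddyfile
rewrite * /pages{uri}

# No reverse_proxy here, so the request reaches file_server
file_server {
	# Options copied as-is from the post above (unverified)
	fs s3 {
		bucket build
		region nyc3
		endpoint https://build.nyc3.digitaloceanspaces.com
	}
}
```

If the directory behavior described above holds for the plugin, /contact/ might then serve /pages/contact/index.html, while /contact without the trailing slash would still need the try_files support that does not exist yet.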