Noob reverse proxy questions


(A CH G) #1

Hello,

First and foremost, apologies if this question has been answered already, but I can’t seem to find it anywhere.

Currently I have a setup at home as follows:

Router -> KVM -> VM (with Nextcloud/Collabora), and both are working perfectly. However, I’ve realised that I want to run more web apps facing the internet, and I’m stuck: the more I read about it, the more confused I get. That brought me to reverse proxies and this forum.

I have read the http.proxy entry in the wiki but I’m still lost. Let’s see if, once I explain my setup, someone can point me in the right direction. Please bear in mind I’m pretty new to web servers. My idea is as below:

Router -> KVM (future proxy) -> VM1 (nextcloud/collabora), VM2, Docker, etc…

I’d rather not start everything from scratch if possible. I now have Caddy up and running on the KVM host, and I’m ready to create a caddyfile.conf there with:

nexclout.example.com{
proxy /var/www/html/nextcloud 192.168.0.42:80 {
transparent
websocket
}
}

And my current nextcloud caddyfile looks like this:

nextcloud.example.com {

root   /var/www/html/nextcloud

fastcgi / 127.0.0.1:9000 php {
        env PATH /bin
}

header / {
         Referrer-Policy                   no-referrer
         Strict-Transport-Security         "max-age=31536000;"
}

# checks for images
rewrite {
        ext .svg .gif .png .html .ttf .woff .ico .jpg .jpeg
        r ^/index.php/(.+)$
        to /{1} /index.php?{1}
}

rewrite {
        r ^/index.php/.*$
        to /index.php?{query}
}

# client support (e.g. os x calendar / contacts)
redir /.well-known/carddav /remote.php/carddav 301
redir /.well-known/caldav /remote.php/caldav 301

# remove trailing / as it causes errors with php-fpm
rewrite {
        r ^/remote.php/(webdav|caldav|carddav|dav)(\/?)(\/?)$
        to /remote.php/{1}
}

rewrite {
        r ^/remote.php/(webdav|caldav|carddav|dav)/(.+?)(/?)(/?)$
        to /remote.php/{1}/{2}
}

rewrite {
        r ^/public.php/(dav|webdav|caldav|carddav)(\/?)(\/?)$
        to /public.php/{1}
}

rewrite {
        r ^/public.php/(dav|webdav|caldav|carddav)/(.+)(\/?)(\/?)$
        to /public.php/{1}/{2}
}

# .htaccess / data / config / ... shouldn't be accessible from outside
status 403 {
        /.htaccess
        /data
        /config
        /db_structure
        /.xml
        /README
}

}

Do I have to change anything in the current caddyfile? Do I need both Caddy servers running and both caddyfiles configured? Should they be configured identically, or is the proxy just forwarding to the Caddy in the VM, with that server doing all the work? Is it necessary to add a caddyfile for Collabora as well?

Many thanks and apologies for the stupid questions, if any!

Regards,
Arehandoro.


(A CH G) #2

Ok, so… that didn’t work xD Let’s see if I come up with the solution haha.


(Matthew Fay) #3

Hi @Arehandoro, welcome to the Caddy community!

Firstly, by way of a bit of an explanation of what’s going on: conceptually, a proxy acts as both a server (when you connect to it) and a client (when it connects to the upstream server).

You browse to the external site (nextclout.example.com (was this typo from your configuration intentional?)), and connect to the external Caddy. You have the external Caddy configured as a transparent proxy, so it puts your request on a very short wait while it goes to the upstream server (192.168.0.42:80). It says, “I’m looking for nextclout.example.com,” and whatever the internal Caddy serves in response, the external Caddy passes back to you, the client, as though it had produced it itself.

The internal web server doesn’t have to be a Caddy. It could be any HTTP-speaking server. But the upstream server does need to be configured to serve what you want the end client (you, the person browsing to your site) to receive.


Now, I can see a few potential issues:

  1. External Caddy configured to serve nextclout.example.com, but internal Caddy configured for nextcloud.example.com. Transparent proxies request the same hostname the end client requested, so your internal Caddy won’t know how to serve the nextclout subdomain and you’d be getting “404 Site not served on this interface” errors.

  2. The base path for your proxy is /var/www/html/nextcloud. That means that only requests that start with this path (i.e. https://nextclout.example.com/var/www/html/nextcloud in your browser’s URL bar) will be proxied; everything else falls through to the static file server. You probably want to change this to just /, so that any path requested on that subdomain is proxied.

  3. Internal Caddy configured for Automatic HTTPS. This is somewhat of a can of worms - the first thing the internal Caddy will do is try to retrieve an HTTPS certificate. To do this, it will try to complete an ACME challenge - either HTTP or TLS-ALPN - which will fail because the internal Caddy isn’t actually answering the challenge, the external Caddy is. Caddy has native clustering capabilities you can use to solve this fairly simply, if you want; sharing Caddy’s certificate storage folders between the host and the VM will sync them right up, giving them the ability to use each other’s certificates and solve ACME challenges for each other.

  4. Following on from above, if you clear that hurdle, the internal Caddy will then set up a listener on port 80 (HTTP) that issues permanent redirects to port 443 (HTTPS). This works for browsers, which will remember these redirects, but not for the proxy, which you have explicitly configured to connect to port 80 every time and faithfully return the result. This means your clients are gonna get a redirect loop; you request https://nextclout.example.com, external Caddy asks for nextclout.example.com from the internal Caddy on port 80, the internal Caddy responds with a redirect to HTTPS, that response goes back to the client, who connected via HTTPS originally - so they try again… ad nauseam. The fix is to change your upstream server from [ip]:80 to https://[ip] or [ip]:443. I prefer to use the scheme instead of the port; I find it more obvious to someone reading the config later, since the scheme implies the port more strongly than the port implies the scheme.
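
As a rough sketch of points 2 and 4 together, the external Caddyfile might end up looking something like the following (Caddy v1 syntax, using the hostname and IP from this thread). The commented-out insecure_skip_verify line is an assumption on my part: it may be needed if the external Caddy cannot verify the internal certificate when connecting by IP, and it is not something you should enable without understanding what it turns off.

```
# External (host) Caddyfile - a sketch, not a tested config
nextcloud.example.com {
    # Base path is / so every request on this hostname is proxied
    proxy / https://192.168.0.42 {
        transparent
        websocket
        # Assumption: only if upstream certificate verification fails
        # insecure_skip_verify
    }
}
```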

You can alternately avoid 3 and 4 entirely by turning off Automatic HTTPS; replace the site label nextcloud.example.com with http://nextcloud.example.com to tell Caddy to serve it over regular HTTP only.
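
A minimal sketch of that HTTP-only alternative, assuming the hostname typo is fixed and the upstream stays at 192.168.0.42:80 as in the original post. The internal Caddyfile is abbreviated here; the comment stands in for the existing directives, which would stay as they are.

```
# External (host) Caddyfile: terminates HTTPS, proxies everything over HTTP
nextcloud.example.com {
    proxy / 192.168.0.42:80 {
        transparent
        websocket
    }
}
```

```
# Internal (VM) Caddyfile: the http:// scheme turns off Automatic HTTPS
http://nextcloud.example.com {
    root /var/www/html/nextcloud

    fastcgi / 127.0.0.1:9000 php {
        env PATH /bin
    }

    # ... remaining header, rewrite, redir and status directives unchanged ...
}
```

With this arrangement only the external Caddy requests certificates, so neither the ACME challenge problem nor the redirect loop should come up.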


(A CH G) #4

Hi Matthew,

Thanks for your well-detailed explanation. I spent most of the evening trying this, with subtle changes that I’ll try to reproduce now, all of them resulting in error 521. I think I managed to at least learn a few things in the process too!

  1. Apologies for the typo, I was typing rather than copying the URL. It is correct in both the proxy and the client, so we can skip this point :slight_smile:

  2. You are right here, this is one of the things I noticed last night. I first thought I had to put the root where the web app is located, like in the client, which, as you mention, obviously doesn’t work.

  3. You got me lost there, I think. Is it as simple as doing “rclone sync /VM/path/to/certificate storage /Host/path/to/certificate/storage”? (I can’t remember offhand if that is the correct syntax, it’s just an example.)

  4. Then I would need to redirect from the proxy to a specific port on the client (let’s say 2018) and then set up the caddyfile in the client for Nextcloud to reflect that port, is that right? And open that port in the client too, I imagine.*

*I did this last night: I configured the proxy to send to port 2018 and the client to listen on that same port, but I still got error 521.

Alternatively, rather than doing this, do you mean just putting http://nextcloud.example.com in the client caddyfile, so that the proxy is the one to do the HTTPS challenge and then proxies it to the client?

Given your answer and my results last night, I kind of decided on one thing yesterday, plus another one I realised this morning.

The first is that, having already configured Nextcloud/Collabora, perhaps it is easier to leave it as it is and use that same VM as a proxy for future proxies, instead of using the VM to proxy everything. At least that way I don’t need to have Nextcloud/Collabora down until I manage to fix the issue.

The second is that… I was putting the caddyfile in the wrong folder on the proxy the WHOLE night, so it might be that I eventually got it to work but Caddy wasn’t reading the file haha. Will try again next time I have some free time :slight_smile:

Thanks again for your help. Will post here my results.

Cheers.


(Matthew Fay) #5

No worries!

Is that 521 a Cloudflare error? That’s a whole extra beast - another reverse proxy in front of your reverse proxy! It ALSO handles HTTPS for you, and will be configured to talk either HTTP or HTTPS to your own server, with the same caveats you need to be aware of for your own internal reverse proxy.

It CAN be done, as long as you know exactly what you’re doing setting it up, but I find it much simpler to “grey-cloud” your URL in Cloudflare so they don’t reverse proxy it, they just provide DNS resolution.

On the rclone sync idea: no. They have to share the same underlying filesystem at the CADDYPATH, such as via NFS.

They will see each other’s certificates and use them, and if one Caddy initiates a request, it will put down a lock file that any other clustered Caddy sharing that file system can use to solve the ACME challenge and store the new certificate for all the others to start using.

With KVM, you can add a mounted filesystem to a guest, the target being a directory on the host system. This approach would work well.

On putting http://nextcloud.example.com in the internal Caddyfile: yes. Do HTTPS at the edge (client -> external Caddy), talk HTTP internally (external Caddy -> internal Caddy). This is quite a common setup - private, internal networks can usually be assumed to be secure enough for this not to be a problem.
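
Extended to the layout from the first post (one proxy in front of several VMs and containers), the edge Caddyfile could look roughly like this. Only nextcloud.example.com and 192.168.0.42 come from this thread; the other hostnames, addresses, and ports are hypothetical placeholders for illustration.

```
# Edge (host) Caddyfile: one site block per public hostname, HTTPS handled here
nextcloud.example.com {
    proxy / 192.168.0.42:80 {
        transparent
        websocket
    }
}

# Hypothetical second VM
app2.example.com {
    proxy / 192.168.0.43:80 {
        transparent
    }
}

# Hypothetical Docker container publishing a port on the host
app3.example.com {
    proxy / localhost:8080 {
        transparent
    }
}
```

Each upstream then only needs to answer plain HTTP for its own hostname.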

As for the Caddyfile sitting in the wrong folder: that can shoot your efforts in the foot :stuck_out_tongue: