Proper way to set up multiple upstreams for a website

Hey guys!

My team and I are trying to switch from Nginx to Caddy. We’ve been testing it for a few weeks, and the server is really cool! I like the build tool, xcaddy, which makes custom builds super simple. And the config file is pretty flexible and easy to read.

We mostly use Caddy as an edge web server that works as a reverse proxy. It’s at least as good as Nginx in that role, but only when we have exactly one upstream per server. We tried multiple ways to set up multi-upstream servers, but we’re still not sure why some of them don’t work properly, or which way is best.

I’ll show what we did with a simplified example that can be run on any computer: a reverse proxy to a publicly available website, https://www.garyvaynerchuk.com/

I’ll be testing things using 3 different URLs, one per upstream:

Ways we tried

1. Using matchers only

https://garyvaynerchuk.${TLD} {
    ${TLS_SETTING}

    @ajaxGoogleapis path /ajax-googleapis/*
    header @ajaxGoogleapis X-Caddy-Route "ajaxGoogleapis"
    uri @ajaxGoogleapis strip_prefix "/ajax-googleapis"
    reverse_proxy @ajaxGoogleapis {
        header_up host ajax.googleapis.com
        to https://ajax.googleapis.com
    }

    @awsCdn path /aws-cdn/*
    header @awsCdn X-Caddy-Route "awsCdn"
    uri @awsCdn strip_prefix "/aws-cdn"
    reverse_proxy @awsCdn {
        header_up host s3.amazonaws.com
        to https://s3.amazonaws.com
    }

    @website path not /ajax-googleapis/* /aws-cdn/*
    header @website X-Caddy-Route "website"
    reverse_proxy @website {
        header_up host www.garyvaynerchuk.com
        to https://www.garyvaynerchuk.com
    }
}

I have 3 matchers here, and I expect that the first two would catch specific prefixes only, and the 3rd one catches everything else. I also have a header directive for every matcher that will help me see which matcher caught the request.

1.1 ajaxGoogleapis

  • returns empty page (0 bytes)
  • has x-caddy-route: website header

1.2 awsCdn

  • returns empty page (0 bytes)
  • has x-caddy-route: website header

1.3 website

  • returns empty page (0 bytes)
  • doesn’t have x-caddy-route header

2. Using matchers + route

https://garyvaynerchuk.${TLD} {
    ${TLS_SETTING}

    @ajaxGoogleapis path /ajax-googleapis/*
    route @ajaxGoogleapis {
        header X-Caddy-Route "ajaxGoogleapis"
        uri strip_prefix "/ajax-googleapis"
        reverse_proxy {
            header_up host ajax.googleapis.com
            to https://ajax.googleapis.com
        }
    }

    @awsCdn path /aws-cdn/*
    route @awsCdn {
        header X-Caddy-Route "awsCdn"
        uri strip_prefix "/aws-cdn"
        reverse_proxy {
            header_up host s3.amazonaws.com
            to https://s3.amazonaws.com
        }
    }

    @website path not /wp-cdn/* /aws-cdn/* /ajax-googleapis/*
    route @website {
        header X-Caddy-Route "website"
        reverse_proxy {
            header_up host www.garyvaynerchuk.com
            to https://www.garyvaynerchuk.com
        }
    }
}

It’s almost identical, but I’ve wrapped every header and reverse_proxy pair into route instead of copying the named matcher every time.

2.1 ajaxGoogleapis

  • works fine

2.2 awsCdn

  • works fine

2.3 website

  • returns empty page (0 bytes)
  • doesn’t have x-caddy-route header

3. Matchers + route, “default” upstream unwrapped

https://garyvaynerchuk.${TLD} {
    ${TLS_SETTING}

    @ajaxGoogleapis path /ajax-googleapis/*
    route @ajaxGoogleapis {
        header X-Caddy-Route "ajaxGoogleapis"
        uri strip_prefix "/ajax-googleapis"
        reverse_proxy {
            header_up host ajax.googleapis.com
            to https://ajax.googleapis.com
        }
    }

    @awsCdn path /aws-cdn/*
    route @awsCdn {
        header X-Caddy-Route "awsCdn"
        uri strip_prefix "/aws-cdn"
        reverse_proxy {
            header_up host s3.amazonaws.com
            to https://s3.amazonaws.com
        }
    }

    header X-Caddy-Route "website"
    reverse_proxy {
        header_up host www.garyvaynerchuk.com
        to https://www.garyvaynerchuk.com
    }
}

It’s almost identical to #2; the only difference is that the last reverse_proxy and header pair is unwrapped.

3.1 ajaxGoogleapis

  • works fine

3.2 awsCdn

  • works fine

3.3 website

  • works fine

4. Matchers + handle, “default” upstream unwrapped

https://garyvaynerchuk.${TLD} {
    ${TLS_SETTING}

    @ajaxGoogleapis path /ajax-googleapis/*
    handle @ajaxGoogleapis {
        header X-Caddy-Route "ajaxGoogleapis"
        uri strip_prefix "/ajax-googleapis"
        reverse_proxy {
            header_up host ajax.googleapis.com
            to https://ajax.googleapis.com
        }
    }

    @awsCdn path /aws-cdn/*
    handle @awsCdn {
        header X-Caddy-Route "awsCdn"
        uri strip_prefix "/aws-cdn"
        reverse_proxy {
            header_up host s3.amazonaws.com
            to https://s3.amazonaws.com
        }
    }

    header X-Caddy-Route "website"
    reverse_proxy {
        header_up host www.garyvaynerchuk.com
        to https://www.garyvaynerchuk.com
    }
}

Almost identical to #3, but uses handle instead of route. It behaves exactly the same, and I’m not sure what the difference between the two directives is.

4.1 ajaxGoogleapis

  • works fine

4.2 awsCdn

  • works fine

4.3 website

  • works fine

Environment

1. Caddy version (caddy version):

v2.2.1 h1:Q62GWHMtztnvyRU+KPOpw6fNfeCD3SkwH7SfT1Tgt2c=
Built with:
xcaddy build v2.2.1 && sudo setcap 'cap_net_bind_service=+ep' ./caddy

2. How I run Caddy:

a. System environment:

Pretty sure that it doesn’t matter, tested on multiple different machines. My primary test installation is Ubuntu 20.04.1, x86_64, non-docker

b. Command:

./caddy-vanilla/caddy run -config websites/conf-dist/Caddyfile_garyvaynerchuk


Thanks for the detailed post. This is interesting! If I understand correctly, you have several working configs, and a few that don’t work, and the crux of your question is:

but still not sure why some of the ways don’t work properly, and what is the best way.

There are a few things I will point you to that will help explain… Caddyfile 101, so to speak.

  • In order to keep the Caddyfile easy to write, handler directives have a default order that is sensible for most configs: https://caddyserver.com/docs/caddyfile/directives#directive-order (you can change the order either by using global options or by defining routes manually)

  • You should run caddy adapt on your configs to see the resulting JSON. You will be able to quickly and clearly see what the actual handler chains look like.

  • Read this wiki article for advanced understanding of composing handlers using the Caddyfile:

Ultimately, as long as a config works for you, I would say it boils down to preference. You can look at the resulting JSON config and see if any of those make more sense under the hood.

You might even be able to simplify your config a little more. Your “default” or “fallback” proxy doesn’t necessarily need to explicitly exclude the other ones. Proxying is a terminal handler, meaning future handlers won’t be called (because a proxy handles the request by responding to it), so you can assume that later proxies won’t be called if earlier ones are. That means you can probably remove matchers like @website path not /ajax-googleapis/* /aws-cdn/* kind of like you did on your last attempts.
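As a minimal sketch of that fallback idea (an illustration, not one of the configs tested above): two reverse_proxy directives in the same site, where only the first has a path matcher. The site address and tls internal are assumed for local testing, and the path is deliberately left untouched, so the S3 upstream would receive the /aws-cdn prefix as-is.

```caddyfile
https://garyvaynerchuk.localhost {
	tls internal

	# Has a path matcher, so it should be tried ahead of the matcherless proxy.
	# reverse_proxy is terminal: once it handles the request, nothing after it runs.
	reverse_proxy /aws-cdn/* https://s3.amazonaws.com {
		header_up Host {http.reverse_proxy.upstream.hostport}
	}

	# No matcher: only reached by requests the proxy above did not handle.
	reverse_proxy https://www.garyvaynerchuk.com {
		header_up Host {http.reverse_proxy.upstream.hostport}
	}
}
```

Running caddy adapt on a config like this is a quick way to confirm the order the two proxies end up in.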

Does that all make sense?


To add onto that, if I were to write it, I’d do it like this:

	handle_path /ajax-googleapis/* {
		header X-Caddy-Route "ajaxGoogleapis"
		reverse_proxy https://ajax.googleapis.com {
			header_up Host {http.reverse_proxy.upstream.hostport}
		}
	}

	handle_path /aws-cdn/* {
		header X-Caddy-Route "awsCdn"
		reverse_proxy https://s3.amazonaws.com {
			header_up Host {http.reverse_proxy.upstream.hostport}
		}
	}

	handle {
		header X-Caddy-Route "website"
		reverse_proxy https://www.garyvaynerchuk.com {
			header_up Host {http.reverse_proxy.upstream.hostport}
		}
	}

The changes:

  • Uses handle_path which is the same as handle + uri strip_prefix, saves you a line

  • Inlines the path matchers, because it’s a simpler syntax

  • Uses the special placeholder for the Host header, to avoid duplicating/hardcoding the hostname

  • Inlines the proxy to because it’s shorter

  • Wraps your “website” in a handle to make it mutually exclusive from the others

You’re right that route and handle are similar: they both give you subroutes. The differences are:

  • route overrides the default directive order, i.e. it skips the Caddyfile adapter’s sorting logic and preserves handlers in the same order you write them.

  • handle has mutual-exclusivity with other handle and handle_path blocks. This means that if one handle is matched, none of the others will be matched by the same request. Note that the Caddyfile adapter will sort them by the length of their path matchers, so one without a matcher will always be ordered last (unless you also wrap them in a route or something like that, which would skip that sorting behaviour)
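To illustrate the sorting note in the last bullet, here is a small sketch (assumed, not taken from the configs above) where the catch-all handle is written first. The respond bodies are placeholders so the result is easy to check with a browser or curl; the site address and tls internal are assumed for local testing.

```caddyfile
https://garyvaynerchuk.localhost {
	tls internal

	# Written first, but it has no matcher, so the adapter should order it last.
	handle {
		respond "fallback"
	}

	# Has a path matcher, so it is tried before the catch-all above.
	handle /aws-cdn/* {
		respond "aws-cdn"
	}
}
```

If both blocks were wrapped in a route, they would keep their written order and the catch-all would answer every request.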

The main problem with your first attempt was that uri strip_prefix will happen early on due to directive sorting, then later path not /ajax-googleapis/* /aws-cdn/* would also match because that segment was stripped away. You need mutual exclusivity to prevent the other bits from still running.
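Roughly speaking, after sorting, the first attempt behaves as if it had been written in the order below. This is a hand-written sketch for the /aws-cdn part only, using route to pin the order and inline path matchers for brevity; the fallback proxy is shown without a matcher for simplicity, and none of this is actual caddy adapt output.

```caddyfile
https://garyvaynerchuk.localhost {
	tls internal

	route {
		# 1. header runs first, while the path still starts with /aws-cdn.
		header /aws-cdn/* X-Caddy-Route "awsCdn"

		# 2. uri runs next and rewrites /aws-cdn/foo to /foo.
		uri /aws-cdn/* strip_prefix /aws-cdn

		# 3. By now the path is /foo, so this matcher no longer matches.
		reverse_proxy /aws-cdn/* https://s3.amazonaws.com {
			header_up Host {http.reverse_proxy.upstream.hostport}
		}

		# 4. The fallback proxy ends up handling the request instead.
		reverse_proxy https://www.garyvaynerchuk.com {
			header_up Host {http.reverse_proxy.upstream.hostport}
		}
	}
}
```

Wrapping each upstream in its own handle or handle_path block keeps the rewrite and the proxy together, so a rewritten path cannot fall through to another upstream.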

The problem with the 2nd was actually with your matcher: you used path not instead of not path, which means it was looking for paths named not, or /wp-cdn/*, etc.
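For comparison, a sketch of the two spellings side by side. The respond directives and their bodies are placeholders added for illustration, and the matcher names @wrong and @fallback are made up; only the matcher expressions mirror the configs above.

```caddyfile
https://garyvaynerchuk.localhost {
	tls internal

	# "path not ..." is a path matcher with three patterns: not, /aws-cdn/* and /ajax-googleapis/*.
	# It matches the very prefixes it was meant to exclude.
	@wrong path not /aws-cdn/* /ajax-googleapis/*
	respond @wrong "matched by: path not"

	# "not path ..." negates a path matcher: it matches everything outside the two prefixes.
	@fallback not path /aws-cdn/* /ajax-googleapis/*
	respond @fallback "matched by: not path"
}
```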


@matt, thank you for your answer, and especially for caddy adapt. Now I finally have a way to look under the hood, which is way better than the random tests I’ve been doing! Your wiki article is very helpful too; the official documentation lacks something like that. I did know about directive order from the docs, but not the ways to control it with route and handle, and I was missing the very important idea of terminal vs non-terminal directives.

But still, with all the new ideas, I can’t get why the basic “matchers only” method doesn’t work. I’m trying something very basic, like this:

https://garyvaynerchuk.localhost {
    tls internal

    @ajaxGoogleapis path /ajax-googleapis/*
    header @ajaxGoogleapis X-Caddy-Route "ajaxGoogleapis"
    uri @ajaxGoogleapis strip_prefix /ajax-googleapis
    reverse_proxy @ajaxGoogleapis https://ajax.googleapis.com {
        header_up host {http.reverse_proxy.upstream.hostport}
        header_down X-Caddy-Proxy "ajaxGoogleapis"
    }

    @awsCdn path /aws-cdn/*
    header @awsCdn X-Caddy-Route "awsCdn"
    uri @awsCdn strip_prefix /aws-cdn
    reverse_proxy @awsCdn https://s3.amazonaws.com {
        header_up host {http.reverse_proxy.upstream.hostport}
        header_down X-Caddy-Proxy "awsCdn"
    }

    @website not path /aws-cdn/* /ajax-googleapis/*
    header @website X-Caddy-Route "website"
    reverse_proxy @website https://www.garyvaynerchuk.com {
        header_up host {http.reverse_proxy.upstream.hostport}
        header_down X-Caddy-Proxy "website"
    }

}

And when I try requesting this test URL: https://garyvaynerchuk.localhost/aws-cdn/gv2016wp/wp-content/uploads/20191212171239/text-me.png
I get error 404 from the Nginx server handling @website section, plus x-caddy-route: awsCdn, and x-caddy-proxy: website. Which makes me think that:

  • @awsCdn matcher caught the request for header directive
  • @website matcher caught the request too, for the terminal reverse_proxy directive

I get that matching order is different for header and reverse_proxy, and it’s generally OK that different matchers could be used for different directives, but how can that request match an explicit not path /aws-cdn/*?

And another odd thing here: if I delete the topmost block, the 7 lines starting with @ajaxGoogleapis, everything starts working as I expect, i.e. /aws-cdn is proxied to AWS, and everything else - to the website. This is even stranger to me. Is there any reason for this?

UPD: Sorry, just figured it out! The answer to this was in what Francis wrote:

The main problem with your first attempt was that uri strip_prefix will happen early on due to directive sorting, then later path not /ajax-googleapis/* /aws-cdn/* would also match because that segment was stripped away.

This method starts routing properly if I remove the strip_prefix.
