Forward path_regexp groups into reverse proxy target

1. Caddy version (caddy version):

v2.3.0 h1:fnrqJLa3G5vfxcxmOH/+kJOcunPLhSBnjgIvjXV/QTA=

2. How I run Caddy:

a. System environment:

Docker

b. Command:

docker run -e HOST_NAME=dockerhoster -p 8043:443 -d --name front-api-caddy caddy:2.3.0-alpine
docker network create --internal front-api-network
docker network connect front-api-network front-api-caddy

After that, the other websites that Caddy forwards traffic to are added to the same network.

c. Service/unit/compose file:

N/A

d. My complete Caddyfile or JSON config:

{$HOST_NAME} {
	tls internal

	@api_matcher {
		path_regexp api_reg \/([a-z0-9\-]+)\/([a-z\.]*)\/api\/.*
	}

	@front_matcher {
		path_regexp front_reg \/([a-z0-9\-]+)\/([a-z\.]*)\/.*
	}

	handle @api_matcher {
		reverse_proxy {re.api_reg.1}.{re.api_reg.2}.api:8080
	}

	handle @front_matcher {
		reverse_proxy {re.front_reg.1}.{re.front_reg.2}.front:8080
	}
}

3. The problem I’m having:

I’m trying to host my apps from different git branches (names can only have lower case letters, numbers and “-”) on the same caddy config. I need to reverse proxy these requests onto containers in the same internal network:

https://dockerhoster:8043/branch1/website1/api/function/... --> http://branch1.website1.api:8080/app/function/...
https://dockerhoster:8043/branch1/website1/index.html       --> http://branch1.website1.front:8080/index.html

https://dockerhoster:8043/branch1/website2/api/function/... --> http://branch1.website2.api:8080/app/function/...
https://dockerhoster:8043/branch1/website2/index.html       --> http://branch1.website2.front:8080/index.html

https://dockerhoster:8043/branch2/website1/api/function/... --> http://branch2.website1.api:8080/app/function/...
https://dockerhoster:8043/branch2/website1/index.html       --> http://branch2.website1.front:8080/index.html

https://dockerhoster:8043/branch2/website2/api/function/... --> http://branch2.website2.api:8080/app/function/...
https://dockerhoster:8043/branch2/website2/index.html       --> http://branch2.website2.front:8080/index.html

4. Error messages and/or full log output:

{
    "level": "info",
    "ts": 1634902397.273249,
    "logger": "http.log.access.log0",
    "msg": "handled request",
    "request": {
        "remote_addr": "192.168.73.149:49810",
        "proto": "HTTP/2.0",
        "method": "GET",
        "host": "dockerhoster:8043",
        "uri": "/favicon.ico",
        "headers": {
            "Accept": ["image/avif,image/webp,*/*"],
            "Accept-Encoding": ["gzip, deflate, br"],
            "Dnt": ["1"],
            "Referer": ["https://dockerhoster:8043/caddy3/balancing/"],
            "Sec-Fetch-Mode": ["no-cors"],
            "Te": ["trailers"],
            "User-Agent": ["Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:93.0) Gecko/20100101 Firefox/93.0"],
            "Cookie": ["COOKIE_SUPPORT=true; GUEST_LANGUAGE_ID=hr_HR; _ga=GA1.1.1766446519.1625061378; JSESSIONID=9B5AE65C121678BD8130EECCA9D94FB2; LFR_SESSION_STATE_20158=1624442301553; COMPANY_ID=20154; ID=78536747497434754c4e7750777935775a6a357944513d3d; USER_UUID=4533557049476a694448496c32614271394c55556471336d65355a75716f4b4561674251583638713465453d; LFR_SESSION_STATE_20433=expired"],
            "Sec-Fetch-Dest": ["image"],
            "Sec-Fetch-Site": ["same-origin"],
            "Cache-Control": ["max-age=0"],
            "Accept-Language": ["hr-HR,en-GB;q=0.8,hr;q=0.6,en-US;q=0.4,en;q=0.2"]
        },
        "tls": {
            "resumed": false,
            "version": 772,
            "cipher_suite": 4865,
            "proto": "h2",
            "proto_mutual": true,
            "server_name": "dockerhoster"
        }
    },
    "common_log": "192.168.73.149 - - [22/Oct/2021:11:33:17 +0000] \"GET /favicon.ico HTTP/2.0\" 0 0",
    "duration": 0.000047754,
    "size": 0,
    "status": 0,
    "resp_headers": {
        "Server": ["Caddy"]
    }
}

In this example, caddy3 was the name of the branch, and balancing was the name of the website.

5. What I already tried:

My example right now is:

https://dockerhoster:8043/caddy3/balancing/api/x/y --> http://caddy3.balancing.api/x/y
https://dockerhoster:8043/caddy3/balancing/x/y     --> http://caddy3.balancing.front/x/y

If I put

reverse_proxy caddy3.balancing.front:8080

at the end of the Caddyfile, Caddy forwards me to the container without problems. I guess that means my HTTP request didn’t match either path_regexp matcher, so it was never handled by the reverse proxy.

I tried different regexes, but only the simplest ones work, like ^.*$. If I put ^.caddy.$ (my branch name is caddy3), it stops working.

Is it only my regex that is wrong, or is my whole approach wrong?

Thanks for your help, guys!

Please upgrade to v2.4.5!

The request in your logs is to /favicon.ico, which isn’t covered by either of your regexps.

Yeah, I think it’s a problem with your regexp.

Use https://regex101.com/ with the “Golang” mode to test out your regexp against different request paths you expect to work.

At a glance, \/([a-z0-9\-]+)\/([a-z\.]*)\/api\/.* doesn’t match /branch1/website1/api/function because website1 has a number and [a-z\.] doesn’t allow numbers.
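To make that concrete, here is a minimal Go sketch (Caddy is written in Go and uses Go's regexp package, which is why the “Golang” mode on regex101 is the right one). It runs the api_reg pattern from the Caddyfile against the example path. The names apiOriginal, apiWithDigits and captures are made up for this illustration, and the second pattern, which adds 0-9 to the second group, is only an assumption about what you might want, not something taken from your config.

```go
package main

import (
	"fmt"
	"regexp"
)

// apiOriginal is the api_reg pattern copied unchanged from the Caddyfile.
var apiOriginal = regexp.MustCompile(`\/([a-z0-9\-]+)\/([a-z\.]*)\/api\/.*`)

// apiWithDigits is a hypothetical variant whose second group also accepts digits.
var apiWithDigits = regexp.MustCompile(`\/([a-z0-9\-]+)\/([a-z0-9\.]*)\/api\/.*`)

// captures returns the capture groups for path, or nil when the pattern
// does not match at all.
func captures(re *regexp.Regexp, path string) []string {
	m := re.FindStringSubmatch(path)
	if m == nil {
		return nil
	}
	return m[1:]
}

func main() {
	path := "/branch1/website1/api/function"

	// No match: "website1" contains a digit, which [a-z\.] rejects.
	fmt.Println(captures(apiOriginal, path))

	// Matches, with groups "branch1" and "website1".
	fmt.Println(captures(apiWithDigits, path))

	// The original pattern does match when the site name has letters only.
	fmt.Println(captures(apiOriginal, "/caddy3/balancing/api/x/y"))
}
```

If your website names really are lowercase letters only, the original pattern should be fine for them, and the thing to check is which request paths actually reach Caddy.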


OK, now I understand the difference between Path and Referer. What I was actually looking for was header_regexp on the Referer field. Now instead of:

path_regexp front_reg \/([a-z0-9\-]+)\/([a-z\.]*).*

I put:

header_regexp front_reg Referer ^https:\/\/.+?\/([a-z0-9\-]+)\/([a-z]+).*$

but it still doesn’t match. I tested it with Regex101, and it works there (website1 was a bad example; the website name will always have only lowercase letters).

So now, does this make sense:

{$HOST_NAME} {
	tls internal

	@api_matcher {
		path_regexp api_reg \/([a-z0-9\-]+)\/([a-z\.]*)\/api\/.*
	}

	@front_matcher {
		header_regexp front_reg Referer ^https:\/\/.+?\/([a-z0-9\-]+)\/([a-z]+).*$
	}

	handle @api_matcher {
		reverse_proxy {re.api_reg.1}.{re.api_reg.2}.api:8080
	}

	handle @front_matcher {
		reverse_proxy {re.front_reg.1}.{re.front_reg.2}.front:8080
	}
}

Hmm. No. I don’t think you want Referer. Path is correct.

The Referer header is just something the browser fills in so that the server can have an indication of what page came before this request, i.e. where the user came from to land on the current page. If the request is for some asset (image/JS/CSS) then typically Referer will be the page (i.e. the HTML document) that included those assets.

I was just saying earlier that you provided the wrong access log. The /favicon.ico request is something browsers automatically do, to show that little branding icon on the browser tab. It’s not relevant at all here. Look for your other access logs that have more relevant request paths.
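As a rough illustration of the difference, the Go sketch below applies your front_reg pattern to the path of the logged request (/favicon.ico) and to the path of the page named in its Referer header. It only emulates the matching with Go's regexp package, and it builds the upstream address the way the {re.front_reg.1}.{re.front_reg.2}.front:8080 placeholders suggest; frontUpstream is a made-up helper for this example, not a Caddy API.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// frontReg is the front_reg path pattern copied from the original Caddyfile.
var frontReg = regexp.MustCompile(`\/([a-z0-9\-]+)\/([a-z\.]*)\/.*`)

// frontUpstream is a hypothetical helper: it returns the upstream address
// that the two capture groups would produce, and false when the path does
// not match the pattern.
func frontUpstream(path string) (string, bool) {
	m := frontReg.FindStringSubmatch(path)
	if m == nil {
		return "", false
	}
	return m[1] + "." + m[2] + ".front:8080", true
}

func main() {
	// The request from the posted access log: its path has no
	// /branch/site/ prefix, so the matcher cannot apply.
	upstream, ok := frontUpstream("/favicon.ico")
	fmt.Println(upstream, ok)

	// The page that triggered it, taken from the Referer header in the log.
	ref, err := url.Parse("https://dockerhoster:8043/caddy3/balancing/")
	if err != nil {
		panic(err)
	}
	upstream, ok = frontUpstream(ref.Path)
	fmt.Println(upstream, ok)
}
```

So the path matcher looks right for the page request itself. The access log entry to look for is the one whose uri is /caddy3/balancing/, not the /favicon.ico one.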

