Regex Routing decision

1. Caddy version (caddy version):

2.1.1

2. How I run Caddy:

a. System environment:

docker-compose

b. Command:

docker-compose up

c. Service/unit/compose file:

caddy:
    image: caddy:2.1.1
    << : *restart_policy
    ports:
      - '0.0.0.0:80:80/tcp'
      - '0.0.0.0:443:443/tcp'
    volumes:
      - './caddy:/etc/caddy'
      - './caddy_data:/data'
    depends_on:
      - web
      - relay

d. My complete Caddyfile or JSON config:

sentry.[MY DOMAIN]:443 {
    log

    reverse_proxy /api/store/* relay:3000

    @apiversion path_regexp ^/api/[1-9]\d*/.*$
    reverse_proxy @apiversion relay:3000

    # Catch all for the rest
    reverse_proxy /* web:9000
}

3. The problem I’m having:

I don’t understand how Caddy’s path matching works.

I want to route requests that match the regex to a subservice. For example, the following requests should be routed to relay:3000:

  • /api/2/post
  • /api/45/commit
  • /api/store/commit

and all others to web:9000.

With the above configuration, all requests are routed to the web:9000 service, which is wrong.

When I change the last line to reverse_proxy web:9000 (removing the /*), then it works as expected.

Why is it this way, and what is the logic behind the routing decisions?

Ah. What happens is obvious if you adapt the Caddyfile to JSON with caddy adapt --config /path/to/Caddyfile --pretty:

{
	"apps": {
		"http": {
			"servers": {
				"srv0": {
					"listen": [
						":443"
					],
					"routes": [
						{
							"match": [
								{
									"host": [
										"sentry.foo.com"
									]
								}
							],
							"handle": [
								{
									"handler": "subroute",
									"routes": [
										{
											"handle": [
												{
													"handler": "reverse_proxy",
													"upstreams": [
														{
															"dial": "relay:3000"
														}
													]
												}
											],
											"match": [
												{
													"path": [
														"/api/store/*"
													]
												}
											]
										},
										{
											"handle": [
												{
													"handler": "reverse_proxy",
													"upstreams": [
														{
															"dial": "web:9000"
														}
													]
												}
											],
											"match": [
												{
													"path": [
														"/*"
													]
												}
											]
										},
										{
											"handle": [
												{
													"handler": "reverse_proxy",
													"upstreams": [
														{
															"dial": "relay:3000"
														}
													]
												}
											],
											"match": [
												{
													"path_regexp": {
														"pattern": "^/api/[1-9]\\d*/.*$"
													}
												}
											]
										}
									]
								}
							],
							"terminal": true
						}
					],
					"logs": {}
				}
			}
		}
	}
}

Notice that your catch-all match/handle pair is ordered before the one with path_regexp.

This is because Caddy uses the length of the path matchers from the Caddyfile to determine the order. Since path_regexp isn’t path, it doesn’t take part in that sorting. It’s hard to know how to sort a regexp because it’s a rule-based thing, not just a simple string to compare. The assumption here is that a longer path is a more specific matcher in general.

Omitting /* is the correct solution. Really, omitting the matcher is the same as having specified * which means “all requests”. It’s more correct for what you want anyways. After doing that, you’ll see that it gets correctly sorted with the least specific, i.e. the implicit * (no matcher) is ordered last.
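
For reference, here is a sketch of the site block from the question with only that one change applied. The domain placeholder is kept as posted, and everything else is unchanged:

```
sentry.[MY DOMAIN]:443 {
    log

    reverse_proxy /api/store/* relay:3000

    @apiversion path_regexp ^/api/[1-9]\d*/.*$
    reverse_proxy @apiversion relay:3000

    # Catch-all: no matcher at all, so it is sorted after the
    # reverse_proxy directives that do have a matcher
    reverse_proxy web:9000
}
```

Running caddy adapt on this again should show the web:9000 route listed last in the subroute.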

I can also suggest another approach for your config to reduce duplication a bit - you can combine your first two reverse_proxy directives with one matcher, since they both have the same destination:

@api {
	path /api/store/*
	path_regexp ^/api/[1-9]\d*/.*$
}
reverse_proxy @api relay:3000

reverse_proxy web:9000

I suspect this approach won’t work.

A named matcher definition constitutes a matcher set. Matchers in a set are AND’ed together; i.e. all must match. For example, if you have both a header and path matcher in the set, both must match.

For most matchers that accept multiple values, those values are OR’ed; i.e. one must match in order for the matcher to match.

—Request matchers (Caddyfile) — Caddy Documentation
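
To make the quoted rule concrete, here is a small sketch of how it applies to the matcher suggested above. The @either matcher and the /api/other/* path are hypothetical, added only to illustrate the OR case:

```
# Matchers in one set are AND'ed: a request must satisfy both lines.
# A path under /api/store/ never matches the numeric regexp,
# so this set matches nothing.
@api {
    path /api/store/*
    path_regexp ^/api/[1-9]\d*/.*$
}

# Multiple values given to a single path matcher are OR'ed:
# either path is enough. (/api/other/* is a made-up example path.)
@either path /api/store/* /api/other/*
```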

I think it’ll have to remain the way it was rather than consolidated; regex and path matchers can’t be OR’d and neither can two regex matchers. To do this you’d need one regex that covers both use cases - doable, for sure, but less understandable than having them separate.
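
If a single matcher is still preferred, a minimal sketch of the combined regex could look like this. The matcher name @relay is my own choice, and the pattern is an assumption about what should count as a relay path, so test it against real requests first:

```
# One regexp covering both /api/store/... and /api/<number>/...
@relay path_regexp ^/api/(store|[1-9]\d*)/.*$
reverse_proxy @relay relay:3000

# Everything else
reverse_proxy web:9000
```

One difference to be aware of: Caddy’s path matcher is case-insensitive, while this regexp is case-sensitive as written.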


* Slight nit: path and regexp matchers can be OR’d if you use the JSON config, but I realize people often don’t want to do that.
