Problem with redirects

1. Caddy version (caddy version):

v2.4.6 h1:HGkGICFGvyrodcqOOclHKfvJC0qTU7vny/7FhYp9hNw=

2. How I run Caddy:

I installed it from the apt repo and run it as a service. My /etc/apt/sources.list.d/caddy-stable.list looks as follows:

# Source: Caddy
# Site: https://github.com/caddyserver/caddy
# Repository: Caddy / stable
# Description: Fast, multi-platform web server with automatic HTTPS


deb https://dl.cloudsmith.io/public/caddy/stable/deb/debian any-version main

deb-src https://dl.cloudsmith.io/public/caddy/stable/deb/debian any-version main

a. System environment:

No Docker. Running it on top of Ubuntu 20.

b. Command:

service caddy start

c. Service/unit/compose file:

Using the default that came from apt.

d. My complete Caddyfile or JSON config:

example.ro {
	root * /home/grn/app/public
	redir /dev/index.php?id=125&p=1471&s=473001&z=L&c=P http://google.com
	redir /dev/index.php?id=125&p=1471&s=473001&z=L&c=F http://google.com{uri}
	file_server
	@not_static not file
	reverse_proxy @not_static 127.0.0.1:3000
	encode zstd gzip
	log {
		output file /var/log/caddy/example.ro.log {
			roll_size 100mb
			roll_keep 20
			roll_keep_for 168h
		}
		format console
	}
}

3. The problem I’m having:

The redirect DOES NOT work as expected. However, if I replace the ? in the URL with &, it works.

First, I query the exact URL. I get a 404 instead of the redirect: Caddy reverse-proxies the request to the Ruby on Rails backend, which returns the 404.

[grn@tayqo:~] $ curl -I 'https://example.ro/dev/index.php?id=125&p=1471&s=473001&z=L&c=P'
HTTP/2 404 
content-type: text/html; charset=utf-8
link: </assets/application-53889e48c84d6a2fc3f23c3e659e3656f46ecf7e7195277ae651ee5a68164c91.css>; rel=preload; as=style; nopush,</packs/js/addtohomescreen-654677869d996d871b51.js>; rel=preload; as=script; nopush
referrer-policy: strict-origin-when-cross-origin
server: Caddy
strict-transport-security: max-age=63072000; includeSubDomains
vary: Accept
x-content-type-options: nosniff
x-download-options: noopen
x-frame-options: SAMEORIGIN
x-permitted-cross-domain-policies: none
x-request-id: 69c15562-a3c8-4ece-a0f0-0b0d43242660
x-runtime: 0.005247
x-xss-protection: 0
date: Wed, 09 Feb 2022 15:13:13 GMT

Second, I query the same URL but with the ? replaced by & (or really any other character). This time I DO get the redirect I expect:

[grn@tayqo:~] $ curl -I 'https://example.ro/dev/index.php&id=125&p=1471&s=473001&z=L&c=P'
HTTP/2 302 
location: http://google.com
server: Caddy
date: Wed, 09 Feb 2022 15:13:17 GMT

5. What I already tried:

I tried to escape the question mark, as maybe that gets interpreted as regex. Not working. Went through the docs, looking for things I may have missed. Scoured the internet for articles or blogs. Could not find a fix.

I noticed that if I remove the ? then the new URL would get the proper redirect that I would expect. I also noticed that if I leave the ? in the URL but upon doing a cURL I replace it with a different (any) character, it works. But it never works when I have a ? in the URL and I query that URL.

So, it seems to me that in a redir $from $to, if the $from contains a question mark, the redirect is not working properly.

Thoughts?

I went ahead and wrote the simplest config that can be used to reproduce the problem and tested it on localhost on a Mac with version 2.4.5. Got the same behavior.

localhost {
    redir /arstg http://google.com
    redir /qwfp? http://google.com{uri}
}

Go ahead and do a curl -I https://localhost/arstg, then do a curl -I https://localhost/qwfp? and, finally, do a curl -I https://localhost/qwfpx. The first one WORKS, the second one FAILS (although it should work), the third one WORKS (although it shouldn’t)…

Path matchers only match the path part of the URL, which does not include the query part, i.e. the stuff following ?.

If you need to match by both path and query, you’ll need to use a named matcher which uses both the path and query matchers:
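
For reference, a minimal sketch of such a matcher, adapted to the Caddyfile from the question. The matcher name @oldPage is made up for illustration, and only one query pair is checked here:

```caddyfile
example.ro {
	# @oldPage is a hypothetical name. The path matcher sees only the path;
	# the query matcher checks the part after the "?".
	@oldPage {
		path /dev/index.php
		query id=125
	}
	redir @oldPage http://google.com
}
```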


Thank you @francislavoie - I am used to nginx, and I only skimmed the docs, so I missed that Caddy separates path and query this strictly. Any pointers on how I could match on multiple query params? What I have below matches on the path and on any one of the query parameters, i.e. it ORs them. I am unclear on how I should go about ANDing them.

localhost {
	@m125 {
		path /index.php
		query id=125 p=1294 s=131073 z=L c=F
	}
	redir /arstg http://google.com
	redir @m125 http://google.com{uri}
	log
}

I managed to scratch my own itch in two ways. Both use CEL expressions, which are marked as experimental, so I’m not sure this is the best way to approach it.

Approach one - easier for humans to understand, but harder to write, and it looks ugly

localhost {
    # using only CELs
    @redirViaCEL {
        expression ( \
            {uri}=="/index.php?id=10&submenuId=450000" \
            || {uri}=="/index.php?id=10&p=147" \
        )
    }
    redir @redirViaCEL http://example_one/{query}
}

Approach two - a bit more elegant (though maybe harder to understand), similar to nginx

localhost {
    map {uri} {needs_redirect} {
        /index.php?id=11&submenuId=450000 Yes
        /index.php?id=11&p=147 Yes
        default No
    }
    @redirViaMap {
        expression {needs_redirect}=='Yes'
    }
    redir @redirViaMap http://example_two/{query}
}

@francislavoie any thoughts on this? would this be the recommended way to approach such a need? Thank you!

Sorry I didn’t reply earlier, I was mulling this over and discussed it with the team on Slack.

Yes, you’re right that query only ORs its arguments right now. And unfortunately, specifying query more than once in the same matcher just merges all the arguments together, so it’s still an OR.
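
To illustrate, here is a sketch based on the behavior described above (not tested here; the matcher names are made up). These two matchers should behave identically, each matching when either pair is present:

```caddyfile
localhost {
	# One query matcher with two pairs: id=10 OR submenuId=450000
	@oneLine query id=10 submenuId=450000

	# Two query matchers in one named matcher get merged: still OR
	@twoLines {
		query id=10
		query submenuId=450000
	}

	respond @oneLine "matched"
	respond @twoLines "matched"
}
```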

There is a workaround though, which is to “abuse” the not matcher to make it work. It would look like this:

not {
	not query id=10
	not query submenuId=450000
}

You can read this as:

! ( ! (query id=10) OR ! (query submenuId=450000) )

Which by De Morgan’s laws, is effectively:

(query id=10) AND (query submenuId=450000)

Your solutions with {uri} can also work, but they depend on the query args being in exactly that order. If you were to swap the order of id and submenuId, they wouldn’t match, but the not approach still would.


So yeah, right now there’s no elegant solution unfortunately, and it’s because of a limitation in how the matchers were designed. There’s a lot to write on this topic to give a thorough answer, but I’ll try to condense it.

The first thought is query should support & in the input, and use that to make it AND, so like query id=10&submenuId=450000 literally. This is trivial from a Caddyfile parsing perspective, but the problem is the JSON config (which the Caddyfile adapts to).

See the JSON docs for the query matcher (whose text is unfortunately slightly misleading, as it doesn’t mention that the values are OR’d). You’ll notice that it’s just a JSON object with key-value pairs. This structure is limited: there’s no way to tell it “actually, AND is what I want instead”, and any change to this config structure would be a breaking change for anyone configuring query directly via JSON. So modifying query has pitfalls.

In the JSON for routes, you’ll notice that match takes an array. More precisely, it takes a list of “matcher sets”, and ORs all those matcher sets. This is really nice in terms of flexibility at the JSON level.

But the way named matchers work in the Caddyfile, a named matcher only maps to a single matcher set. This means you can’t make use of that OR functionality in the Caddyfile by defining two matcher sets.

We explored this a few months ago, and there was a lot of discussion on a GitHub PR about what we could do… but the work went inactive because it’s really complicated. Feel free to take a dive on this and read into it if you like:


Having written all that, I remembered that we have a {query.*} placeholder which you can probably use alongside expression to do this really nicely. Try this:

@needsRedir expression `({query.id} == "10" && {query.submenuId} == "450000") || ({query.id} == "10" && {query.p} == "147")`

Of note, you can use backticks as an alternate token start/end delimiter, which is important to allow using " as a string delimiter in CEL. You can also avoid having to “escape the newlines” in the Caddyfile this way, and you can break it out onto multiple lines like this if you want:

@needsRedir expression `
	({query.id} == "10" && {query.submenuId} == "450000")
	|| ({query.id} == "10" && {query.p} == "147")`

So yeah, this is probably your best solution for now.
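
Put together with the redirect from your earlier config, it might look like the sketch below. The path matcher is my assumption (your examples all use /index.php), and I haven’t run this exact config:

```caddyfile
localhost {
	@needsRedir {
		# Assumed path, taken from the URLs in the examples above
		path /index.php
		# Compares individual query values, so argument order doesn't matter
		expression `({query.id} == "10" && {query.submenuId} == "450000") || ({query.id} == "10" && {query.p} == "147")`
	}
	redir @needsRedir http://example_one/{query}
}
```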


Thank you, @francislavoie. I believe the map solution works best for my scenario: web bots usually request URLs in exactly the form they remember them, so for migrating old sites to Caddy I find the map + CEL solution the most readable.

The not-not solution is the cleanest one if the order of the query args could change, though.

Thank you so much for your effort and indepth explanation, really appreciated!

