Removing dynamic prefix (dates) from URI

1. Output of caddy version:

v2.6.2 h1:wKoFIxpmOJLGl3QXoo6PNbYvGW4xLEgo32GPBEjWL8o=

2. How I run Caddy:

a. System environment:

Ubuntu 20.04

b. Command:

caddy start

c. Service/unit/compose file:

n/a

d. My complete Caddy config:

technicalpenguins.com {
        root * /srv/technicalpenguins.com
        php_fastcgi localhost:9000
        file_server
}

http://wordle.com {
        redir https://www.nytimes.com/games/wordle/index.html
}

www.joanconcilio.com, joanconcilio.com {
        root * /srv/joanconcilio.com
        php_fastcgi localhost:9000
        file_server
}

www.unschoolrules.com, unschoolrules.com {
        root * /srv/unschoolrules.com
        php_fastcgi localhost:9000
        file_server

}

3. The problem I’m having:

I want to remove dates from some URIs before passing them to the PHP handler for unschoolrules.com. There are URLs in the wild that look like https://unschoolrules.com/2004/04/rest-of-url and https://unschoolrules.com/2004/rest-of-url; they need to redirect to https://unschoolrules.com/rest-of-url. (This is a pretty standard WordPress URL change.)

4. Error messages and/or full log output:


5. What I already tried:

I can’t figure out how to do it. I’ve tried handle_path, but I don’t think that’s the right approach (requests continued to 404 while trying to find the original URL). I’m not sure whether I need to completely recreate the server instructions inside each matcher, or whether there’s a way to just strip the URI prefix. (I think strip_prefix will enter into it somewhere, but I’m unclear where.)

6. Links to relevant resources:

You’ll probably want to write some regular expressions with path_regexp to match the path patterns you want to transform, then use rewrite with the captured parts from your regexp to build the final path you want.

It might look something like this (plug in your own regexp). For example, this strips the first path segment from a request when that segment is exactly four digits:

@withYear path_regexp year ^/([0-9]{4})/(.*)$
rewrite @withYear /{re.year.2}

So when I put that in the joanconcilio.com block and navigate to https://joanconcilio.com/2024/art/, both the browser URL bar and the PHP script report the URI is still /2024/art, not /art. Any suggestions?

www.joanconcilio.com, joanconcilio.com {
        @withYear path_regexp year ^/([0-9]{4})/(.*)$
        rewrite @withYear /{re.year.2}
        root * /srv/joanconcilio.com
        php_fastcgi localhost:9000
        file_server
}

Oh, so you want a redirect, not a rewrite. Use the redir directive for that, then.

A rewrite is transforming/changing the URL before it gets passed to subsequent handlers. That doesn’t change the URL in the browser.

A redirect is writing a response back to the client with the Location header set which tells the client “try again, but at this URL instead”. That changes the URL you see in the browser.
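As a rough sketch of the redirect version, the only change from the block above is swapping rewrite for redir. This reuses the same matcher name and site block from earlier in the thread. The 301 (permanent) status code is an assumption on my part; leave it off to get Caddy's default temporary redirect while testing. Note that this simple form does not carry over any query string.

```
www.joanconcilio.com, joanconcilio.com {
        # Match paths that start with a four-digit segment, e.g. /2024/art/
        @withYear path_regexp year ^/([0-9]{4})/(.*)$
        # Send the client to the path without the year (capture group 2)
        redir @withYear /{re.year.2} 301
        root * /srv/joanconcilio.com
        php_fastcgi localhost:9000
        file_server
}
```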

Either way, that regexp is probably incomplete for your needs. You’ll want to expand on it to cover other date patterns. You can play with regular expressions at https://regex101.com/; pick the Golang flavor in the sidebar, since that is the syntax Caddy uses.
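For instance, one possible pattern that covers both /2004/04/rest-of-url and /2004/rest-of-url from the original question makes the two-digit month segment optional. This is only a sketch: the matcher name @dated and the regexp name dated are made up for illustration, and the pattern assumes months are always two digits. It would also send bare date archive paths such as /2004/04/ to the site root, which may or may not be what you want, so test it against your real URLs first.

```
# Goes inside the site block, before php_fastcgi
# Four-digit year, optional two-digit month, then the rest of the path
@dated path_regexp dated ^/[0-9]{4}/(?:[0-9]{2}/)?(.*)$
# Capture group 1 is everything after the date segments
redir @dated /{re.dated.1}
```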

Thank you so much! I think I had figured out all the pieces but not how to glue them together.
