Regex on redir - solved?

Hi,

Has the issue of enabling regex matches (and captures) with redir been solved?
Alternatively, does rewrite now have redirect power?

If not, is it impossible to issue a 301 redirect like this (nginx code):

location ~ ^/(.*)/(.*)/(.*)/(.*).html { return 301 /x/1/$4/; }

Please note that mere rewriting, which is easily possible with ‘rewrite’ in Caddy, is not enough. If I simply rewrite the URL and serve the content, Google will say I’ve got ‘duplicate content’ at both URLs (the old one and the new one).

The problem is that ‘redir’ doesn’t support re-using a part of the URI in the redirect command. In other words, redir doesn’t have any support for regex captures.

And there seems to be no workaround for this either.

EDIT:

Can I get regex capability in a redir block by using if has or if match along with a regular expression and then giving the actual redir command?

For example, can I give an if statement like this:

if {path} has /(.*)/(.*)/(.*)/(.*).html

or

if {path} match /(.*)/(.*)/(.*)/(.*).html

before giving the actual

/ {rewrite_uri}

Hi @elos42,

Regarding this, you’ll want to track the following on GitHub:

Requesting dynamic client-side redirects #1749

‘if’ statements for the HTTP Caddyfile #1948

I’m planning to try this today:

First: Have a series of rewrite rules which will not only rewrite the urls into the required format, but also add an identifier (prefix) at the beginning to show that it needs to be redirected later.

Then a redir block to catch all the rewritten requests and implement a redir on them, while stripping them of the prefix. I am using your solution from Redirect with if statements based on value

There’s also a second approach that I’m not sure of.

Do something like this:

redir {
if {path} matches ^/(.*)/blah(.*)/shoo
to /{2}/{1} 
}

The second approach is invalid syntax.

There is no to subdirective for redir; it exists only in rewrite. Likewise, {1}, {2}, etc. will all resolve to an empty string, as capture groups only work in rewrite as well.

https://caddyserver.com/docs/redir
https://caddyserver.com/docs/rewrite

Assuming you only need to redirect within the same domain name, you can try rewriting the URI and then redirecting to the rewritten URI if it differs from the original request.

example.com {
  rewrite {
    r ^/(.*)/blah(.*)/shoo
    to /{2}/{1}
  }

  redir {
    if {uri} not {rewrite_uri}
    / {rewrite_uri}
  }
}
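
To make the two-step logic above easier to follow, here is a minimal Go sketch of what the config is meant to do: rewrite first, then redirect only when the rewritten URI differs from the original. This is an illustration, not Caddy's actual source. The function names rewriteURI and redirectTarget are hypothetical, and the sketch assumes a single `to` destination that is always used when the pattern matches.

```go
package main

import (
	"fmt"
	"regexp"
)

// pattern mirrors the `r` line of the rewrite block above.
var pattern = regexp.MustCompile(`^/(.*)/blah(.*)/shoo`)

// rewriteURI is a hypothetical stand-in for the rewrite block: when the
// pattern matches, the URI becomes "/{2}/{1}"; otherwise it is unchanged.
func rewriteURI(uri string) string {
	m := pattern.FindStringSubmatch(uri)
	if m == nil {
		return uri
	}
	return "/" + m[2] + "/" + m[1]
}

// redirectTarget is a hypothetical stand-in for the redir block:
// `if {uri} not {rewrite_uri}` becomes a plain string inequality check.
func redirectTarget(uri string) (string, bool) {
	rewritten := rewriteURI(uri)
	if uri != rewritten {
		return rewritten, true
	}
	return "", false
}

func main() {
	for _, uri := range []string{"/a/blahb/shoo", "/about"} {
		if target, ok := redirectTarget(uri); ok {
			fmt.Printf("%s -> redirect to %s\n", uri, target)
		} else {
			fmt.Printf("%s -> served as-is\n", uri)
		}
	}
}
```

Requests that the rewrite did not touch compare equal to themselves and fall through, so only matching URLs get a redirect.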

Thanks. From a processing standpoint, which is more efficient:

This

example.com {
  rewrite {
    r ^/(.*)/blah(.*)/shoo
    to /{2}/{1}
  }

  redir {
    if {uri} not {rewrite_uri}
    / {rewrite_uri}
  }
}

OR (will the second step of trying to strip the prefix even work?)

example.com {
  rewrite {
    r ^/(.*)/blah(.*)/shoo
    to /2bredirected/{2}/{1}
  }
  redir  /2bredirected/ /
}

I am thinking that, if the second one works, it would be more efficient, as it does not involve an if statement or any regex in the second half and is a plain redirection, which should be easier on the server. I serve about 3 million unique visits in some months, and up to 500,000 unique visitors on some days (when stuff goes viral).

I think it’s academic which would be more efficient, processing-wise, because the second one unfortunately won’t work. It’ll have every visitor that matches that rewrite get redirected to the site root (e.g. https://example.com/ with no path), because redir will be sending a Location: / header.

Actually, both are incredibly similar. Unlike some other web servers, Caddy doesn’t assume everything is regex-able; doing x not y is a simple string comparison in code.

Likewise, the simple syntax of redir [from] [to] does a very similar string comparison - the from value is compared to the request URI, effectively in the form of if {uri} starts_with [from]. That’s how the redir statement determines whether or not it should actually operate for any given request.

In conclusion I’d have to say they’d be almost identical, or so similar as for it to be negligible.
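
As a rough illustration of the two checks described above, here is a small Go sketch. It is not taken from Caddy's code; notEqual and simpleRedir are hypothetical names, and the prefix behaviour is assumed from the description in this reply. It also shows why redir /2bredirected/ / cannot strip the prefix: the `to` value is sent as-is, whatever the request path was.

```go
package main

import (
	"fmt"
	"strings"
)

// notEqual sketches `if {uri} not {rewrite_uri}`: a plain string comparison,
// with no regex involved.
func notEqual(a, b string) bool {
	return a != b
}

// simpleRedir sketches `redir [from] [to]` as described in this reply:
// a prefix check on the request URI. The Location value is always `to`;
// nothing from the matched URI is carried over.
func simpleRedir(uri, from, to string) (string, bool) {
	if strings.HasPrefix(uri, from) {
		return to, true
	}
	return "", false
}

func main() {
	// The prefixed URI matches, but the redirect target is just "/".
	loc, ok := simpleRedir("/2bredirected/b/a", "/2bredirected/", "/")
	fmt.Println(loc, ok)

	// The `not` check is a single comparison of two strings.
	fmt.Println(notEqual("/a/blahb/shoo", "/b/a"))
}
```

Both checks are one string operation per request, which is why the difference between the two configs is negligible.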

The Replacer that turns a placeholder into its held value is likewise quite efficient. I actually personally benchmarked it once in a PR I submitted to accommodate escaping (such as for JSON purposes): https://github.com/mholt/caddy/pull/2075#issuecomment-374066714

Awesome! We always love to hear about users employing (or seeking to employ) Caddy to serve high traffic use cases.

Thanks Matt. I’ve deployed Caddy on the static content server, and it’s working fine. That server is under a heavier load, as you can imagine. For some reason, there are no CPU spikes like the ones I saw with Nginx.

One of the reasons why I haven’t put Caddy on the main (HTML) server is that I’m using the Google PageSpeed module to inline critical CSS, thus improving page render time.

I don’t know how important it is to actual user experience, but Google seems to make a big deal of inlining critical CSS. I don’t know if there’s a WP plugin that’ll give me the same capability, without having to depend on the pagespeed module. I am not very keen on pagespeed module because it does a lot of rewrites and on-the-fly checking, which I’m sure is taxing.

I’m sure it is taxing, but bear in mind: A WordPress plugin will be doing the exact same thing, but in PHP! :scream: A web server module is likely much faster than that!

As an occasional web developer myself, I would probably just run my pages through one of the free tools that spits out the critical inline CSS and inline it manually myself :smiley:

But if it can be done by the app, then it can also be cached. When it’s done by PageSpeed, it’s not cached, as it’s done at the very last moment, just before the content leaves the server. So it does this every time, for every user, whereas if PHP were doing it, it would be done only once.
It’s not easy to do inlining manually, because we create around 20 new pages every day. And we don’t have a full-time coder. I’m not a coder myself :sweat:

Ahh, unfortunate. Well, it’s all about the tools available to you, then. There are a few WP plugins that do above-the-fold / critical CSS inlining / reflow prevention.

If you can get that generated and cached, you’ll definitely be doing much better than reprocessing each time. Best to cache using a WP plugin as well, rather than trying to use Caddy cache or something like Varnish; a WP cache plugin will come fully configured with what (and more importantly, what not) to cache, heh.
