Hi! I’ve got a weird problem: Google indexed /url// for my site, and now (because of how I use relative paths in the generated HTML) it is trying to index /url/url/... and everything under it, which is kind of annoying to see (since those requests obviously give 404). So I thought I’d just redirect those double slashes to a single slash, right? Like this:
With that config, curl -I http://localhost:1234/url// returns 404 rather than a redirect. Any ideas, please? I’ve seen the thread “Collapsing multiple forward slashes in path only”, but reading it did not enlighten me. My config seems to be done in a sane way. Am I missing something obvious?
Ah, the problem is that we clean the path before passing the request to the path_regexp matcher, as a protection against maliciously crafted requests, so the matcher never sees the double slash. This guards against an attack that can bypass matchers (e.g. when a matcher is used to protect a certain path with basicauth, we don’t want someone to make a request like //foo/bar when the matcher is ^/foo, which would skip past authentication).
The workaround is kinda janky: it uses an example from the uri docs to collapse the slashes with the uri directive (instead of a matcher), then uses an expression matcher to check whether the path has changed from the original, and if so, redir to the rewritten path. The route is necessary because uri is ordered after redir.
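For anyone landing here later, a minimal sketch of what that description could look like as a Caddyfile. This is a reconstruction, not necessarily the exact config that was posted. It assumes a Caddy version where uri path_regexp is available (v2.5 or later, as far as I know); the :1234 address is taken from the curl command above, the matcher name @collapsed is made up for illustration, and making the redirect permanent is my assumption.

```Caddyfile
:1234 {
	# route keeps the directives in the order written here;
	# without it, redir would be sorted before uri.
	route {
		# Collapse any run of slashes in the path to a single slash
		# (the query string is left alone).
		uri path_regexp /{2,} /

		# Hypothetical matcher name: true when the rewritten path
		# no longer equals the path the client originally sent.
		@collapsed expression {http.request.orig_uri.path} != {http.request.uri.path}

		# Send the client to the rewritten URI (path plus query).
		redir @collapsed {uri} permanent

		# ... the rest of the site's handlers go here
	}
}
```

With something like this in place, curl -I http://localhost:1234/url// should come back with a redirect to /url/ instead of a 404, though I'd test it against your own setup before relying on it.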
I realized that there is only a single URL with this problem, so I literally did redir /url// /url/ permanent, and that works for now. I may need to revisit this if Google invents another one.