Catch-all too greedy?

teodorescuserban · December 14, 2023, 5:38pm

1. The problem I’m having:

I am using Caddy to redirect multiple old domains and/or paths.
However, while the HTTPS part works well, the HTTP requests just get served by my catch-all rule (which redirects to an error page).

If I replace each domain name that I need to redirect with http://domain, https://domain in the config, everything works fine.

I was sure somehow that a domain block is the same as http://domain, https://domain. Unfortunately I was not able to find a clear explanation on why and how.

2. Error messages and/or full log output:

No error logs

3. Caddy version:

2.7.5

4. How I installed and ran Caddy:

built and ran a docker image

a. System environment:

Linux, docker.

b. Command:

caddy run --config /etc/caddy/Caddyfile --adapter caddyfile

c. Service/unit/compose file:

d. My complete Caddy config:

This config works properly over HTTPS, but over HTTP it only serves the error pages defined in my :80 block:

:80 {
    redir * https://error-pages.newdom.local 301
}

old1.example.local {
    redir /oldpath1* https://newpath1.newdom.local 301
    # ...
    redir * https://error-pages.newdom.local 301
}

old2.example.local {
    redir * https://some.random.domain.local 301
}

This one seems to work properly on both HTTP and HTTPS:

:80 {
    redir * https://error-pages.newdom.local 301
}

http://old1.example.local, https://old1.example.local {
    redir /oldpath1* https://newpath1.newdom.local 301
    # ...
    redir * https://error-pages.newdom.local 301
}

http://old2.example.local, https://old2.example.local {
    redir * https://some.random.domain.local 301
}

5. Links to relevant resources:

teodorescuserban · December 14, 2023, 6:03pm

I’m reading through the pull request “caddyhttp: Implement better logic for inserting the HTTP->HTTPS redirs” by francislavoie (Pull Request #4033, caddyserver/caddy on GitHub) with hope, but I must be missing something…

francislavoie · December 14, 2023, 8:16pm

The order of the routes in the HTTP server when HTTP->HTTPS redirects are enabled is:

1. User-defined site starting with http:// (e.g. http://example.com)
2. HTTP->HTTPS redirects for HTTPS site addresses (e.g. example.com)
3. User-defined catch-all site (e.g. http:// or :80)
4. Always-included fallback catch-all redirecting HTTP traffic to HTTPS using the incoming Host header
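
To make the four tiers concrete, here is a minimal Caddyfile sketch with one site for each tier a user can define. The hostnames legacy.localhost and secure.localhost are made up for illustration, and the comments only restate the ordering listed above.

```
# (1) Explicit HTTP site: checked first on the HTTP server.
http://legacy.localhost {
    respond "plain HTTP site"
}

# (2) HTTPS site: an HTTP->HTTPS redirect for this host is
#     inserted on the HTTP server, after any (1) sites.
secure.localhost {
    respond "HTTPS site"
}

# (3) User-defined HTTP catch-all: only reached by requests whose
#     Host did not match a (1) site or a (2) redirect.
:80 {
    redir https://error-pages.newdom.local{uri} 301
}

# (4) is never written in the Caddyfile; it is the always-included
#     fallback redirect that sits after everything above.
```

With this layout, a request for http://secure.localhost should be answered by the tier (2) redirect before the :80 block is ever consulted.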

teodorescuserban:

this config works properly over https, but it only serve the error pages defined in my :80 block on http

:80 {
    redir * https://error-pages.newdom.local 301
}

old1.example.local {
    redir /oldpath1* https://newpath1.newdom.local 301
    # ...
    redir * https://error-pages.newdom.local 301
}

old2.example.local {
    redir * https://some.random.domain.local 301
}

In this case, you have no (1), your two HTTPS sites are (2), and you have user-defined (3), and (4) is always included.

Caddy will serve a redirect from http://old1.example.local/ to https://old1.example.local/ first, and then the client after connecting to HTTPS will be served with the redirect from https://old1.example.local/ to https://error-pages.newdom.local/.

When testing, make sure you use curl -vL (-L is short for --location and makes curl follow redirects). This is working as intended.

Also, you probably want to add {uri} at the end of all your redirects, to preserve the request URI, otherwise it gets dropped completely from the request. For example:

redir https://error-pages.newdom.local{uri} 301

teodorescuserban · December 14, 2023, 8:30pm

Well, simple examples work. Once things get a bit more complicated, with multiple domains and subdomains per address block, the HTTP requests fall through. I was not able to find any pattern here.

The only consistent behaviour I found was to replace the simple domain name in the address block with http://domain, https://domain. In this case, no fall-through occurs, but there is also no redirect to HTTPS.

When things “don’t work”, the JSON config has no routes for the HTTP listener except the fallback one.

{
    "apps": {
        "http": {
            "servers": {
                "srv0": {
                    "listen": [
                        ":443"
                    ],
                    "routes": [
                        "..."
                    ]
                },
                "srv1": {
                    "listen": [
                        ":80"
                    ],
                    "routes": [
                        {
                            "handle": [
                                {
                                    "handler": "subroute",
                                    "routes": [
                                        {
                                            "handle": [
                                                {
                                                    "handler": "static_response",
                                                    "headers": {
                                                        "Location": [
                                                            "https://error-pages.fallbackddomain.org"
                                                        ]
                                                    },
                                                    "status_code": 301
                                                }
                                            ]
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                }
            }
        },
        "tls": {
            "automation": {
                "policies": [
                    {
                        "issuers": [
                            {
                                "module": "internal"
                            }
                        ]
                    }
                ]
            }
        }
    },
    "logging": {
        "logs": {
            "default": {
                "encoder": {
                    "format": "json",
                    "time_format": "iso8601"
                },
                "writer": {
                    "filename": "/logs/main.log",
                    "output": "file"
                }
            }
        }
    }
}

francislavoie · December 14, 2023, 8:45pm

It’s because Automatic HTTPS adds the routes at runtime, after the config is loaded. If you enable the debug global option, you’ll see a log http.auto_https adjusted config which outputs the transformed config (but it doesn’t encode the new catch-all routes correctly and they just show up as {} in the HTTP server’s routes; they work correctly though).
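
For reference, a minimal sketch of turning that option on. The global options block has to be the first block in the Caddyfile; the rest of your config stays as it is.

```
{
    # Enables debug-level logging, including the http.auto_https
    # "adjusted config" entry mentioned above.
    debug
}

# ... your existing site blocks follow unchanged ...
```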

You haven’t shown a reproducible case. The config you showed earlier seems to work as intended, so I’m not sure what to tell you.

teodorescuserban · December 14, 2023, 8:46pm

I’m only using curl -IX GET to check the Location header.

Yes, thank you. I am using it where needed.

For simple configurations it does. Until it doesn’t. Since the config is completely under my control (I don’t have to deal with external imports etc.), I’ll probably just create two address blocks for every domain: the first one for HTTP with the redirect to HTTPS, and the second one as before.

Should do everything I need.

I would still be interested to hear what suddenly happens to the routes when I add “something extra”.

I cannot post the actual config (I know, I am sorry), but I would still be interested to hear your opinion, @francislavoie about why removing just one address block makes it work fine, but adding it back (in any position in the config, even changing the domain names, even re-typing everything by hand, although I am using a jinja template to generate the whole thing) makes all http routes besides the fallback disappear.

francislavoie · December 14, 2023, 8:46pm

That’ll only show you the first redirect, but not the second one. When you request HTTP, it’ll get redirected to HTTPS, and then from HTTPS->HTTPS using your configured routes. You need to use -L to actually see the full chain.

teodorescuserban · December 14, 2023, 9:17pm

My point was that the first curl request is already going off the rails. I did just try with -L. There’s no turning back. The second request just shows the headers coming from GitHub Pages (where the fallback error pages are).

For reference, this is the main part of the template:

http:// {
        import header
        import logging
        import bad_agents
        import bad_paths
        handle {
                redir https://error-pages.fallbackdomain.org 301
        }
}

{% for item in src %}
{{ item.domains|join(', ') }} {
        import header
        import logging
        import bad_agents
        import bad_paths
        handle {
                {% for redirect in item.redirects %}
                redir {{ redirect.matcher }} {{ redirect.target }} 301
                {% endfor %}
        }
}

{% endfor %}

and this is the change that fixes it:

 {% for item in src %}
+{% if use_https != 'off' %}
+{% for domain in item.domains %}
+http://{{ domain }} {
+    redir https://{{domain}}{uri}
+}
+{% endfor %}
+{% endif %}
+
 {{ item.domains|join(', ') }} {

teodorescuserban · December 14, 2023, 9:21pm

My initial guess was that:

host1.domain.local {
  respond "boo"
}
host2.domain2.local, host14.domain3.local {
  respond "whee"
}

was ALWAYS the same as:

http://host1.domain.local {
  redir https://host1.domain.local{uri}
}
https://host1.domain.local {
  respond "boo"
}

http://host2.domain2.local {
  redir https://host2.domain2.local{uri}
}

http://host14.domain3.local {
  redir https://host14.domain3.local{uri}
}

https://host2.domain2.local, https://host14.domain3.local {
  respond "whee"
}

but it is not when no other HTTP site is defined besides the fallback http:// { ... }

teodorescuserban · December 14, 2023, 9:24pm

Yeah, it is strange. I have a 6K source.txt with the domains and the redirects. Only one of them is problematic and there is nothing special about that line. I have more lines like that, but only that one has issues, even with the domain names changed on that line, even when it is moved within the file. I’ve tried everything :))

Thank you for the 1-2-3-4 explanation. I missed 1. I’ll make sure it won’t happen in these cases.

francislavoie · December 14, 2023, 9:31pm

I don’t understand what you mean by this.

Please show the redirect you’re seeing with curl and what you expected to see.

Also do this:

Show what you see in your adjusted config.

teodorescuserban · December 14, 2023, 10:20pm

Config not working (the error pages are on GitHub Pages):

curl -sLIX GET http://oldsite1.olddomain.com --resolve oldsite1.olddomain.com:80:127.0.0.1
HTTP/1.1 301 Moved Permanently
Location: https://error-pages.theerrorpagesdomain.com
Date: Thu, 14 Dec 2023 21:50:15 GMT
Content-Length: 0

HTTP/2 200
server: GitHub.com
content-type: text/html; charset=utf-8
...

If I remove one particular address block from the config (which has nothing to do with the domain I am testing):

curl -sLIX GET http://oldsite1.olddomain.com --resolve oldsite1.olddomain.com:80:127.0.0.1
HTTP/1.1 308 Permanent Redirect
Connection: close
Location: https://oldsite1.olddomain.com/
Server: Caddy
Date: Thu, 14 Dec 2023 21:58:40 GMT
Content-Length: 0

HTTP/2 301
alt-svc: h3=":443"; ma=2592000
location: http://newsite1.newdomain1.com
content-length: 0
date: Thu, 14 Dec 2023 21:58:41 GMT

HTTP/2 200
content-length: 3152
content-type: text/html

When the config “does not work”, the autosave.json looks like this - there is a single route on the port 80 listener which is the default “catch-all” redirect to the error pages:

        "http": {
            "servers": {
                "srv0": {
                    "listen": [
                        ":443"
                    ],
                    "logs": {
                        "logger_names": { "..." }
                    },
                    "routes": [ "..." ]
                },
                "srv1": {
                    "listen": [
                        ":80"
                    ],
                    "logs": {
                        "default_logger_name": "log0"
                    },
                    "routes": [
                        {
                            "group": "group973",
                            "handle": [
                                {
                                    "handler": "subroute",
                                    "routes": [
                                        {
                                            "handle": [
                                                {
                                                    "handler": "static_response",
                                                    "headers": {
                                                        "Location": [
                                                            "https://error-pages.theerrorpagesdomain.com"
                                                        ]
                                                    },
                                                    "status_code": 301
                                                }
                                            ]
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                }
            }
        }

Hm, actually, with the config “that works” I see the same content for the port 80 listener.

Would it be possible to send you the full logs privately?

francislavoie · December 14, 2023, 10:47pm

teodorescuserban:

config not working (the error pages are on github pages):

curl -sLIX GET http://oldsite1.olddomain.com --resolve oldsite1.olddomain.com:80:127.0.0.1
HTTP/1.1 301 Moved Permanently
Location: https://error-pages.theerrorpagesdomain.com

Okay so you’re saying that your http:// catch-all is being used instead of the automatic HTTP->HTTPS redirect for that domain?

In your example in OP you had redir * https://error-pages.newdom.local 301 in old1. Are you sure that’s not just working as intended?

I need a minimally reproducible example. Can you try to replicate it using *.localhost domains? (curl and browsers handle *.localhost correctly automatically, resolving to 127.0.0.1).
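
Something along these lines could serve as a starting point. It is only a sketch that maps the config from the first post onto *.localhost names, all of which are placeholders, and it may well not trigger the problem until the offending block is added.

```
# User-defined HTTP catch-all, as in the original config.
:80 {
    redir * https://error-pages.localhost 301
}

# HTTPS site with a path-specific redirect plus a fallback redirect.
old1.localhost {
    redir /oldpath1* https://newpath1.localhost 301
    redir * https://error-pages.localhost 301
}

# Second HTTPS site; add further blocks one at a time until
# the HTTP behaviour changes.
old2.localhost {
    redir * https://some-target.localhost 301
}
```

Requesting http://old1.localhost with curl -v should then show which redirect answers first on port 80.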

teodorescuserban · December 14, 2023, 11:19pm

Yes.

Yes, it is not working as intended. The error pages should only show up if you arrive at Caddy with a host/domain that is not already configured.

I’ll try to provide you with one tomorrow. Thank you very much for your help!

system · January 13, 2024, 11:20pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.