How to log outgoing requests?

Hello Choucri,

About the log, I don’t think I can help much.
I just read "How Logging Works" and, of course, "log (Caddyfile directive)" in the Caddy documentation.

I use the debug mode at the top of my Caddyfile (or in the log directive block), but I don’t think this is what you want.

For the second question, I think I can help. Here is an example of catching a specific query and redirecting on it:

localhost {

  # [...] the rest of your config

  log {
      output file /var/log/caddy/access.log
      format json
      level  debug
  }

  handle {
    @query_1 {
      expression {http.request.uri.query} == 'q=test'
    }
    redir @query_1 {path}?query=TEST
  }
}

@query_1 is the named matcher for the specific query you want to catch. It uses a logical expression based on the placeholder {http.request.uri.query}. See "JSON Config Structure" and "Conventions" in the Caddy documentation.

Because this matcher is placed inside the catch-all handle block, it applies to every path you request on localhost (note that the redir directive keeps the {path} value in its target argument).

It’s a start as you may want to forward the value or key name for that specific query.

The documentation is really helpful \o/

If you want to catch all ?q= queries and redirect them to ?query= while keeping the value:

@query_1 {
    expression {http.request.orig_uri.query}.contains('q=')
}
redir @query_1 {path}?query={http.request.uri.query.q} 

Have fun!

Alex

Hi Alex,

Thank you very much for your quick response! Indeed, I’ve been through the links you mention and used the log directive, I even switched to JSON hoping to activate more logs, but no success…

Speaking of my other problem, basically I want to transform an incoming request:

http://domain1.com/foo/bar?baseUrl=http://domain2.com/p6&a=b
Into this: http://domain2.com/p6/foo/bar?a=b

I want to extract the value of the query parameter baseUrl and append the URI to it while removing the baseUrl query parameter, then issue the resulting request.

The closest I could get to, testing on my local machine was thanks to this config:

localhost {
    @p6core {
        # I wish there was a way to catch this value
        # Unfortunately I had to put * just to test the presence of the parameter
        query baseUrl=*
    }
    route @p6core {
        rewrite * /p6{uri}
        # p6core:8080/p6 is the value of the `baseUrl` query parameter in this specific case, but it can be anything else
        # I had to hard code the value rather than read `baseUrl`
        reverse_proxy p6core:8080
    }
}

As you can see, I couldn’t find a way to:

  • Read the value of a query parameter and re-use it to rewrite my request
  • Parse a string into a URL then split it into a domain and a URI
  • Remove a query parameter
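To make the three missing pieces above concrete, here is a minimal Python sketch of the same transformation done outside Caddy (for instance in the app or a small script). It is not Caddy configuration; the function name `rewrite_target` is made up for illustration, and it assumes baseUrl arrives in the query string as in the example request.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode


def rewrite_target(incoming_url: str, param: str = "baseUrl") -> str:
    """Build the outgoing URL from an incoming URL carrying a base-URL parameter."""
    incoming = urlsplit(incoming_url)
    pairs = parse_qsl(incoming.query, keep_blank_values=True)

    # 1. Read the value of the query parameter.
    base = next((value for key, value in pairs if key == param), None)
    if base is None:
        return incoming_url  # nothing to rewrite

    # 2. Parse it as a URL and split it into an origin and a path prefix.
    target = urlsplit(base)
    origin = f"{target.scheme}://{target.netloc}"
    prefix = target.path.rstrip("/")

    # 3. Drop that parameter and keep every other one.
    remaining = urlencode([(k, v) for k, v in pairs if k != param])

    url = f"{origin}{prefix}{incoming.path}"
    return f"{url}?{remaining}" if remaining else url


print(rewrite_target("http://domain1.com/foo/bar?baseUrl=http://domain2.com/p6&a=b"))
# http://domain2.com/p6/foo/bar?a=b
```

The split in step 2 is the part that matters for a proxy: the origin says where to connect, and the prefix is what has to be prepended to the request path.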

Thanks again for your help!
Choucri

Hi Choucri,

You are welcome!
Just to be clear: I’m new to Caddy too.
Anyway, someone will correct me if I’m going too far :stuck_out_tongue:.

I thought your question was on a general level but you have a very specific need actually.
Let’s play a bit with your query then!

Here is what I got working :

Caddyfile
localhost {

  # [...] rest of the config

  handle /foo/bar/ {
    @query_p6 {
      query baseUrl=*
    }
    redir @query_p6 {http.request.uri.query.baseUrl}{path}?a={http.request.uri.query.a}
  }

  handle {
    templates
  }
}

localhost:3000 {

  # [...] rest of the config

  handle {
    @query {
      expression {http.request.uri.query}.contains("a=b")
    }
    respond @query "You got here from localhost:80 !!"
    respond "localhost:3000 default handling"
  }

} 

I did this using localhost and localhost:3000, but it shouldn’t be a problem to use something else.
The second block doesn’t really matter (it is only there to confirm that we arrived through our custom localhost redirection).

You can then request localhost/foo/bar/?baseUrl=http://localhost:3000/p6&a=b and you’ll be redirected to localhost:3000/p6/foo/bar/?a=b.

We can only propagate the ‘a’ parameter with this one…

Still not what you want right? :slight_smile:.
I don’t know yet how to manipulate the query on the fly: on the “redir @query_p6” line we would have to use the baseUrl value AND remove it from the query string, so that every other parameter is propagated except that one. If that doesn’t matter to you, you can propagate all of it (it’s just a bit ugly).
The corresponding line would be :

redir @query_p6 {http.request.uri.query.baseUrl}{path}?{http.request.uri.query}

And we get redirected toward localhost:3000/p6/foo/bar/?baseUrl=http://localhost:3000/p6&a=b. (keeping all parameters)

I hope I don’t get lost in this and it is clear enough as I didn’t follow your exact example :slight_smile:.

Alex

Edit: I’m reading the CEL language definition (cel-spec/langdef.md in the google/cel-spec repository on GitHub).
Is it possible to define a variable in the Caddyfile scope? Could be a lead.

I think you have the general idea here, @dukeart. I don’t think it’s possible to do a rewrite as asked, because rewrites only operate on the request path and not the full URL. A redirect is a possible workaround but wouldn’t work if that domain is meant to be internal-only.

I’m worried that if we can’t remove the baseUrl from the query, then you could easily end up in a redirect loop. I can’t think of a good way to do that right now either. Really feels like most of this logic should be done at the app level rather than at the web server level.

We decided for the time being to only support CEL expressions that return a boolean value. It’s certainly possible to improve though. I’m not sure I see how we’d write a CEL expression that sets a value but also returns a boolean result as required by matchers.

I tried a few things, but yes, it gets more complicated for nothing :slight_smile:. Not to mention the loop cases, indeed.

You mean like a package? Or directly through the json config?

No, I mean at the app being proxied to, or in the client-side JS that I assume is making the requests in the first place.

Oh right, of course. I agree, if not necessary at the server level we are good enough with the redirection :slight_smile:.

Thank you so much for all your feedback!

I didn’t know I could read the baseUrl parameter this way: {http.request.uri.query.baseUrl}! Thanks for the tip!

To give a bit more context, our product is offered to many customers. Every customer has its own backend server, but all customers use the same UI. This is why we have a reverse proxy in place. And thanks to the baseUrl query parameter, we manage to route the request to the appropriate backend.

These backends are behind firewalls and are not accessible from the Internet. Therefore, redirects cannot work. We have to rewrite the request and forward it. This is why logging outgoing requests is so helpful, and this is why I cannot use the redir directive.

For information, this is how it could be implemented in Nginx (without dropping the query parameter):

server {
    if ($arg_baseUrl) {
        rewrite ^(.*)$ /baseUrl/$1;
    }

    location ~ /baseUrl/ {
        resolver 8.8.8.8; # Needed because of dynamic URL in proxy_pass
        proxy_pass $arg_baseUrl$request_uri;
    }
}

I wish the reverse_proxy or rewrite directives were more powerful, or could be merged into a single directive, similarly to Nginx. Plus the ability to drop a query parameter as easily as dropping a header.

Thanks again! It’s a shame I can’t go further for now, I really like the automatic HTTPS feature! It’s what brought me to Caddy in fact.

Choucri

Oh, well something like this might work then:

@hasBaseUrl {
    query baseUrl=*
}
rewrite @hasBaseUrl /baseUrl/{path}

reverse_proxy /baseUrl/* {http.request.uri.query.baseUrl}

Just a guess.

I think this might fail if the baseUrl has a path on it though. If you were able to split the baseUrl into something like baseDomain and basePath you might get better results.
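As a rough illustration of that split, here is a small Python sketch (not Caddy configuration). The names `base_domain` and `base_path` simply mirror the baseDomain and basePath idea above and are hypothetical; the input is the baseUrl value used in this thread.

```python
from urllib.parse import urlsplit


def split_base_url(base_url: str) -> tuple[str, str]:
    """Split a base URL into a host[:port] part and a path prefix."""
    parts = urlsplit(base_url)
    base_domain = parts.netloc          # host and optional port, no scheme
    base_path = parts.path.rstrip("/")  # empty string when there is no path
    return base_domain, base_path


print(split_base_url("http://p6core:8080/p6"))
# ('p6core:8080', '/p6')
```

The first value has the host:port shape an upstream address needs, and the second is the prefix that would go in front of the request path.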

I’m a little lost. I thought this was about logging. What is the problem now, is it with rewrites?

@matt we’re responding to this part of the original post:

Nice try! It doesn’t work either, I get the following error:

caddy     | {"level":"error","ts":1589237363.2443428,"logger":"http.log.error.log0","msg":"making dial info: upstream {http.request.uri.query.baseUrl}:: invalid dial address http://p6core:8080/p6:: address /p6core:8080/p6:: too many colons in address","request":{"method":"GET","uri":"/apis/v2/jobs/?baseUrl=http://p6core:8080/p6","proto":"HTTP/1.1","remote_addr":"172.26.0.1:35876","host":"localhost:8480","headers":{"Accept":["*/*"],"Accept-Language":["en,fr;q=0.8"],"Accept-Encoding":["gzip, deflate"],"Authorization":["Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpPU0UifQ.eyJpc3MiOiAiaHR0cDovL2FtYWx0by5jb20iLCAic3ViIjogImxvY2FsaG9zdC04NDgwIiwgImF1ZCI6IFsibG9jYWxob3N0LTg0ODAiXSwgImV4cCI6IDE1ODkyMzM3NjIsICJuYmYiOiAxNTg5MjMwMTYyLCAiaWF0IjogMTU4OTIzMDE2MiwgImp0aSI6ICJiMmp3dC0xLjAuMSIsICJ0eXAiOiAiSldUIiwgImh0dHA6Ly9hbWFsdG8uY29tL3JlYWxtIjogImIyIiwgImh0dHA6Ly9hbWFsdG8uY29tL3VzZXJfZW1haWwiOiAiY2hvdWNyaS5mYWhlZEBhbWFsdG8uY29tIiB9.KhGMNldDJTGwA7mGmA2hIb4tAmkOGXybAX33ROU9_w4xHa0POaMZiLrkfJXo0dMXPLRpPuvN_IxkFOGL22pch_vMGzZpkehvSiKiEg1AmwH6PE3Dw1yDOY3Q3AH-J0-I_Cuu9NSc14exfZ2H0qMFWYnFG7_-vtWxzqfda4rXynjBj7mgLQ6j5e9_rNW2fiCe48_80HWCNpsliAAkCNd8HzFeSpl3NaZ2FKMazdI7pNOWx1Hlrdm7AUIGjqp_XyQiH20gYkop9muAUs8AlvqX5UBMplarAC6OBfn4LE4u-1xEpk6RedSt8X6oVeMDDu_7byn9PIwpPa0OtiYR5l2WLg"],"Dnt":["1"],"User-Agent":["Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:76.0) Gecko/20100101 Firefox/76.0"],"Connection":["keep-alive"],"Referer":["http://localhost:8480/"]}},"duration":0.0000939}

Basically the value of {http.request.uri.query.baseUrl} is http://p6core:8080/p6:: instead of http://p6core:8080/p6, I don’t know where the extra 2 colons come from!

Sorry, I should have split the topic into 2. I tried editing the name of the topic but I can’t find the edit button. Sorry I’m still new to this forum.

That message is actually misleading, there’s just one : added, the other one is part of the error message (each nested error bubbled up is appended one after the other, usually separated by :).

I think what’s happening is that the reverse proxy code tries to assume a port for that upstream because it’s not able to parse it as a “host-port” segment (because there’s a path, and maybe also because of the scheme, in the string). Like I said earlier, I think you’d need to remove the /p6 path portion for this to be viable here.
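To show how one appended colon ends up looking like two, here is a tiny Python sketch that imitates the wrapping. It is not Caddy's actual code: the `wrap` helper is made up, and the assumption is only that an empty port gets joined on with ":" and that each layer prefixes the inner error with ": ".

```python
def wrap(context: str, err: str) -> str:
    """Prefix an inner error with some context, joined by ': '."""
    return f"{context}: {err}"


value = "http://p6core:8080/p6"  # what the placeholder resolves to
dial = value + ":"               # assumed: ':' plus an empty port is appended

message = wrap(f"invalid dial address {dial}", "too many colons in address")
print(message)
# invalid dial address http://p6core:8080/p6:: too many colons in address
```

So the placeholder value itself is intact; the first colon after /p6 belongs to the dial address, and the second is just the separator before the next error in the chain.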

So what exactly is your input and expected outputs?

For example, I want to boil it down to:

A request to https://example.com/foo/bar?a=b should be rewritten to /foo... with this config: … but actually the result is this instead: …

Then I will have a better idea of what the problem is. Thanks!

If it is still possible to edit the thread title, maybe rename it to something like “query parameter transport to host” and open a new one for the output log, as we didn’t deal with that yet. (With great power comes great responsibility :stuck_out_tongue:)

@choucrifahed @dukeart Ah, nevermind, I found it:

So, a few thoughts:

  1. This is really weird, I’ve never seen anything quite like this…
  2. I don’t understand how the above nginx config does it
  3. If we had a uri_re (URI regular expression) matcher that matched on the entire URI (URI is path + query string as a single string) would that work? It’s not hard to implement, but I want to make sure it’d be useful first.

(@francislavoie - we can split this thread into two, feel free – just choose a post where it’s a good spot to split and select all posts that should go with it.)

Strictly speaking, the closest approximation to nginx’s behavior here is something like @francislavoie’s attempt earlier. The issue is simply that while nginx handles the base URL + request URI as an acceptable upstream (presumably it takes it as an entire URL to make a request to, instead of preserving any original request URI), this doesn’t fly in Caddy, and especially not in the Caddyfile.

We do a bunch of parsing on the upstream in the Caddyfile to split it up properly into JSON to actually use as Caddy config. And we explicitly disallow URIs in our upstream addresses because that implies URI rewriting, which we keep as a separate and distinct operation.

That means we’d need to take the base URL and strip it out so it’s actually just a scheme, hostname, and port at most and handle the URI rewriting. But, since we’re effectively just manipulating strings at this point, we’re beyond any kind of automated URL parsing.

So… Basically, regex could do this. But boy, it’d be an interesting regex. You’re basically just parsing an arbitrary URL with regex at that point.

You’re not wrong, though, @matt - can’t say I’ve ever seen this one before now. You know what they say, learn something new every day.

If the client could just be configured to send the base URL without any URI on it, and include the URI maybe as some other parameter like baseUri, I think this would be orders of magnitude easier to achieve, e.g. by rewriting to {http.request.uri.query.baseUri}{http.request.uri} and then proxying upstream to {http.request.uri.query.baseUrl}.
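A rough Python sketch of that split-parameter idea, to show why it would be simpler: with the two values already separated, no URL parsing of baseUrl is needed at all. This is not Caddy configuration; `plan_proxy` is a made-up name and baseUri is the hypothetical parameter suggested above.

```python
from urllib.parse import urlsplit, parse_qsl


def plan_proxy(request_uri: str) -> tuple[str, str]:
    """Return (upstream, rewritten URI) for a request carrying baseUrl and baseUri."""
    params = dict(parse_qsl(urlsplit(request_uri).query))
    upstream = params["baseUrl"]                          # scheme://host:port only
    rewritten = params.get("baseUri", "") + request_uri   # prefix + original URI
    return upstream, rewritten


print(plan_proxy("/foo/bar?baseUrl=http://p6core:8080&baseUri=/p6&a=b"))
# ('http://p6core:8080', '/p6/foo/bar?baseUrl=http://p6core:8080&baseUri=/p6&a=b')
```

Note that the query string is carried over untouched here, just as {http.request.uri} would carry it, so the two extra parameters still reach the backend.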

@Alex I cannot edit the title of the post, I searched again this morning, it seems I don’t have sufficient rights for that. Sorry for that! If anybody has sufficient rights to do it, please do! I don’t mind splitting the thread in 2 either.

@Francis Unfortunately, I cannot remove the /p6 part, it really depends on the backend, it could be anything else.

@Matt I know the business logic is weird, but unfortunately I cannot change it. This URL rewrite scheme is the way we can serve all our customers the same UI hitting isolated backends potentially located in different cloud providers (Amazon, Digital Ocean…).

@Matthew Agreed. Unfortunately, I cannot change this business logic; we have dozens of customers that run various versions of our software, and keeping backward compatibility is paramount. As I said in an earlier post, maybe in the future if the reverse_proxy directive were enriched with URL rewriting capabilities, it would solve the problem.

I cannot thank you enough for all your help! I feel frustrated not to move forward with Caddy, because beyond automatic HTTPS, great doc, easy to learn… it has an awesome community!
