Dynamically set reverse_proxy upstreams from custom module

I am working on a custom module that integrates with a popular service discovery tool. The module will essentially act as an ingress and load balancer based on the request.

Instead of implementing the reverse proxy using my own httputil.NewSingleHostReverseProxy(url) I would prefer to get the hard work, benefits, and options that are offered in the core reverse_proxy module.

  1. Is it a good idea to continue my use case of httputil.NewSingleHostReverseProxy(u) despite the lack of multiple upstreams?
  2. If I ordered the custom module to be triggered before the reverse_proxy, is there a way to “set” a variable in the config upon request that I can pass onto the reverse_proxy?
  3. Is there a way to use the caddyhttp.Handler that is provided in the caddyhttp.MiddlewareHandler?

I know this is a custom module, but again would love to get some of the benefits from the awesome work already completed in reverse_proxy.

Cool!

You can embed the reverse proxy module in your own module; this would probably be much better. In config it looks like a wrapper.

But depending on what you’re doing you might want to implement a reverse proxy transport module. Can you provide more details about what your module does? Specifically?

I’m mobile right now but will answer your other questions when I get to my computer.

Thanks Matt!

One of the challenges I often find with service mesh/discovery tools is handling ingress capabilities. A lot of these tools, in this case Consul, are great at providing an API and DNS server for internal discovery, but they lack a real integration with a web server for ingress (e.g. external domain name points to this internal service).

What the module I am working on does is resolve the domain name (using a Caddyfile) from the Consul API to get the IPs and ports where the internal service is available, and then reverse proxy the request, also adding some useful headers, etc. However, that service might be available on more than one IP/port assignment. So httputil.NewSingleHostReverseProxy does not really seem ideal; I could implement my own mini load balancer, but that does not seem like a smart idea.

I just found this page with the Module Namespaces and I’m guessing that I should focus on implementing http.reverse_proxy.transport?

The transport layer is separate from the upstream pool. The proxy module first makes a decision using the upstream pool as to which backend to contact, then once having done that, sends it through the configured transport.
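To make that split concrete, here is a minimal sketch using only the Go standard library. It is not Caddy's code: the pool type, its next method, newProxy, and the stub roundTripFunc transport are all hypothetical names invented for illustration, and round-robin is just one assumed selection policy. The point is only that picking a backend and performing the round trip are two separate steps.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"sync/atomic"
)

// pool is a hypothetical round-robin upstream pool (not a Caddy type).
type pool struct {
	count uint64 // number of selections made so far
	addrs []string
}

// next picks the next backend address in round-robin order.
func (p *pool) next() string {
	i := atomic.AddUint64(&p.count, 1) - 1
	return p.addrs[i%uint64(len(p.addrs))]
}

// roundTripFunc adapts a plain function to http.RoundTripper.
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// newProxy builds a proxy that first selects a backend from the pool and
// then hands the request to the given transport for the actual round trip.
func newProxy(p *pool, rt http.RoundTripper) *httputil.ReverseProxy {
	return &httputil.ReverseProxy{
		Director: func(r *http.Request) {
			r.URL.Scheme = "http"
			r.URL.Host = p.next() // step 1: upstream selection
		},
		Transport: rt, // step 2: the round trip itself
	}
}

func main() {
	p := &pool{addrs: []string{"10.0.0.5:8080", "10.0.0.6:8080"}}

	// Stub transport: no network involved, it only reports which
	// backend the request would have been sent to.
	stub := roundTripFunc(func(r *http.Request) (*http.Response, error) {
		return &http.Response{
			StatusCode: http.StatusOK,
			Header:     http.Header{"X-Upstream": {r.URL.Host}},
			Body:       http.NoBody,
			Request:    r,
		}, nil
	})

	proxy := newProxy(p, stub)
	for i := 0; i < 3; i++ {
		rec := httptest.NewRecorder()
		proxy.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "http://example.com/", nil))
		fmt.Println(rec.Header().Get("X-Upstream"))
	}
}
```

Swapping the stub for a real http.RoundTripper changes how the request is sent, but not which backend is chosen, which is why a transport module is the wrong place to pick upstreams.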

What Matt was suggesting is making your plugin have the reverse_proxy module be a submodule of it. I.e. your plugin would be a parent to the reverse_proxy module. Your plugin would be an HTTP handler module which embeds a reverse_proxy.

You can use placeholders + the replacer to set data in the request context which can later be reused in the handler pipeline. You might set a placeholder value like http.your_custom_proxy.actual_upstream, and then you’d configure the reverse_proxy to use {http.your_custom_proxy.actual_upstream} as the placeholder for the dial address.

Alternatively, it might be possible to generalize the “upstream decision” part of the reverse_proxy module to make it pluggable, so a different type of “decider” could be configured. This might be kinda tricky though, and the value might be limited. But it’s an idea, if you’re willing to give it a shot.

Okay, back at the computer!

On your first question: probably not; the standard lib reverse proxy has many limitations. Our reverse proxy (particularly the streaming code that shuttles bytes between sockets) is loosely based on that one, but our demands are more rigorous. I rewrote the v2 reverse proxy from scratch, bringing over only the portions that were useful and already well-vetted. Even some of that needs work, for example handling errors and tuning the flushing.

On your second question: yep! Seems that @francislavoie has answered this pretty well already. I will clarify, though, that the variable need not be a placeholder per se; placeholders are exposed to the user, which may or may not be what you want. Caddy has a “bucket” of variables you can set on a request, and the caddyhttp package has these nifty functions called GetVar and SetVar.

Under the hood, it just uses the request’s context value, but it wraps up the getting/setting a little more elegantly. (Of course, we subvert some of Go’s standard context conventions, but we also aren’t a standard library per se and we’re allowed to just say “Well, don’t do that” when trying to use it improperly. Anyway. Not a big deal.)

Actually, these variables are also exposed as placeholders, as {http.vars.*} but that’s a side-effect of using vars, as opposed to using placeholders as the actual solution. I’d recommend using vars.

On your third question: I’m not sure what you mean by this, nor why this is useful to you?

Francis already clarified, but I will go into more detail.

The reverse_proxy module literally embeds the headers module.

That exposes all the same config surface to the user as if they were configuring the actual Headers handler separately.

Then we also wrap the provisioning.

This sets it up when the server starts (or the config loads).

And then we finally use it when proxying, once for the request, once for the response.

And that’s it!

So, you can, if you need to, embed the reverse_proxy handler in your module if you find it useful. I just showed you one example of where something similar is done.

A few suggestions before going much further:

  • Always think in terms of JSON config first, then Caddyfile second. The JSON config is what really matters. One way or another, you can probably express it in Caddyfile, and it’s OK to think about how it would look in the Caddyfile, but always start with the JSON structure.

  • Think of the right place for the functionality you want to accomplish. It sounds like you need to manipulate the list of Upstreams. One way to do this is external to the entire config, i.e. some sort of admin plugin that can watch the Consul API and get notified when upstreams change, then update the config (add/remove upstreams to the list in the JSON) and reload it. Another way would be to make a type that literally embeds the reverseproxy.Handler type (so that the JSON config looks exactly the same) but all it does is set the Upstreams list for the user:

type Handler struct {
    *reverseproxy.Handler
    Consul string `json:"consul,omitempty"`
}

Or something like that. But, I am still unclear on the exact requirements and vision, so maybe this is how it works, or maybe not. Just an idea.
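To illustrate why embedding keeps the JSON shape the same, here is a small standard-library sketch. Upstream and ProxyHandler are simplified stand-ins for the real reverseproxy types (which need the Caddy module to compile), and decode and setUpstreams are hypothetical helpers; a real module would look up the addresses from Consul rather than receive them as an argument.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Upstream and ProxyHandler are simplified stand-ins for the
// reverse_proxy types; they are not the real Caddy definitions.
type Upstream struct {
	Dial string `json:"dial,omitempty"`
}

type ProxyHandler struct {
	Upstreams []*Upstream `json:"upstreams,omitempty"`
}

// Handler embeds the proxy handler, so the proxy's JSON fields appear
// at the same level as the wrapper's own "consul" field.
type Handler struct {
	*ProxyHandler
	Consul string `json:"consul,omitempty"`
}

// decode parses a config and guarantees the embedded handler is non-nil.
func decode(raw string) (*Handler, error) {
	h := new(Handler)
	if err := json.Unmarshal([]byte(raw), h); err != nil {
		return nil, err
	}
	if h.ProxyHandler == nil {
		h.ProxyHandler = new(ProxyHandler)
	}
	return h, nil
}

// setUpstreams replaces the upstream list with the given addresses,
// e.g. ones discovered for the configured service.
func (h *Handler) setUpstreams(addrs []string) {
	if h.ProxyHandler == nil {
		h.ProxyHandler = new(ProxyHandler)
	}
	ups := make([]*Upstream, 0, len(addrs))
	for _, a := range addrs {
		ups = append(ups, &Upstream{Dial: a})
	}
	h.Upstreams = ups
}

func main() {
	h, err := decode(`{"consul": "web"}`)
	if err != nil {
		panic(err)
	}
	h.setUpstreams([]string{"10.0.0.5:8080", "10.0.0.6:8080"})

	out, err := json.MarshalIndent(h, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because the embedded fields are promoted, the user writes "upstreams" and "consul" side by side in one JSON object instead of nesting a separate proxy object.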

On second thought, I do not think a Transport module is right for this use case. A Transport is for doing a RoundTrip, not choosing upstreams.
