Custom select policy for reverse_proxy module

I want to implement a selection policy for reverse_proxy that reads the upstream config from a request header. The background is something like an API gateway that manages many APIs across many application servers: the gateway first gets the API info from a database or a cache, then uses reverse_proxy to proxy the request. Since Caddy cannot load modules dynamically right now, I want the API gateway to add a request header like X-Caddy-Upstream-Dial and then implement a selection policy like this:

// DynamicSelection is a dynamic policy that selects a host from request headers
type DynamicSelection struct {
	DialHeaderName        string `json:"dial_header_name"`
	LookupSRVHeaderName   string `json:"lookup_srv_header_name"`
	MaxRequestsHeaderName string `json:"max_requests_header_name"`
}

// CaddyModule returns the Caddy module information.
func (DynamicSelection) CaddyModule() caddy.ModuleInfo {
	return caddy.ModuleInfo{
		ID:  "http.reverse_proxy.selection_policies.dynamic",
		New: func() caddy.Module { return new(DynamicSelection) },
	}
}

// Provision implements caddy.Provisioner.
func (d *DynamicSelection) Provision(caddy.Context) error {
	if d.DialHeaderName == "" {
		d.DialHeaderName = DefaultDialHeaderName
	}
	if d.LookupSRVHeaderName == "" {
		d.LookupSRVHeaderName = DefaultLookupSRVHeaderName
	}
	if d.MaxRequestsHeaderName == "" {
		d.MaxRequestsHeaderName = DefaultMaxRequestsHeaderName
	}
	return nil
}

// Select returns an available host, if any.
func (d DynamicSelection) Select(pool reverseproxy.UpstreamPool, r *http.Request) *reverseproxy.Upstream {
	logger := caddy.Log()
	var upstream = reverseproxy.Upstream{
		Dial:      r.Header.Get(d.DialHeaderName),
		LookupSRV: r.Header.Get(d.LookupSRVHeaderName),
	}
	if upstream.String() == "" {
		logger.Error("no dynamic upstream header found")
		return nil
	}
	var maxRequests int
	var err error
	header := r.Header.Get(d.MaxRequestsHeaderName)
	if header != "" {
		maxRequests, err = strconv.Atoi(header)
		if err != nil {
			logger.Error("parse dynamic upstream header failed", zap.String(d.MaxRequestsHeaderName, header), zap.Error(err))
		}
	}
	upstream.MaxRequests = maxRequests
	upstream, err = pool.Handler.LoadOrStore(upstream.String(), upstream)
	if err != nil {
		logger.Error("load or store upstream failed", zap.String("upstream", upstream.String()), zap.Error(err))
	}
	return &upstream
}

// Default header names
const (
	DefaultDialHeaderName        = "X-Caddy-Upstream-Dial"
	DefaultLookupSRVHeaderName   = "X-Caddy-Upstream-LookupSRV"
	DefaultMaxRequestsHeaderName = "X-Caddy-Upstream-MaxRequests"
)

var (
	_ caddy.Provisioner     = (*DynamicSelection)(nil)
	_ reverseproxy.Selector = (*DynamicSelection)(nil)
)

but right now I can’t add upstreams dynamically, because Caddy doesn’t export some of the related methods and fields. So I want to know if it’s OK for Caddy to add a method LoadOrStore like this:

upstream, err = pool.Handler.LoadOrStore(upstream.String(), upstream)

To do this, reverseproxy.UpstreamPool would need to hold a pointer to reverseproxy.Handler, and reverseproxy.Handler would need to add the LoadOrStore method?

Could you specify which ones these are? I think we could change Caddy to expose the things you need.

By the way, I think you’ll need to add ,omitempty to your json struct field tags so that those fields can be empty.

For logging, you should get a logger in your Provision step and store it in your module struct. Don’t use caddy.Log(). See the docs here:

Edit: Okay it took me a bit of time to understand what you were trying to say about pool and LoadOrStore. You’re looking to dynamically add new upstreams to the pool during the selection phase.

I’m not sure that’s the right approach; it doesn’t feel right to me. I think instead we should further modularize the proxy handler to allow for child modules that get an opportunity to do this, rather than trying to overload selection policies with it. Could you open this up as a feature request on Github?

The trouble with this though is that upstreams still need to be provisioned. See the logic in reverseproxy.go where various fields are set on the upstream for it to be ready to go.

For logging, you should get a logger in your Provision step and store it in your module struct. Don’t use caddy.Log().

Ok.

Could you open this up as a feature request on Github?

Ok.

The trouble with this though is that upstreams still need to be provisioned. See the logic in reverseproxy.go where various fields are set on the upstream for it to be ready to go.

Yes, I noticed that, so I want to do it this way:

1. Modify UpstreamPool into a struct that holds a pointer to the reverse_proxy handler;
2. Add a LoadOrStore method to the reverse_proxy handler, and have LoadOrStore do the provisioning.
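The LoadOrStore semantics I have in mind are like sync.Map’s: return the existing upstream if one is already in the pool, otherwise provision the new one and store it. A rough standalone sketch (the types here are simplified stand-ins, not Caddy’s actual API):

```go
package main

import (
	"fmt"
	"sync"
)

// upstream is a simplified stand-in for reverseproxy.Upstream.
type upstream struct {
	Dial        string
	provisioned bool
}

// pool caches upstreams by key, like the proposed handler-level store.
type pool struct {
	mu sync.Mutex
	m  map[string]*upstream
}

// loadOrStore returns the existing entry if present; otherwise it
// provisions the new upstream, stores it, and returns it.
func (p *pool) loadOrStore(key string, u *upstream) *upstream {
	p.mu.Lock()
	defer p.mu.Unlock()
	if existing, ok := p.m[key]; ok {
		return existing
	}
	u.provisioned = true // stand-in for the real provisioning logic
	p.m[key] = u
	return u
}

func main() {
	p := &pool{m: make(map[string]*upstream)}
	a := p.loadOrStore("10.0.0.5:8080", &upstream{Dial: "10.0.0.5:8080"})
	b := p.loadOrStore("10.0.0.5:8080", &upstream{Dial: "10.0.0.5:8080"})
	fmt.Println(a == b, a.provisioned) // true true
}
```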

Actually, what I want is for Caddy to support loading various modules dynamically, not just upstreams. The background is a centralized gateway, which may contain too many modules to configure statically, so it would be best for Caddy to support something like Kong’s cluster mode.

I don’t think this is a good idea; UpstreamPool shouldn’t be aware of the proxy handler. It’s a child of the handler, and it shouldn’t know about its parent.

Instead, I think it would be better to create a module namespace with an interface that the reverse proxy module can call to pass it the pool (or the handler itself maybe) and the current request, which would be called near the start of ServeHTTP. If there’s no submodule loaded for this, it would be a no-op.

Do you mean something like this?


func init() {
	caddy.RegisterModule(RequestHeader{})
}

// DynamicUpstreams is the interface for modules that produce upstreams from a request
type DynamicUpstreams interface {
	Upstreams(*http.Request) []reverseproxy.Upstream
}

// RequestHeader represents an upstream source read from an HTTP request header
type RequestHeader struct {
	DialHeaderName        string `json:"dial_header_name,omitempty"`
	LookupSRVHeaderName   string `json:"lookup_srv_header_name,omitempty"`
	MaxRequestsHeaderName string `json:"max_requests_header_name,omitempty"`

	logger *zap.Logger
}

// CaddyModule returns the Caddy module information.
func (RequestHeader) CaddyModule() caddy.ModuleInfo {
	return caddy.ModuleInfo{
		ID:  "http.reverse_proxy.dynamic_upstreams.request_header",
		New: func() caddy.Module { return new(RequestHeader) },
	}
}

// Provision implements caddy.Provisioner.
func (r *RequestHeader) Provision(ctx caddy.Context) error {
	r.logger = ctx.Logger(r)
	if r.DialHeaderName == "" {
		r.DialHeaderName = DefaultDialHeaderName
	}
	if r.LookupSRVHeaderName == "" {
		r.LookupSRVHeaderName = DefaultLookupSRVHeaderName
	}
	if r.MaxRequestsHeaderName == "" {
		r.MaxRequestsHeaderName = DefaultMaxRequestsHeaderName
	}
	return nil
}

// Upstreams returns available upstreams, if any.
func (r RequestHeader) Upstreams(request *http.Request) []reverseproxy.Upstream {
	var upstream = reverseproxy.Upstream{
		Dial:      request.Header.Get(r.DialHeaderName),
		LookupSRV: request.Header.Get(r.LookupSRVHeaderName),
	}
	if upstream.String() == "" {
		return nil
	}
	var maxRequests int
	var err error
	header := request.Header.Get(r.MaxRequestsHeaderName)
	if header != "" {
		maxRequests, err = strconv.Atoi(header)
		if err != nil {
			r.logger.Error("parse dynamic upstream header failed", zap.String(r.MaxRequestsHeaderName, header), zap.Error(err))
		}
	}
	upstream.MaxRequests = maxRequests
	return []reverseproxy.Upstream{upstream}
}

var (
	_ DynamicUpstreams = (*RequestHeader)(nil)
)

Then in reverseproxy.Handler, for example, add a DynamicUpstreams field like this:

type Handler struct {

    // ...
    DynamicUpstreams map[string]json.RawMessage `json:"dynamic_upstreams,omitempty" caddy:"namespace=http.reverse_proxy.dynamic_upstreams"`

    dynamicUpstreams map[string]DynamicUpstreams
    // ...
}

Then in Provision, load all the dynamic_upstreams modules, and finally:

func (h *Handler) ServeHTTP(w http.ResponseWriter, r *http.Request, next caddyhttp.Handler) error {
    for _, loader := range h.dynamicUpstreams {
        upstreams := loader.Upstreams(r)
        h.Store(upstreams)
    }
    // ...
}

Do I understand right?

Yeah, something like that. Might need some polish to get to a point where it feels good to add to Caddy, but that’s the general idea of how I think it should work. Matt will need to weigh in, of course.

Ok, got it

Feel free to open a PR, it’ll be easier to review and suggest changes :+1:

Ok, thank you

@matt , as you mentioned on Github, I think there are two other problems:

First, it seems that the port must be specified, for example:

Dial: {http.request.header.X-Caddy-Upstream-Dial}:80

thus, if different upstreams use different ports, this solution won’t work. Could we fall back to a default port of 80 when none is specified in Dial?
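What I have in mind is something like this (a standalone sketch with an illustrative helper, not Caddy’s actual code): append the default port only when the dialed address has no port yet.

```go
package main

import (
	"fmt"
	"net"
)

// withDefaultPort returns addr unchanged if it already contains a port,
// and appends defaultPort otherwise.
func withDefaultPort(addr, defaultPort string) string {
	if _, _, err := net.SplitHostPort(addr); err == nil {
		return addr // already host:port
	}
	return net.JoinHostPort(addr, defaultPort)
}

func main() {
	fmt.Println(withDefaultPort("10.0.0.5", "80"))      // 10.0.0.5:80
	fmt.Println(withDefaultPort("10.0.0.5:8080", "80")) // 10.0.0.5:8080
}
```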

Second, all the different upstreams become one upstream in this solution, and the health check and circuit breaker become a global switch; but actually each upstream stands for an independent service and should have its own health check and circuit breaker.

Well, you can’t dial an address without a port. I assume with a name like X-Caddy-Upstream-Dial that it would have the address to dial. I recommend just putting the port in that header.

Second, all the different upstreams become one upstream in this solution, and the health check and circuit breaker become a global switch

Correct, so be careful when configuring health checks: be mindful that they represent multiple backends, not just one.

actually each upstream stands for an independent service and should have its own health check and circuit breaker.

That gets expensive quickly and can lead to memory exhaustion, which is why I haven’t implemented it like that. Your reverse proxy basically becomes a monitoring service for an arbitrary, unbounded number of backends, for an unspecified amount of time.
