Blue/Green deployments and A/B testing [session affinity and upstream weights]

Hi. I was looking for a reverse proxy for a blue/green deployment strategy, or even an A/B testing strategy, but as far as I understand from the docs, there is currently no way to achieve this with Caddy. It would require an option for session affinity independent of the load balancing strategy, as well as an option to assign different weights to upstreams.

Regardless of what I want to achieve, I think the session affinity provided by the cookie lb_policy should be extracted into a global load balancing option. The cookie lb_policy just selects a random upstream and sticks to it on subsequent requests, but in my opinion one may want to use any of the other policies for the initial selection of the upstream and stick to it afterwards.
So I think if the session cookie were applied after selecting a host, it would be a lot more useful.

To achieve blue/green deployments and A/B testing, one would also need an option to specify weights for the upstreams. These could be used to distribute requests in the A/B test ratio, and could also enable blue/green deployments by setting the weight of one upstream to 0.

What do you think: are these valid use cases for Caddy that may be worth taking a look at? Or is Caddy not really the right tool for this?

The nice thing about it is that selection policies are pluggable, so you can implement your own to do what you need.

It just needs to be a module in the namespace http.reverse_proxy.selection_policies.* which implements the Selector interface.
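Once registered, such a module is referenced from the reverse_proxy handler's load_balancing block in the JSON config. A hypothetical fragment, assuming the custom module registers itself under the name my_weighted (built-in policies such as cookie or round_robin are configured the same way):

```json
{
  "handler": "reverse_proxy",
  "load_balancing": {
    "selection_policy": { "policy": "my_weighted" }
  },
  "upstreams": [
    { "dial": "blue:8080" },
    { "dial": "green:8080" }
  ]
}
```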

Sure you can, but isn’t only a single policy applied at a time? You would also need per-upstream metadata for the weights, so this cannot simply be done via a module, or can it?
As already mentioned, I think the session affinity should be separated from the load balancing policy, as a general option.

We probably just need to uncomment these lines so that selection policies can write affinity information to the upstreams

None of the existing selection policies have needed it, so it was left out. But we can add them in should it be useful.

If you want to give a shot at implementing what you need, PRs are welcome.

Alright, I may give it a try. Is there currently an option to disable an upstream, other than removing it completely?
Or will a new option like this be necessary?

Typically you’d reload the config having removed the upstream. I’m not sure what value there is in having the upstream still configured.

The value is that you can deactivate the upstream for new connections while letting already-established sessions still connect to it. This is necessary for blue/green deployment with session-aware backends.

Maybe something like Unselectable bool might be a better name for that? Or RejectsConnections bool? I’m not sure Inactive properly conveys the purpose.

An upstream could be marked unhealthy via upstreamHost.unhealthy but this field is managed by the active health checks, so it’s probably not appropriate to reuse for what you’re trying to do.

Yes, indeed the name might have been a bit unclear. I changed it to RejectsNewConnections.

I also tried to reuse the cookie selection policy for my purpose, as it already implements sticky sessions. I changed it so that it only switches away from the host in the cookie if that host is not healthy. Otherwise it will stick with it, even if it is full.
Furthermore, I added the ability to delegate the initial upstream selection to another selection policy.

I would be happy to get feedback on the ideas and work done so far.

As this started as a fun project for me this evening but may also be useful for others, I will take a look at the contribution guide and polish the changes.

For our setup, we were also looking for a solution to disable/deactivate individual upstreams, but we ended up doing it differently. When upgrading the upstreams, where we need to “disable” them in Caddy to avoid new sessions, we simply change the HTTP response of the health check endpoint on the upstream server. The upstream is then reported unhealthy in Caddy, and we can perform a graceful rolling update. When we need Caddy to put the upstream back into service, we report 200 OK on the health endpoint. But I guess it’s just a matter of what fits the problem best; HAProxy can do some smart things like draining connections etc., we just moved this to the individual upstream servers where we use Caddy. So far it is working fine…

Thanks for sharing your solution, but I guess it has the same effect as removing the upstream from the config.
My solution would be more like draining established sessions, because old sessions can still connect to the upstream (if it is healthy).
If there is interest in my work so far, I will continue working on it. I appreciate feedback on it, so it will not only fit my needs.

Hi there, I have exactly the same needs. I will have a look at your work to give some feedback.

Hello, I read your code. So I will have to set ...upstreams[0].rejectsNewConnections = true in the JSON config to use it?
For a start I think it is OK; as a future improvement I would love to have something similar to health_uri.
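For readers following along, that would presumably look something like the fragment below. Note that this is a hypothetical sketch: the rejectsNewConnections field comes from the experimental branch discussed in this thread, not from released Caddy, and the exact JSON key depends on the struct tag in that patch.

```json
{
  "handler": "reverse_proxy",
  "upstreams": [
    { "dial": "blue:8080", "rejectsNewConnections": true },
    { "dial": "green:8080" }
  ],
  "load_balancing": {
    "selection_policy": { "policy": "cookie" }
  }
}
```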
