Need help with Production Performance


(Vitor Caetano) #1

Hi team, we recently bought 3 commercial licenses, although we are trying to spin up 3 abiosoft/caddy-docker containers on our production servers. We bought the licenses because we thought we might need some support from you.

We are now facing a performance issue that I would like to describe, in the hope that you can shed some light on it.

We prepared a Caddyfile with 17k domain entries that proxy to our web server.

Each of these entries looks like this:

ourclientsdomain.com {
  tls {
    dns powerdns
  }
  proxyprotocol 0.0.0.0/0 ::/0
  rewrite {
    if {path} is /
    to /
  }
  rewrite {
    to {path} /proxy/{uri}
  }
  proxy /proxy frontssl:80 {
    without /proxy
    transparent
  }
  proxy / frontssl:80 {
    transparent
  }
}
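
For anyone who wants to reproduce the 1k versus 17k comparison, a Caddyfile of this shape can be generated with a short script. This is only a minimal sketch: the function names and the client0.example.test style domains are placeholders invented for illustration, and the site block is copied from the entry above.

```python
# Site block copied from the post; doubled braces are literal braces for str.format.
SITE_TEMPLATE = """{domain} {{
  tls {{
    dns powerdns
  }}
  proxyprotocol 0.0.0.0/0 ::/0
  rewrite {{
    if {{path}} is /
    to /
  }}
  rewrite {{
    to {{path}} /proxy/{{uri}}
  }}
  proxy /proxy {upstream} {{
    without /proxy
    transparent
  }}
  proxy / {upstream} {{
    transparent
  }}
}}
"""


def placeholder_domains(count):
    """Return `count` made-up domain names (hypothetical, not real client domains)."""
    return [f"client{i}.example.test" for i in range(count)]


def render_caddyfile(domains, upstream="frontssl:80"):
    """Render one site block per domain, separated by blank lines."""
    return "\n".join(
        SITE_TEMPLATE.format(domain=d, upstream=upstream) for d in domains
    )
```

Writing `render_caddyfile(placeholder_domains(1000))` and `render_caddyfile(placeholder_domains(17000))` to two files gives a pair of configs that differ only in the number of site entries.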

All 17k certificates have been pre-generated and written to a high-performance SSD.

Then with all Caddy containers up and running we started load performance tests.

What we noticed is that the containers hit high CPU load (> 150%) when we ramp up to 100 HTTPS calls, and most of those calls time out. Our production servers have plenty of memory and CPU power; we are using m4.4xlarge EC2 instances.

On the other hand, if we prepare a Caddyfile with 1k of the same entries, everything works fine: our test of 100 HTTPS calls succeeds with very little CPU load on each container (< 5%).

This simple test showed me that once the Caddyfile grows beyond 1k entries like the one above, CPU usage climbs on client HTTPS calls and timeouts start to occur.
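
The ramp test described above can be approximated with a small script. This is a sketch under assumptions, not necessarily the tool used for the numbers in this post: `fetch`, `run_load`, the 5 second timeout, and the summary keys are all hypothetical choices, and the URLs passed in must resolve to the Caddy host for the result to mean anything.

```python
import http.client
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url, timeout=5.0):
    """Return elapsed seconds for one GET, or None on timeout or any HTTP/connection error."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, socket timeouts and TLS errors are all OSError subclasses.
        return None
    return time.monotonic() - start


def run_load(urls, concurrency=100, fetch_fn=fetch):
    """Issue all requests with up to `concurrency` in flight and summarize the outcome."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(fetch_fn, urls))
    ok = sorted(r for r in results if r is not None)
    return {
        "total": len(results),
        "failed": len(results) - len(ok),
        "slowest": ok[-1] if ok else None,
    }
```

Running the same list of 100 URLs against the 1k-entry and the 17k-entry configuration and comparing `failed` and `slowest` is one way to put numbers on the difference.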

I’ve started investigating Caddy’s source code, and any help you can give me with this will be of great value.

Thanks in advance.


(Matthew Fay) #2

Gonna ping @matt here for this one - but it might also help if you’d like to send Light Code Labs an email at their commercial support address.

I have one small question, though - I’m trying to wrap my head around what you achieve with this part:

  rewrite {
    to {path} /proxy/{uri}
  }
  proxy /proxy frontssl:80 {
    without /proxy
    transparent
  }
  proxy / frontssl:80 {
    transparent
  }

Does this not function the same without the rewrite and proxy /proxy? If the {path} doesn’t exist on disk, you’re simply prepending /proxy and then stripping it away again before sending it to the same upstream server, if I’m not mistaken.
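
To make the round trip concrete, here is a deliberately simplified model of the two steps. It is not Caddy's code: `rewrite_to_proxy` and `strip_prefix` are hypothetical helpers that only mimic what the rewrite fallback and the `without /proxy` option are described as doing above, and they ignore the case where {path} exists on disk.

```python
def rewrite_to_proxy(uri):
    """Model the fallback target `/proxy/{uri}` (assumes the doubled slash is collapsed)."""
    return "/proxy/" + uri.lstrip("/")


def strip_prefix(uri, prefix="/proxy"):
    """Model `without /proxy`: drop the prefix and keep a leading slash."""
    if uri.startswith(prefix):
        rest = uri[len(prefix):]
        return rest if rest.startswith("/") else "/" + rest
    return uri


# Under this model, the upstream sees exactly what the client requested.
for uri in ["/", "/a/b", "/a/b/?x=1"]:
    assert strip_prefix(rewrite_to_proxy(uri)) == uri
```

If the real directives behave like this model, the extra rewrite and the `proxy /proxy` block add work per request without changing what reaches the upstream.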


(Vitor Caetano) #3

Hi, yes, it just prepends /proxy and then strips it away again before sending the request to the upstream.
This pattern forwards the path to the upstream. It is tested and it’s working. :wink:

I’ll send this topic to their commercial support address.

Thank you.


(Matthew Fay) #4

No doubt it works, it just might be a completely unnecessary configuration that introduces wasteful extra request processing.

{path} will go to the upstream regardless; there is no need to double handle the URI to do that.

Actually, that’s not 100% correct - the {uri} will go upstream regardless. Are you doing this because you want to strip the query / anchor portion before proxying?

If so, that could be achieved more neatly:

rewrite {
  to {path} {uri}
}
proxy / frontssl:80 {
  transparent
}

I mention this because avoiding unnecessary cycles manipulating the URI seems like the kind of perf gain that could matter when serving 17k domains.

Just to confirm, too - since Caddy will be checking for the existence of {path} on disk during this rewrite, but you’re sending the request to another server anyway, do the site files actually exist on the Caddy host?


(Vitor Caetano) #5

Thanks, I can try your snippet tomorrow at the office:

rewrite {
  to {path} {uri}
}
proxy / frontssl:80 {
  transparent
}

But in fact, without the rewrite, I could not get the upstream to receive the URI part.
And frontssl is one of our containers, running on the same Docker Swarm cluster as our Caddy container.


(Matthew Fay) #6

No worries.

I can tell you that’d be unintended behaviour - maybe a bug of some kind. Caddy should always preserve the requested URI when proxying upstream; this kind of rewrite shouldn’t be necessary for that purpose.

If you don’t actually need to explicitly strip the query and fragment from the URI in cases where the site files exist on disk, you can go without the rewrite entirely (assuming Caddy is working as intended).


(Matt Holt) #7

Think I figured out the latency problem. :slight_smile: Will send an email in a moment with details, and then we can follow up here once we’ve solved it for good.


(Vitor Caetano) #8

That’s good news Matt.
I’ll follow up on your email tomorrow. There is no need to rush; our production servers are in beta testing right now.
Thanks for the good job!