Serving tens of thousands of domains over HTTPS with Caddy

matt · January 20, 2021, 3:24am

This guide is a free sample of what is available exclusively for sponsors in my Expert Caddy series, where I help you master the ways of the Caddy web server.

(This guide is still WIP.)

Many online businesses have invested thousands, even millions, of dollars in custom software and proprietary solutions to serve their customers’ websites over HTTPS. These traditional methods then cost thousands of dollars per year to maintain in infrastructure, engineering, and payroll.

I’ll show you how to use Caddy to do this for free in just a few minutes.

These are techniques used by real companies using Caddy in production. Several of them are our sponsors!

Fathom Analytics
Transistor.FM
Oh Dear!
(Several more – working on getting permission to link them publicly)

If your company uses Caddy, please consider sponsoring my full-time development of Caddy. It’s a great look for your company, benefits your customers, and ensures Caddy’s ongoing open source maintenance!

Preface

There are a few common ways to serve lots of sites:

Subpaths, e.g. /sites/customer12 – just use path matchers; this is nothing special and not what this guide is about.
Subdomains, e.g. customer12.example.com – only mildly interesting; we’ll cover this but all you need is a single DNS entry and a wildcard certificate.
Custom domains, e.g. customer12.com – now we’re talkin’! This is the kind of feature that is complex, expensive, and high-maintenance with other self-hosted solutions. But not with Caddy.

This guide assumes your own service or Caddy module handles your unique application logic. We’ll just reverse_proxy localhost:9000 to represent this.

We also assume you’re using the Caddyfile for configuration. While the Caddyfile is easy to write, sometimes its structure is not compatible with advanced requirements. In those cases, a simpler, more elegant, and more concise JSON config can often be written by hand (possibly using the Caddyfile as a starting point). Do not be afraid to do this!

All our examples demonstrate fully-automatic HTTPS.

Subdomains (wildcard certificate)

This is easy:

*.example.com

reverse_proxy localhost:9000

With that, Caddy will serve all direct subdomains of example.com using a single wildcard certificate. Oh, but check your ACME CA’s policies about issuing wildcard certificates. At time of writing, some CAs like Let’s Encrypt only issue wildcard certificates using the DNS challenge. The CA/Browser Forum Baseline Requirements (BRs) forbid the TLS-ALPN challenge from validating wildcards. The fate of the HTTP challenge with regard to wildcard certificates is still being determined.

So, it is best to plug in your DNS provider’s module, then enable the DNS challenge, making your Caddyfile more like this:

{
    acme_dns cloudflare topsecret123
}

*.example.com

reverse_proxy localhost:9000

You can also put your secret in an environment variable and use a placeholder if you want to keep it out of your config file.
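
As a rough sketch of that approach, the token can be read from the environment when Caddy needs it. The variable name CLOUDFLARE_API_TOKEN is only an assumption for illustration; use whatever name your deployment sets.

```caddyfile
{
    # {env.*} is resolved from the environment at runtime, so the secret
    # never appears in the config file. The variable name is hypothetical.
    acme_dns cloudflare {env.CLOUDFLARE_API_TOKEN}
}

*.example.com

reverse_proxy localhost:9000
```

Start Caddy with that variable set in its environment (for example through your service manager) and the rest of the config stays the same.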

Of course, this assumes all customers are served by the same backend. You can add more to automatically load balance:

reverse_proxy 10.0.0.1:9000 10.0.0.2:9000 10.0.0.3:9000 ...

Or you might put each customer on their own backend:

@customer1 host customer1.example.com
reverse_proxy @customer1 10.0.0.1:9000

@customer2 host customer2.example.com
reverse_proxy @customer2 10.0.0.2:9000

...

and so on. Alternatively, you could use the map handler:

map {labels.2} {backend} {
    customer1 10.0.0.1
    customer2 10.0.0.2
    ...
}
reverse_proxy {backend}:9000

Subdomains (individual certificates)

This is not recommended because some CAs like Let’s Encrypt enforce strict rate limits for subdomains: you’re better off using a wildcard as described above. But at time of writing, some CAs like ZeroSSL don’t have this restriction.

This is just as you’d expect:

customer1.example.com,
customer2.example.com,
... {
    reverse_proxy localhost:9000
}

Or if each customer has a different backend:

customer1.example.com {
    reverse_proxy 10.0.0.1:9000
}
customer2.example.com {
    reverse_proxy 10.0.0.2:9000
}
...

and so on.

Registered domains (on-demand)

If your customers can use a custom domain name with your service, you can serve those over HTTPS exactly the same way as described above with subdomains.

That’s all well and good, but doing that has some problems:

It requires hard-coding all the domains into the config. This might be fine if you’re a domain broker/registrar that is in control of the domains, but even then, you’ll have to update your config every time a customer signs up, cancels, or changes their domain.
It requires that all the domains’ DNS are properly set to point to your server, because Caddy will try to manage certificates for those domains when the config is loaded. Again, if you control the domain names, this might be a reasonable expectation. But usually, your customers are in control of their own domain names. You shouldn’t tell Caddy to manage certificates for domains you do not control the DNS for. You do not know if or when customers will update their domain’s DNS records.

The goal, then, is to be able to have as simple and static a config as possible, so that HTTPS just works for any domains your customers have as soon as they point their DNS to your service.

How can it be done? Simply by enabling On-Demand TLS. This is a feature that is exclusive to Caddy, and is a reason a lot of businesses choose it. With on-demand TLS, you do not need to write the domains into the config.

The first part of an on-demand config looks something like this:

# (Not a complete config yet!)
https:// {
    tls {
        on_demand
    }
    reverse_proxy localhost:9000
}

First we tell Caddy to listen on the HTTPS port, and since we’ve omitted a hostname from the site address, Caddy will accept all hosts that come into that port. Then we enable on-demand TLS, and we reverse proxy all requests to our application backend. When a TLS handshake arrives with a ServerName that Caddy does not yet have a certificate for, it will attempt to obtain one during the handshake. Cool. HOWEVER…

We need to prevent abuse. With on_demand enabled without restrictions, Caddy would attempt to get a certificate for any ServerName it gets in any ClientHello (TLS handshake) from any client. This is bad for a few reasons:

Clients are untrusted and can attempt handshakes with any server names they want.
CAs can rate limit failed attempts.
CAs can rate limit all attempts, actually! But clients do not rate limit themselves.
Attackers can point lots of domains to your server and successfully get certificates, resulting in disk space exhaustion and other problems.

Because of this, you MUST enable restrictions for on-demand TLS. The easiest way to do this is with the on_demand_tls global option. You can configure an “ask” endpoint that Caddy will make an HTTP request to and “ask” if it can obtain a certificate for the given hostname. You can also configure a simple throttle for the on-demand certificate requests.

A safer, more complete version of our on-demand config looks like this:

{
    on_demand_tls {
        ask      http://localhost:5555/check
    }
}

https:// {
    tls {
        on_demand
    }
    reverse_proxy localhost:9000
}

Now on-demand TLS is enabled with a safeguard: Caddy will ask http://localhost:5555/check?domain=example.com before obtaining a certificate for a domain name in a TLS handshake. Typically, this endpoint would be a little program or script that checks your database to see if that domain name is recognized. Some users even configure this endpoint to be served by the same Caddy instance – very elegant! Make sure your endpoint returns 200 OK only for allowed domains, or you could be blocked by CAs if your server is an abuse vector!

How you implement your ‘ask’ endpoint is up to you. The point is that you don’t want any client to be able to get your server to request any certificate they ask for whenever they ask for it. You should make sure the domain name is recognized by your system and possibly that it is properly configured for your needs before you approve it.
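
To make the contract concrete, here is a minimal sketch of an ‘ask’ endpoint served by the same Caddy instance. The domains customer12.com and customer34.com are placeholders, and the hard-coded allowlist is purely illustrative: a real endpoint would normally look the domain up in your database, since a static list brings back the config churn on-demand TLS is meant to avoid.

```caddyfile
{
    on_demand_tls {
        ask http://localhost:5555/check
    }
}

https:// {
    tls {
        on_demand
    }
    reverse_proxy localhost:9000
}

# Illustrative 'ask' endpoint; plain HTTP, reachable only from this machine.
http://localhost:5555 {
    bind 127.0.0.1

    # Caddy calls /check?domain=<name>. Values for the same query key are
    # OR'ed, so either placeholder domain matches.
    @allowed {
        path /check
        query domain=customer12.com domain=customer34.com
    }

    # 200 approves the certificate; everything else is refused.
    handle @allowed {
        respond 200
    }
    handle {
        respond 404
    }
}
```

The important part is the shape of the exchange, not the allowlist: the endpoint receives the name in the domain query parameter and answers 200 only when that name belongs to a customer you recognize.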

That’s about all there is to it. Many businesses are using on-demand TLS to serve tens of thousands of sites on a single instance, and yet others are using a cluster of Caddy instances, in which case Caddy coordinates certificate management across them automatically (and shares the assets instead of duplicating them).
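
That coordination depends on every instance in the cluster being configured with the same storage. A minimal sketch, assuming a shared filesystem is mounted on each machine at the hypothetical path /mnt/caddy-shared (other storage backends are available as plugin modules):

```caddyfile
{
    # Use the same storage on every instance so certificates and locks
    # are shared. The mount path is an assumption for illustration.
    storage file_system /mnt/caddy-shared

    on_demand_tls {
        ask http://localhost:5555/check
    }
}

https:// {
    tls {
        on_demand
    }
    reverse_proxy localhost:9000
}
```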

Serving files

If you want to serve static files from a path that is based on the domain, you can replace the reverse_proxy from our examples with:

root * /path/{host}
file_server

Or really, you can replace the proxy or static file server with pretty much any logic that suits your needs.

basil · March 2, 2021, 9:57am

@matt Am I correct in saying the wildcard certificate applies to the subdomains e.g. *.example.com, but not the domain itself i.e. example.com?

francislavoie · March 2, 2021, 2:10pm

That’s correct. Wildcard certificates don’t apply for the main domain, only subdomains. You can tell Caddy to manage a separate certificate for that.

JB1 · May 16, 2021, 2:08pm

@matt I think this is a great feature, which I’ll use for setting up many domains. One current issue I found while experimenting with this feature is that occasionally an invalid certificate is returned to the browser if the certificate doesn’t yet exist and is created on demand. Reloading the page doesn’t help; the certificate only shows as valid after restarting the browser. In this case, creating the new certificate seems to work fine, and no errors whatsoever are displayed in the logs. As I said, this only happens occasionally.

matt · May 16, 2021, 4:27pm

This is the browser’s doing, Caddy is probably doing things right. Use curl to test instead of a browser.

Carter_Bryden · May 17, 2021, 10:57pm

I’ve noticed this in Chrome, and it’s definitely the browser. It seems like it’s caching something somewhere and the new cert doesn’t match up with what it’s expecting (which is now out of date). If I boot up Edge or Firefox that domain will load up with TLS just fine.