Question about List API for Caddy/CertMagic Storage

@kmott When I run your exact config (I only removed the storage portion because I don’t have that module, and changed the load_folders path to one that exists on my machine), I get this output in my logs:

2023/05/16 17:28:20.674 DEBUG   http    starting server loop    {"address": "[::]:10443", "tls": true, "http3": true}

Make sure you are running the config you think you are.

Oh, you replied while I was replying, I guess…

Anyway… it looks like Caddy can’t find the right certificate in its cache. I would expect the .key file to also be loaded…

Does it work with the default storage? Strip your config down to the basic working form and then make one change at a time until it doesn’t work.

Anyway… it looks like Caddy can’t find the right certificate in its cache. I would expect the .key file to also be loaded…

Yeah, that’s sort of what I was originally thinking… basically that I needed to do something else special in my Caddy config to tell it to load my certs from external Storage into the Cache so they can be used.

Do I need to add something like this, maybe (apps.tls.certificates.load_storage)?

EDIT: I don’t know if it matters, but an external process placed the files in Storage (Caddy did not do it itself), using CertMagic Storage.

Yeah that does matter, the files need to be in exactly the format Caddy expects.

Ah right yeah, if you’re putting your own certs and keys in storage which are not managed by Caddy (i.e. not issued by ACME) then yes you’ll need to configure load_storage to tell Caddy where to grab the certs and keys in storage (i.e. which path in the storage to use for the cert and key). By default, Caddy is only looking for managed certs, so if you’re providing your own (unmanaged) then you need to tell it where to look.

Oooookay I see what’s going on now. Here’s what’s happening:

If the files are in the place and format Caddy expects, with the right permissions (if applicable), it should be fine.

Your config tells Caddy to load certificates from the /tmp/certificates folder on disk (storage module is irrelevant). Note the docs describe the expected format:

PEM files which contain both a certificate and a key.

But that PEM file evidently contains a cert (and key, I guess) for these names: localhost *.localdev.klmh.co *.localdev.us-west.mywordpress.io 127.0.0.1 ::1 – none of these match what the TLS handshake contains: abc123.localdev.mywordpress.io.

These log entries:

2023/05/16 17:22:13.274 INFO    serving initial configuration
2023/05/16 17:22:13.277 DEBUG   caddy.storage.vault     List() at url   {"url": "https://vault.localdev.klmh.co:8443/v1/klm/secrets/metadata/production/api/certificates/certificates", "recursive": false}
2023/05/16 17:22:13.277 DEBUG   caddy.storage.vault     Using approle client token for auth
2023/05/16 17:22:13.281 DEBUG   caddy.storage.vault     List() at url   {"url": "https://vault.localdev.klmh.co:8443/v1/klm/secrets/metadata/production/api/certificates/certificates/acme-staging-v02.api.letsencrypt.org-directory/", "recursive": false}
2023/05/16 17:22:13.281 DEBUG   caddy.storage.vault     Using approle client token for auth
2023/05/16 17:22:13.284 DEBUG   caddy.storage.vault     List() at url   {"url": "https://vault.localdev.klmh.co:8443/v1/klm/secrets/metadata/production/api/certificates/certificates/acme-staging-v02.api.letsencrypt.org-directory/abc123.localdev.mywordpress.io/", "recursive": false}
2023/05/16 17:22:13.284 DEBUG   caddy.storage.vault     Using approle client token for auth
2023/05/16 17:22:13.288 DEBUG   caddy.storage.vault     Load() from url {"url": "https://vault.localdev.klmh.co:8443/v1/klm/secrets/data/production/api/certificates/certificates/acme-staging-v02.api.letsencrypt.org-directory/abc123.localdev.mywordpress.io/abc123.localdev.mywordpress.io.crt"}
2023/05/16 17:22:13.288 DEBUG   caddy.storage.vault     Using approle client token for auth
2023/05/16 17:22:13.291 DEBUG   caddy.storage.vault     List() at url   {"url": "https://vault.localdev.klmh.co:8443/v1/klm/secrets/metadata/production/api/certificates/certificates/acme-staging-v02.api.letsencrypt.org-directory/abc123.localdev.mywordpress.io/", "recursive": false}
2023/05/16 17:22:13.291 DEBUG   caddy.storage.vault     Using approle client token for auth
2023/05/16 17:22:13.296 INFO    tls     finished cleaning storage units

are Caddy cleaning up your storage. It appears there is a cert in it, so it loads it to see when it expires. It will clean out long-expired certs. Then there’s nothing to do so it’s done.

During the handshake, the client is presenting abc123.localdev.mywordpress.io but as you can see, Caddy can’t find a cert for it:

2023/05/16 17:22:20.992 DEBUG   tls.handshake   no matching certificates and no custom selection logic  {"identifier": "abc123.localdev.mywordpress.io"}
2023/05/16 17:22:20.992 DEBUG   tls.handshake   no matching certificates and no custom selection logic  {"identifier": "*.localdev.mywordpress.io"}
2023/05/16 17:22:20.992 DEBUG   tls.handshake   no matching certificates and no custom selection logic  {"identifier": "*.*.mywordpress.io"}
2023/05/16 17:22:20.992 DEBUG   tls.handshake   no matching certificates and no custom selection logic  {"identifier": "*.*.*.io"}
2023/05/16 17:22:20.992 DEBUG   tls.handshake   no matching certificates and no custom selection logic  {"identifier": "*.*.*.*"}
2023/05/16 17:22:20.992 DEBUG   tls.handshake   all external certificate managers yielded no certificates and no errors {"remote_ip": "127.0.0.1", "remote_port": "48750", "sni": "abc123.localdev.mywordpress.io"}
2023/05/16 17:22:20.992 DEBUG   tls.handshake   no certificate matching TLS ClientHello {"remote_ip": "127.0.0.1", "remote_port": "48750", "server_name": "abc123.localdev.mywordpress.io", "remote": "127.0.0.1:48750", "identifier": "abc123.localdev.mywordpress.io", "cipher_suites": [4866, 4867, 4865, 49196, 49200, 159, 52393, 52392, 52394, 49195, 49199, 158, 49188, 49192, 107, 49187, 49191, 103, 49162, 49172, 57, 49161, 49171, 51, 157, 156, 61, 60, 53, 47, 255], "cert_cache_fill": 0.0001, "load_if_necessary": true, "obtain_if_necessary": true, "on_demand": false}
2023/05/16 17:22:20.992 DEBUG   http.stdlib     http: TLS handshake error from 127.0.0.1:48750: no certificate available for 'abc123.localdev.mywordpress.io'

because no certificate was loaded that has or matches that name.
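To make the lookup order in those handshake log lines concrete, here is a minimal Python sketch. It is not CertMagic's actual code; it only reproduces the sequence of identifiers visible in the log (the exact name, then each label replaced by a wildcard from the left) and checks them against the names in the loaded PEM file. The function names are made up for illustration.

```python
def candidate_identifiers(server_name):
    """Return the exact name, then variants with labels wildcarded left to right.

    This mirrors the order of the "identifier" values in the debug log above;
    it is an illustration, not the library's implementation.
    """
    labels = server_name.split(".")
    candidates = [server_name]
    for i in range(len(labels)):
        labels[i] = "*"
        candidates.append(".".join(labels))
    return candidates


def find_match(server_name, loaded_names):
    """Return the first candidate present in loaded_names, or None."""
    loaded = set(loaded_names)
    for candidate in candidate_identifiers(server_name):
        if candidate in loaded:
            return candidate
    return None


# Names from the PEM file that was loaded from /tmp/certificates.
loaded = [
    "localhost",
    "*.localdev.klmh.co",
    "*.localdev.us-west.mywordpress.io",
    "127.0.0.1",
    "::1",
]

for name in candidate_identifiers("abc123.localdev.mywordpress.io"):
    print(name)
print(find_match("abc123.localdev.mywordpress.io", loaded))
```

None of the five candidates appears among the loaded names, which is why the handshake ends with "no certificate available".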

To clarify: the use of load_storage is very rare because typically people let Caddy manage the certificates automatically, which uses the configured storage, and so it will get and put certificates in storage automatically. Thus manually configuring cert loading from a storage module is very uncommon, and is only done if unmanaged certs are being stored in storage other than the local disk. I’m not sure if that’s your intention.


If the files are in the place and format Caddy expects, with the right permissions (if applicable), it should be fine.

The certs were placed there by a CertMagic Storage module in the correct/expected format, which other Caddy instances will then (effectively) “consume” for incoming TLS connections, no matter which distributed Caddy instance receives the request.

Your config tells Caddy to load certificates from the /tmp/certificates folder on disk (storage module is irrelevant).

Correct, I have 2 sources of certificates (at least for now)–one on disk at /tmp/certificates, and another in remote Storage using the module previously mentioned.

To clarify: the use of load_storage is very rare because typically people let Caddy manage the certificates automatically, which uses the configured storage, and so it will get and put certificates in storage automatically. Thus manually configuring cert loading from a storage module is very uncommon, and is only done if unmanaged certs are being stored in storage other than the local disk. I’m not sure if that’s your intention.

I guess I’m not quite following on this part. My (probably incorrect) assumption was that external Storage could be the source-of-truth for all TLS certificates in a deployment, and so if I point any random Caddy instance at that Storage module, it will fetch and use the certificate for incoming connections as it gets a match.

But I think what you are saying is that because Caddy was not originally aware of those certificates, it considers them “unmanaged”, and I have to tell Caddy to use them via load_storage (along with my loading from /tmp/certificates directory)?

Assuming load_storage is the way to go, do I have to list every certificate in load_storage.pairs[] in order for Caddy to use them? (ref) I was really hoping that I could point any of my Caddy instances to the remote storage and just have them use the certificates without having to list them individually–is that possible? Or am I missing something (which is entirely possible, :slight_smile: ).

Did some testing, and if I add a block like this to my config, it all seems to work:

        "load_storage": {
          "pairs": [{
            "certificate": "certificates/acme-staging-v02.api.letsencrypt.org-directory/abc123.localdev.mywordpress.io/abc123.localdev.mywordpress.io.crt",
            "key": "certificates/acme-staging-v02.api.letsencrypt.org-directory/abc123.localdev.mywordpress.io/abc123.localdev.mywordpress.io.key",
            "format": "pem"
          }]
        }

I was really hoping to not have to enumerate each certificate to load from Storage, is there a way to do that? I did try an empty load_storage, but that didn’t seem to work: load_storage: {} (and neither did an empty load_storage.pairs: load_storage: { pairs: [{}] }).
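If enumerating is unavoidable, the list could at least be generated rather than written by hand. Below is a rough Python sketch that builds the same load_storage block for a list of hostnames, assuming every cert follows the key layout shown in the working block above (certificates/<issuer directory>/<host>/<host>.crt and .key). The function name and the host list are hypothetical, and wildcard names are not handled, since their storage keys may be laid out differently.

```python
import json


def build_load_storage(issuer_dir, hostnames):
    """Build a load_storage block with one certificate/key pair per hostname.

    Assumes the storage key layout from the working config above; plain
    (non-wildcard) hostnames only.
    """
    pairs = []
    for host in hostnames:
        base = f"certificates/{issuer_dir}/{host}/{host}"
        pairs.append({
            "certificate": base + ".crt",
            "key": base + ".key",
            "format": "pem",
        })
    return {"load_storage": {"pairs": pairs}}


# Hypothetical host list; in practice it would come from wherever sites are tracked.
block = build_load_storage(
    "acme-staging-v02.api.letsencrypt.org-directory",
    ["abc123.localdev.mywordpress.io"],
)
print(json.dumps(block, indent=2))
```

The generated block still has to be pushed to each instance whenever the host list changes, so this reduces the typing but not the config distribution.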

I don’t understand. Are the certs in the storage compatible with CertMagic/caddy? Like they’re in the format caddy expects and they have a JSON sidecar file?

If so, why not just use automatic HTTPS? Why are you loading them all manually?

Are the certs in the storage compatible with CertMagic/caddy? Like they’re in the format caddy expects and they have a JSON sidecar file?

Yes, they were placed there by a CertMagic Storage module, and have all of the requisite files in place. If I add the block for load_storage as in my previous post, it will find the certificates fine and serve the connection successfully.

If so, why not just use automatic HTTPS? Why are you loading them all manually?

I will eventually have many certs spread across many Caddy instances, and would rather not have to push that configuration down to Caddy and keep it updated. I would rather manage the TLS certificates outside of Caddy, pushing them to a central Storage engine that I can then point my various Caddy instances at, which do nothing but serve TLS connections to clients using certificates from the Storage module.

Honestly, I would just recommend letting Caddy do that for you. It coordinates cert management in a cluster automatically. Just configure them with the same storage, give the instances their associated hostnames in their configs, and then let Caddy do the rest. I think you’re overcomplicating it, probably unknowingly :slight_smile:

I guess what I’m asking is if load_storage can behave the same as load_folders, but it sounds like that’s not possible?

FolderLoader loads certificates and their associated keys from disk by recursively walking the specified directories, looking for PEM files which contain both a certificate and a key.

Question: With load_folders, does Caddy periodically walk the list of folders and refresh (re-load?) the certs on disk in case one was recently renewed? Or does it require a manual reload of Caddy to pull the latest certs from the load_folders directive?

Well, load_storage is intended to be very flexible. load_folders as you know expects a very specific format, one that honestly isn’t used much these days. I haven’t seen a use for this within Caddy in years. Automatic HTTPS has mostly supplanted the need to load certs manually.

The second one. Certificates that aren’t managed by Caddy are loaded at config-load, like with traditional web servers.

I really, really recommend capitalizing on Caddy’s auto-management of certificates. I don’t think it makes much sense to use Caddy if you’re going to be doing all the work yourself. Maybe try and help me understand the reason you need to manually do all the cert management and loading?

Maybe try and help me understand the reason you need to manually do all the cert management and loading?

My biggest issue is that if I use on_demand with Caddy solving for me, when that first connection comes in, it can take a REALLY long time (in perceived time to the user, 1m - 5m) to solve the ACME DNS challenge (since I don’t want to push config for 1000’s of hostnames to each Caddy instance).

What I really want is to be able to provide feedback for my users that once they create the site, it’s 100% ready at some point, which means I know for a fact it’s ready to receive requests.

As it stands now, if I use on_demand, they try to access it, and it just kinda sits there not doing anything, which appears to be broken to them (“hey, I can’t get access to this new site I created, what gives?”). Most people try other things (reload, close and re-open, etc) in < 5s if a server doesn’t respond, based on my experience.

I am trying some more tests using on_demand, but I may also try to see if there’s an easy way to point at Storage and somehow load + refresh discovered certs automatically, since I already know they are to be trusted (much like load_folders).

Thank you.

But I don’t see on-demand TLS enabled in your config – are you using the DNS challenge with On-Demand TLS? That is a bit unusual as well, since this typically requires automated control over the DNS zone, which conflicts with On-Demand TLS’ primary use case, namely that you don’t control the domain. Most people just enable on-demand and use the default HTTP-01 and TLS-ALPN-01 challenges, which work in about 2-5 seconds, not minutes.

Many SaaS using on-demand TLS will make that initial connection for the user behind the scenes, so that a certificate is provisioned before they even try to load it. I would just do that if I were you.

Again, just have your app connect to their domain before you tell them it’s ready to go. Then by the time they connect to it, the cert is already there, even if it does take a few minutes. But it should really only take a few seconds.
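A rough sketch of that warm-up step in Python, using only the standard library. Everything here is illustrative: the function names, retry count, and delay are assumptions, not anything Caddy provides. The idea is simply that the app opens a TLS connection to the new hostname, which triggers on-demand issuance, and keeps retrying until a handshake succeeds before reporting the site as ready.

```python
import socket
import ssl
import time


def try_handshake(hostname, port=443, timeout=10.0):
    """Attempt one verified TLS handshake; return True if it completes."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((hostname, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=hostname):
                return True
    except OSError:
        # Covers connection errors, timeouts, and ssl.SSLError (a subclass).
        return False


def wait_until_ready(hostname, attempt=try_handshake, retries=30, delay=2.0,
                     sleep=time.sleep):
    """Call attempt(hostname) until it succeeds or retries run out."""
    for _ in range(retries):
        if attempt(hostname):
            return True
        sleep(delay)
    return False
```

Usage would be something like calling wait_until_ready("abc123.localdev.mywordpress.io") right after the site is created and only then showing it as ready. Note that the default context verifies the certificate chain, so against a staging CA the handshake would be reported as failed; a custom context would be needed in that case.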

Most people just enable on-demand and use the default HTTP-01 and TLS-ALPN-01 challenges, which work in about 2-5 seconds, not minutes.

Yeah, so that’s the ticket, the missing piece of the puzzle (at least for my current half-baked implementation). All of this had started based on a legacy implementation that used LEGO with Route53, so I thought to myself “ehh, it’s all Let’s Encrypt, it should all work the same”, but of course, that was an invalid assumption (and entirely wrong, :rofl:).

My plan at this point is to throw out all of the DNS specific stuff, and just use TLS-ALPN. The only minor inconvenience is that I will need to use DNS for my localdev environment, since it’s not directly on the internet, but that’s a small price to pay.

Thank you (and francislavoie!) for your patience walking me through the issues, and patiently pointing out where I was making incorrect assumptions–I :heart: Caddy a lot, and love how easy it is to hack on and extend. I’ve used many ingresses (NGINX, Apache2, lighttpd, Fabio, Ambassador/Emissary, Contour, Istio, etc.), and nothing is as easy or as flexible as Caddy.

Great work and fantastic support. Thank you.


Excellent :raised_hands:

Yeah, I’ll be curious what your final config ends up being. It should be quite a bit simpler than what’s going on now.

Is any of this going to be public? Is it for a SaaS? Can you tell me? :smiley:

Thanks for the positive feedback. Really good to know!

I’ll also shape the docs on the new site based on our conversation.

Nah, you can use Caddy’s internal issuer, which has it set up its own CA. No need for a publicly trusted certificate when doing local development.
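For the localdev case, a small Python sketch that emits a JSON automation policy using the internal issuer. The helper name and the example subject are hypothetical; the shape (a policy with "subjects" and an issuer whose module is "internal") is how I understand Caddy's JSON config, so check it against the docs before relying on it.

```python
import json


def internal_issuer_policy(subjects):
    """Build a TLS app fragment whose policy issues certs from Caddy's own CA."""
    return {
        "apps": {
            "tls": {
                "automation": {
                    "policies": [{
                        "subjects": list(subjects),
                        "issuers": [{"module": "internal"}],
                    }]
                }
            }
        }
    }


# Hypothetical localdev subject, used only for illustration.
config = internal_issuer_policy(["*.localdev.mywordpress.io"])
print(json.dumps(config, indent=2))
```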


Is any of this going to be public? Is it for a SaaS? Can you tell me? :smiley:

Yeah, it’s for https://mywordpress.io, which is still in its infancy (#NoJudgement, lol).

Nah, you can use Caddy’s internal issuer, which has it set up its own CA. No need for a publicly trusted certificate when doing local development.

Yes, great point, I’ll do that.

