Unable to generate certificates due to "x509: certificate signed by unknown authority" when trying to communicate with acme service

1. Caddy version: 2.6.2

2. How I installed, and run Caddy:

I’ve set this up using the (slightly old now) API Platform config, which has worked fine in the past.

# Build Caddy with the Mercure and Vulcain modules
FROM caddy:2-builder-alpine AS api_platform_caddy_builder

RUN xcaddy build \
	--with github.com/dunglas/mercure/caddy \
	--with github.com/dunglas/vulcain/caddy \
	--with github.com/caddy-dns/cloudflare

# Caddy image
FROM caddy:2-alpine AS api_platform_caddy

WORKDIR /srv/api

COPY --from=api_platform_caddy_builder /usr/bin/caddy /usr/bin/caddy
COPY --from=api_platform_php /srv/api/public public/
COPY docker/caddy/Caddyfile /etc/caddy/Caddyfile

a. System environment:

Linux server running CentOS 6
Docker version 23.0.0, build e92dd87

b. Command:

docker-compose up -d

c. Service/unit/compose file:

services:
  caddy:
    build:
      context: /Users/me/src/project/api
      dockerfile: Dockerfile
      target: api_platform_caddy
    depends_on:
      php:
        condition: service_started
      pwa:
        condition: service_started
    environment:
      AUTH_DIRECTIVES: |
        basicauth * {
            user {REDACTED}
        }
      MERCURE_PUBLISHER_JWT_KEY: {REDACTED}
      MERCURE_SUBSCRIBER_JWT_KEY: {REDACTED}
      PWA_UPSTREAM: pwa:3000
      SERVER_NAME: showapi.goodcrm.co.uk
      TLS_DIRECTIVE: |
        tls {
          dns cloudflare {REDACTED}
        }
    image: apiv2_caddy:0.1.0
    labels:
      uk.co.goodcrm.description: Caddy reverse proxy
      uk.co.goodcrm.environment: prod
      uk.co.goodcrm.system: CRM
      uk.co.goodcrm.system.component: api-v2
      uk.co.goodcrm.system.component.service: caddy
    networks:
      default: null
    ports:
    - target: 80
      published: "8001"
      protocol: tcp
    - target: 443
      published: "8000"
      protocol: tcp
    - target: 443
      published: "8000"
      protocol: udp
    restart: unless-stopped
    volumes:
    - type: volume
      source: caddy_config
      target: /config
      volume: {}
    - type: volume
      source: caddy_data
      target: /data
      volume: {}
    - type: volume
      source: php_socket
      target: /var/run/php
      volume: {}

d. My complete Caddy config:

{
	# Debug
	{$DEBUG}

	# HTTP/3 support
	servers {
	}

	acme_ca https://acme.zerossl.com/v2/DV90
	email hello@goodcrm.co.uk
}

{$SERVER_NAME}

log 

{$TLS_DIRECTIVE} 

{$AUTH_DIRECTIVES}

# Matches requests for HTML documents, for static files and for Next.js files,
# except for known API paths and paths with extensions handled by API Platform
@pwa expression `( 
{header.Accept}.matches("\\btext/html\\b")
&& !{path}.matches("(?i)(?:^/docs|^/graphql|^/bundles/|^/_profiler|^/_wdt|\\.(?:json|html$|csv$|ya?ml$|xml$))")
)
|| {path} == "/favicon.ico"
|| {path} == "/manifest.json"
|| {path} == "/robots.txt"
|| {path}.startsWith("/_next")
|| {path}.startsWith("/sitemap")`

route {
	root * /srv/api/public
	mercure {
		# Transport to use (default to Bolt)
		transport_url {$MERCURE_TRANSPORT_URL:bolt:///data/mercure.db}
		# Publisher JWT key
		publisher_jwt {env.MERCURE_PUBLISHER_JWT_KEY} {env.MERCURE_PUBLISHER_JWT_ALG}
		# Subscriber JWT key
		subscriber_jwt {env.MERCURE_SUBSCRIBER_JWT_KEY} {env.MERCURE_SUBSCRIBER_JWT_ALG}
		# Allow anonymous subscribers (double-check that it's what you want)
		anonymous
		# Enable the subscription API (double-check that it's what you want)
		subscriptions
		# Extra directives
		{$MERCURE_EXTRA_DIRECTIVES}
	}
	vulcain
	push

	# Add links to the API docs and to the Mercure Hub if not set explicitly (e.g. the PWA)
	header ?Link `</docs.jsonld>; rel="http://www.w3.org/ns/hydra/core#apiDocumentation", </.well-known/mercure>; rel="mercure"`
	# Disable Google FLOC tracking if not enabled explicitly: https://plausible.io/blog/google-floc
	header ?Permissions-Policy "interest-cohort=()"

	# Comment the following line if you don't want Next.js to catch requests for HTML documents.
	# In this case, they will be handled by the PHP app.
	reverse_proxy @pwa http://{$PWA_UPSTREAM}

	php_fastcgi unix//var/run/php/php-fpm.sock
	encode zstd gzip
	file_server
}

3. The problem I’m having:

I’m unable to obtain a certificate because of “x509: certificate signed by unknown authority” errors when Caddy attempts to communicate with the ACME server. I’m experiencing this issue with both the https://acme.zerossl.com/v2/DV90 server and the Let’s Encrypt staging server.

Oddly, I deploy this for various clients of our system. Some of these deployments worked straight away, and some only worked after many attempts to clear the cache, re-pull images, and tweak the config, but I can’t for the life of me work out what got it working on some (eventually) and not on others.

4. Error messages and/or full log output:

{
  "level": "info",
  "ts": 1675729108.5560286,
  "msg": "using provided configuration",
  "config_file": "/etc/caddy/Caddyfile",
  "config_adapter": "caddyfile"
}
{
  "level": "warn",
  "ts": 1675729108.561108,
  "msg": "Caddyfile input is not formatted; run the 'caddy fmt' command to fix inconsistencies",
  "adapter": "caddyfile",
  "file": "/etc/caddy/Caddyfile",
  "line": 2
}
{
  "level": "info",
  "ts": 1675729108.5638237,
  "logger": "admin",
  "msg": "admin endpoint started",
  "address": "localhost:2019",
  "enforce_origin": false,
  "origins": [
    "//localhost:2019",
    "//[::1]:2019",
    "//127.0.0.1:2019"
  ]
}
{
  "level": "info",
  "ts": 1675729108.564518,
  "logger": "tls.cache.maintenance",
  "msg": "started background certificate maintenance",
  "cache": "0xc000359260"
}
{
  "level": "info",
  "ts": 1675729108.5648863,
  "logger": "http",
  "msg": "server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS",
  "server_name": "srv0",
  "https_port": 443
}
{
  "level": "info",
  "ts": 1675729108.564935,
  "logger": "http",
  "msg": "enabling automatic HTTP->HTTPS redirects",
  "server_name": "srv0"
}
{
  "level": "info",
  "ts": 1675729108.586613,
  "logger": "http",
  "msg": "enabling HTTP/3 listener",
  "addr": ":443"
}
{
  "level": "info",
  "ts": 1675729108.5866892,
  "logger": "tls",
  "msg": "cleaning storage unit",
  "description": "FileStorage:/data/caddy"
}
{
  "level": "info",
  "ts": 1675729108.586877,
  "msg": "failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details."
}
{
  "level": "info",
  "ts": 1675729108.587017,
  "logger": "http.log",
  "msg": "server running",
  "name": "srv0",
  "protocols": [
    "h1",
    "h2",
    "h3"
  ]
}
{
  "level": "info",
  "ts": 1675729108.587188,
  "logger": "http.log",
  "msg": "server running",
  "name": "remaining_auto_https_redirects",
  "protocols": [
    "h1",
    "h2",
    "h3"
  ]
}
{
  "level": "info",
  "ts": 1675729108.587204,
  "logger": "http",
  "msg": "enabling automatic TLS certificate management",
  "domains": [
    "showapi.goodcrm.co.uk"
  ]
}
{
  "level": "info",
  "ts": 1675729108.5888839,
  "logger": "tls.obtain",
  "msg": "acquiring lock",
  "identifier": "showapi.goodcrm.co.uk"
}
{
  "level": "info",
  "ts": 1675729108.589751,
  "logger": "tls",
  "msg": "finished cleaning storage units"
}
{
  "level": "info",
  "ts": 1675729108.5907013,
  "msg": "autosaved config (load with --resume flag)",
  "file": "/config/caddy/autosave.json"
}
{
  "level": "info",
  "ts": 1675729108.590731,
  "msg": "serving initial configuration"
}
{
  "level": "info",
  "ts": 1675729108.6527288,
  "logger": "tls.obtain",
  "msg": "lock acquired",
  "identifier": "showapi.goodcrm.co.uk"
}
{
  "level": "info",
  "ts": 1675729108.6533053,
  "logger": "tls.obtain",
  "msg": "obtaining certificate",
  "identifier": "showapi.goodcrm.co.uk"
}
{
  "level": "warn",
  "ts": 1675729108.6725454,
  "logger": "http.acme_client",
  "msg": "HTTP request failed; retrying",
  "url": "https://acme.zerossl.com/v2/DV90",
  "error": "performing request: Get \"https://acme.zerossl.com/v2/DV90\": x509: certificate signed by unknown authority"
}
{
  "level": "warn",
  "ts": 1675729108.9304984,
  "logger": "http.acme_client",
  "msg": "HTTP request failed; retrying",
  "url": "https://acme.zerossl.com/v2/DV90",
  "error": "performing request: Get \"https://acme.zerossl.com/v2/DV90\": x509: certificate signed by unknown authority"
}
{
  "level": "warn",
  "ts": 1675729109.1888967,
  "logger": "http.acme_client",
  "msg": "HTTP request failed; retrying",
  "url": "https://acme.zerossl.com/v2/DV90",
  "error": "performing request: Get \"https://acme.zerossl.com/v2/DV90\": x509: certificate signed by unknown authority"
}
{
  "level": "error",
  "ts": 1675729109.1889954,
  "logger": "tls.obtain",
  "msg": "could not get certificate from issuer",
  "identifier": "showapi.goodcrm.co.uk",
  "issuer": "acme.zerossl.com-v2-DV90",
  "error": "registering account [mailto:me@example.com] with server: provisioning client: performing request: Get \"https://acme.zerossl.com/v2/DV90\": x509: certificate signed by unknown authority"
}
{
  "level": "error",
  "ts": 1675729109.3728323,
  "logger": "tls.obtain",
  "msg": "could not get certificate from issuer",
  "identifier": "showapi.goodcrm.co.uk",
  "issuer": "acme.zerossl.com-v2-DV90",
  "error": "account pre-registration callback: performing EAB credentials request: Post \"https://api.zerossl.com/acme/eab-credentials-email\": x509: certificate signed by unknown authority"
}
{
  "level": "error",
  "ts": 1675729109.3729205,
  "logger": "tls.obtain",
  "msg": "will retry",
  "error": "[showapi.goodcrm.co.uk] Obtain: account pre-registration callback: performing EAB credentials request: Post \"https://api.zerossl.com/acme/eab-credentials-email\": x509: certificate signed by unknown authority",
  "attempt": 1,
  "retrying_in": 60,
  "elapsed": 0.719900886,
  "max_duration": 2592000
}

5. What I already tried:

  • I’ve rebuilt the containers using --no-cache.
  • I’ve ensured I’ve removed old images and pulled the latest one.
  • Tried the Let’s Encrypt staging server, as well as the ZeroSSL one shown in this config.

One odd thing I notice, is that despite removing the images and rebuilding today, with a modified Caddyfile, the docker image still shows as 3 months old…

… but it contains the updates to the Caddyfile that I made today.

And all the Caddy images for the different clients are the same 3 months old, even the ones that are working.

The ones that are working could possibly be using a previously fetched certificate, I guess. They are slightly newer client deployments, so their certificates may not have expired yet and may still be stored in the mounted volume, perhaps?
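
One way to test that theory is to look inside the data volume for stored certificates. This is only a rough sketch: list_stored_certs is a made-up helper name, the volume name is an assumption (Compose usually prefixes it with the project name, so check `docker volume ls`), and it relies on a throwaway alpine container because the volume is mounted at /data and the log shows Caddy's storage at /data/caddy.

```shell
# Sketch: list certificate files Caddy has stored in a data volume.
# list_stored_certs is a hypothetical helper, not part of Docker or Caddy.
list_stored_certs() {
    # $1: name of the volume that the Caddy container mounts at /data
    # The volume is mounted read-only in a temporary alpine container.
    docker run --rm -v "$1":/data:ro alpine find /data/caddy -name '*.crt'
}

# Example (volume name is an assumption):
# list_stored_certs myproject_caddy_data
```

If this prints a .crt file for a working deployment, that deployment may simply be serving a certificate it obtained earlier.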

6. Links to relevant resources:

You can remove the empty servers block from the global options; it doesn’t do anything. HTTP/3 is enabled by default since v2.6.0.

That’s weird. The Caddy docker image should have the ca-certificates package installed, so it should trust ZeroSSL’s certificate.

Are you sure you’re not doing something weird with your build that would clobber the certificates in the container?

You could try adding curl to the container and trying to make requests from inside with curl -v, see what information you get from that.
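
Something along these lines might do it. It is a sketch that assumes the container really is Alpine-based (so /bin/sh and apk exist) and that you pass the right container name; check_acme_tls is just a hypothetical helper name.

```shell
# Sketch: make a verbose HTTPS request from inside a running container.
# Assumes an Alpine-based image with /bin/sh and apk available.
# check_acme_tls is a hypothetical helper name.
check_acme_tls() {
    # $1: container name or ID, $2: URL to request
    # Installs curl, then prints the TLS handshake details (curl -v writes to stderr).
    docker exec "$1" sh -c \
        "apk add --no-cache curl >/dev/null && curl -sv -o /dev/null '$2'"
}

# Example (container name is an assumption):
# check_acme_tls api_caddy_1 https://acme.zerossl.com/v2/DV90
```

The verbose output should show which certificate chain the server presents and where verification fails.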

I know, right!?

I don’t seem to be able to get ‘into’ the container to check the existence of files or to install/run curl. When I try, I get told: "/bin/sh": stat /bin/sh: no such file or directory: unknown. This is the same for any command I try to exec/run in the container other than caddy.

I should add that the original containers have been around for a few months, so I wondered if the old CA data had somehow been cached. I’ve blown away the volumes for one of these troublesome deployments and completely rebuilt without cache, but still no joy. I would consider removing all images; however, as mentioned above, some clients are somehow working fine, so I don’t want to stop them in order to remove the underlying images, in case I can’t get them back up either.

I’m just rebuilding everything locally now (the original issue was discovered when deploying to our remote server). Will update with the results shortly when that finally completes.

That doesn’t seem right. The base Caddy image should be on alpine linux which definitely has a shell.

It looks more like you don’t have the base image you think you do.

OK. So if I build the image locally, it doesn’t have the same problem, and will initialise the DNS-challenge without any problems (although my token isn’t authorised on my IP address), without any x509 errors.

I’ve also just inspected the local & remote images, and the created datetime on the problem containers is showing as “2022-10-14T…”, so clearly there’s a problem with this build in that it’s not correctly re-creating these images. So it looks like this might be an issue with Docker, and not Caddy.
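
For anyone wanting to run the same check, here is roughly what that inspection looks like. It is a sketch, and image_created and uses_current_image are hypothetical helper names; the container name in the example is an assumption.

```shell
# Sketch: check when an image was created and whether a container runs it.
# image_created and uses_current_image are hypothetical helper names.
image_created() {
    # $1: image reference, e.g. apiv2_caddy:0.1.0
    docker image inspect --format '{{.Created}}' "$1"
}

uses_current_image() {
    # $1: container name or ID, $2: image reference
    # Succeeds only if the container was started from the image the tag points to now.
    [ "$(docker inspect --format '{{.Image}}' "$1")" = \
      "$(docker image inspect --format '{{.Id}}' "$2")" ]
}

# Example (container name is an assumption):
# image_created apiv2_caddy:0.1.0
# uses_current_image api_caddy_1 apiv2_caddy:0.1.0 || echo "container runs a stale image"
```

Setting the DOCKER_CONTEXT environment variable before calling these should point them at the remote host instead of the local one.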

In case anyone comes across this post looking for a way around this, I ended up building the image locally, then saving & loading it onto the remote Docker context, which resolved the issue.
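
The save & load step can be done in one pipeline, roughly like this. It is a sketch: transfer_image is a hypothetical helper name and "remote" is a placeholder for whatever your Docker context is called.

```shell
# Sketch: copy a locally built image to another Docker context.
# transfer_image is a hypothetical helper name.
transfer_image() {
    # $1: image reference, $2: name of the target Docker context
    # docker save writes a tar stream to stdout; docker load reads it from stdin.
    docker save "$1" | docker --context "$2" load
}

# Example ("remote" is a placeholder context name):
# transfer_image apiv2_caddy:0.1.0 remote
```

After loading, the containers on the remote host still need to be recreated so that they pick up the new image.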

I’m really not sure why Docker (with BuildKit) would go through the whole build process and then just use the old image, but that seems to have been the cause here.
