1. The problem I’m having:
I’m running a fleet of Caddy servers as reverse proxies that all load a remote JSON configuration from an S3 bucket:
// /etc/caddy/caddy.json
{
"admin": {
"config": {
"load": {
"module": "http",
"method": "GET",
"url": "https://[redacted].s3.amazonaws.com/servers/configuration/caddy.json"
}
}
}
}
I use Caddy to manage TLS for custom domains in the app. When I change the config file, e.g. add an upstream or a host (custom domain) for an upstream, I upload it to S3 and then run caddy reload --config /etc/caddy/caddy.json on all the servers in the fleet. Most of the time, this works fine.
However, I noticed that on rare occasions requests to a specific Caddy instance fail: Caddy simply returns an empty response and HTTP status 0.
Here’s a curl example when Caddy works:
$ curl --verbose https://privy.fairmint.cafe/series/series-a-roll-up/us
* Trying 99.83.186.151:443...
* Connected to privy.fairmint.cafe (99.83.186.151) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* CAfile: /etc/ssl/certs/ca-certificates.crt
* CApath: /etc/ssl/certs
* TLSv1.0 (OUT), TLS header, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS header, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS header, Finished (20):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.2 (OUT), TLS header, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: CN=privy.fairmint.cafe
* start date: Jan 15 05:13:19 2024 GMT
* expire date: Apr 14 05:13:18 2024 GMT
* subjectAltName: host "privy.fairmint.cafe" matched cert's "privy.fairmint.cafe"
* issuer: C=US; O=Let's Encrypt; CN=R3
* SSL certificate verify ok.
* Using HTTP2, server supports multiplexing
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* Using Stream ID: 1 (easy handle 0x5590ffd69e90)
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
> GET /series/series-a-roll-up/us HTTP/2
> Host: privy.fairmint.cafe
> user-agent: curl/7.81.0
> accept: */*
>
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
< HTTP/2 200
< alt-svc: h3=":443"; ma=2592000
< cache-control: public, max-age=0, s-maxage=2
< content-security-policy: default-src * data: blob: filesystem: about: ws: wss: 'unsafe-inline' 'unsafe-eval';
< content-type: text/html
< date: Mon, 15 Jan 2024 13:58:18 GMT
< etag: W/"43620ad3c0a461cb87f21ff787969b86"
< last-modified: Mon, 15 Jan 2024 07:34:34 GMT
< server: Caddy
< server: AmazonS3
< vary: Accept-Encoding
< via: 1.1 c84ecfd128e1f4c41a53a2b42410f3b8.cloudfront.net (CloudFront)
< x-amz-cf-id: V9g2Ul6uVp8OCCFK4sLBf3-jM1Av_GtdIbsG7sBGf6R1GUD05rqm7Q==
< x-amz-cf-pop: IAD89-C3
< x-cache: Miss from cloudfront
< x-frame-options: SAMEORIGIN
<
* TLSv1.2 (IN), TLS header, Supplemental data (23):
<!doctype html><html lang="en"><head><meta charset="utf-8"/><link rel="shortcut icon" id="favicon" href="/favicon.png"/><link rel="preconnect" href="https://fonts.googleapis.com"/><link rel="preconnect" href="https://fonts.gstatic.com"/><link rel="preload" href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500&subset=latin&display=swap" as="style" onload='this.onload=null,this.rel="stylesheet"'/><noscript><link rel="stylesheet" href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500&subset=latin&display=swap"/></noscript><link rel="preload" href="https://fonts.googleapis.com/css2?family=Plus+Jakarta+Sans:wght@600;700&subset=latin&display=swap" as="style" onload='this.onload=null,this.rel="stylesheet"'/><noscript><link rel="stylesheet" href="https://fonts.googleapis.com/css2?family=Plus+Jakarta+Sans:wght@600;700&subset=latin&display=swap"/></noscript><link rel="preload" href="https://fonts.googleapis.com/css2?family=Reenie+Beanie" as="style" onload='this.onload=null,this.rel="style* TLSv1.2 (IN), TLS header, Supplemental data (23):
* Connection #0 to host privy.fairmint.cafe left intact
sheet"'/><noscript><link rel="stylesheet" href="https://fonts.googleapis.com/css2?family=Reenie+Beanie"/></noscript><meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=1,user-scalable=0"/><meta name="theme-color" content="#000000"/><link rel="manifest" href="/manifest.json" crossorigin="use-credentials"/><title>Signup to our Investment Portal</title><script src="https://js.stripe.com/v3/"></script><script src="https://crypto-js.stripe.com/crypto-onramp-outer.js"></script><script defer="defer" src="/static/js/main.03d5c5be.js"></script><link href="/static/css/main.cac43388.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>
Here’s a curl example when it does NOT work:
$ curl --verbose https://privy.fairmint.cafe/series/series-a-roll-up/us
* Trying 99.83.186.151:443...
* Connected to privy.fairmint.cafe (99.83.186.151) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* CAfile: /etc/ssl/certs/ca-certificates.crt
* CApath: /etc/ssl/certs
* TLSv1.0 (OUT), TLS header, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS header, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS header, Finished (20):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.2 (OUT), TLS header, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: CN=privy.fairmint.cafe
* start date: Jan 15 05:13:19 2024 GMT
* expire date: Apr 14 05:13:18 2024 GMT
* subjectAltName: host "privy.fairmint.cafe" matched cert's "privy.fairmint.cafe"
* issuer: C=US; O=Let's Encrypt; CN=R3
* SSL certificate verify ok.
* Using HTTP2, server supports multiplexing
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* Using Stream ID: 1 (easy handle 0x561fc6b16e90)
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
> GET /series/series-a-roll-up/us HTTP/2
> Host: privy.fairmint.cafe
> user-agent: curl/7.81.0
> accept: */*
>
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
< HTTP/2 200
< server: Caddy
< content-length: 0
< date: Mon, 15 Jan 2024 07:56:33 GMT
<
* Connection #0 to host privy.fairmint.cafe left intact
I believe the issue is with reloading because in the access logs I can see the request being handled suspiciously fast, in less than a millisecond. That tells me the request wasn’t proxied anywhere and Caddy returned immediately. I’ve tested this, and it is the same speed at which Caddy handles a request when the new host (custom domain) isn’t found in the configuration file for any of the upstreams.
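To make that comparison repeatable, here is a minimal sketch that filters access log entries for the pattern described above: status 0 combined with a sub-millisecond duration. It assumes the access log holds one JSON object per line with the `logger`, `status`, and `duration` fields shown in the log samples in this post. The function name `find_unproxied` and the 1 ms threshold are my own choices for illustration, not anything Caddy defines.

```python
import json
from typing import Iterable, Iterator

# Hypothetical cut-off: the failing request in this post took about 0.0002 s
# and the healthy one about 0.03 s, so 1 ms separates those two examples.
FAST_THRESHOLD_SECONDS = 0.001


def find_unproxied(lines: Iterable[str],
                   threshold: float = FAST_THRESHOLD_SECONDS) -> Iterator[dict]:
    """Yield access log entries with status 0 and a duration below threshold."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or non-JSON lines
        if not isinstance(entry, dict):
            continue
        if entry.get("logger") != "http.log.access":
            continue
        if entry.get("status") == 0 and entry.get("duration", 0.0) < threshold:
            yield entry
```

It could be run against the access log configured in this post, for example by passing `open("/var/log/caddy/access.log")` as `lines` and printing the `ts` and `request.host` of each hit to see when a given server started misbehaving.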
2. Error messages and/or full log output:
Here’s an access log line from a request that Caddy handled suspiciously fast, returning an empty response with status 0, which I don’t understand:
{
"level": "info",
"ts": "2024-01-15T07:49:21.107Z",
"logger": "http.log.access",
"msg": "handled request",
"request": {
"remote_ip": "[redacted]",
"remote_port": "41750",
"proto": "HTTP/2.0",
"method": "GET",
"host": "[redacted]",
"uri": "[redacted]",
"headers": {
"Sec-Fetch-Dest": [
"document"
],
"Sec-Fetch-Site": [
"cross-site"
],
"Te": [
"trailers"
],
"User-Agent": [
"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:122.0) Gecko/20100101 Firefox/122.0"
],
"Accept": [
"text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8"
],
"Accept-Language": [
"en-US,en;q=0.5"
],
"Cookie": [],
"Cache-Control": [
"no-cache"
],
"Accept-Encoding": [
"gzip, deflate, br"
],
"Upgrade-Insecure-Requests": [
"1"
],
"Sec-Fetch-Mode": [
"navigate"
],
"Pragma": [
"no-cache"
]
},
"tls": {
"resumed": false,
"version": 772,
"cipher_suite": 4865,
"proto": "h2",
"server_name": "[redacted]"
}
},
"user_id": "",
"duration": 0.000208013, // suspiciously fast
"size": 0,
"status": 0,
"resp_headers": {
"Server": [
"Caddy"
]
}
}
Here’s an access log line from another server in the fleet that properly reloaded the remote configuration file and is processing requests as expected:
{
"level": "info",
"ts": "2024-01-15T11:18:05.884Z",
"logger": "http.log.access",
"msg": "handled request",
"request": {
"remote_ip": "[redacted]",
"remote_port": "31453",
"proto": "HTTP/2.0",
"method": "GET",
"host": "[redacted]",
"uri": "[redacted]",
"headers": {
"Sec-Ch-Ua-Platform": [
"\"macOS\""
],
"Accept-Encoding": [
"gzip, deflate, br"
],
"If-None-Match": [
"W/\"43620ad3c0a461cb87f21ff787969b86\""
],
"If-Modified-Since": [
"Mon, 15 Jan 2024 07:34:34 GMT"
],
"Sec-Ch-Ua": [
"\"Not_A Brand\";v=\"8\", \"Chromium\";v=\"120\", \"Google Chrome\";v=\"120\""
],
"Sec-Ch-Ua-Mobile": [
"?0"
],
"Upgrade-Insecure-Requests": [
"1"
],
"User-Agent": [
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
],
"Purpose": [
"prefetch"
],
"Sec-Fetch-Site": [
"none"
],
"Accept-Language": [
"en-US,en;q=0.9,hr;q=0.8,sr;q=0.7,bs;q=0.6"
],
"Cookie": [],
"Sec-Purpose": [
"prefetch;prerender"
],
"Accept": [
"text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7"
],
"Sec-Fetch-Mode": [
"navigate"
],
"Sec-Fetch-User": [
"?1"
],
"Sec-Fetch-Dest": [
"document"
]
},
"tls": {
"resumed": false,
"version": 772,
"cipher_suite": 4865,
"proto": "h2",
"server_name": "[redacted]"
}
},
"user_id": "",
"duration": 0.031532041, // notice how it takes a bit more as expected
"size": 0,
"status": 304,
"resp_headers": {
"X-Amz-Cf-Pop": [
"IAD89-C3"
],
"Server": [
"Caddy",
"AmazonS3"
],
"Last-Modified": [
"Mon, 15 Jan 2024 07:34:34 GMT"
],
"Cache-Control": [
"public, max-age=0, s-maxage=2"
],
"Via": [
"1.1 a075746ea1824aa1c02a5e26a9e968e4.cloudfront.net (CloudFront)"
],
"Date": [
"Mon, 15 Jan 2024 11:18:05 GMT"
],
"Etag": [
"W/\"43620ad3c0a461cb87f21ff787969b86\""
],
"Vary": [
"Accept-Encoding"
],
"Content-Security-Policy": [
"default-src * data: blob: filesystem: about: ws: wss: 'unsafe-inline' 'unsafe-eval';"
],
"X-Cache": [
"Miss from cloudfront"
],
"X-Amz-Cf-Id": [
"TIG2vgzcGhYyrrCvIfPA0Exm-DBQ4ibxgXR4ZH0klIqyp-7PyZDFXg=="
],
"X-Frame-Options": [
"SAMEORIGIN"
]
}
}
3. Caddy version:
2.6.4
4. How I installed and ran Caddy:
a. System environment:
Docker (Linux)
b. Command:
caddy run --config /etc/caddy/caddy.json
d. My complete Caddy config:
My Caddy JSON config simply loads the real configuration file from S3:
{
"admin": {
"config": {
"load": {
"module": "http",
"method": "GET",
"url": "https://[redacted].s3.amazonaws.com/servers/configuration/caddy.json"
}
}
}
}
Here’s the config file that’s stored on S3.
I understand you said not to redact anything except credentials, but I redacted domain names. Since this config file works on the other servers and only occasionally fails after a reload, I think that’s fine and the issue isn’t with a domain or its DNS records.
{
"apps": {
"http": {
"servers": {
"proxy_status_server": {
"listen": [
":8082"
],
"automatic_https": {
"disable": true,
"disable_redirects": true
},
"routes": [
{
"match": [
{
"path": [
"/custom-domains-proxy-status"
]
}
],
"handle": [
{
"handler": "static_response",
"status_code": 200,
"body": "OK"
}
],
"terminal": true
}
]
},
"tls_terminator": {
"listen": [
":443"
],
"routes": [
{
"match": [
{
"host": [
"subdomain.company.com"
]
}
],
"handle": [
{
"handler": "reverse_proxy",
"upstreams": [
{
"dial": "app.company.com:443"
}
],
"transport": {
"protocol": "http",
"tls": {}
},
"headers": {
"request": {
"set": {
"Host": [
"{http.reverse_proxy.upstream.host}"
],
"X-Served-For": [
"{http.request.host}"
],
"X-Forwarded-Proto": [
"https"
],
"X-Forwarded-Host": [
"{http.reverse_proxy.upstream.host}"
],
"X-Forwarded-For": [
"{http.request.remote.host}"
],
"X-SaaS-Domains-IP": [
"{http.request.remote.host}"
]
}
}
}
}
],
"terminal": true
}
],
"logs": {}
}
}
},
"tls": {
"automation": {
"policies": [
{
"on_demand": true
}
],
"on_demand": {
"ask": "https://[redacted_domain]/control/caddy/ask",
"rate_limit": {
"interval": "10m",
"burst": 100
}
}
},
"cache": {
"capacity": 100000
}
}
},
"admin": {
"identity": {
"issuers": [
{
"module": "acme",
"email": "[redacted]"
}
]
}
},
"logging": {
"logs": {
"default": {
"exclude": [
"http.log.access"
],
"writer": {
"output": "file",
"filename": "/var/log/caddy/caddy.log",
"roll": true,
"roll_size_mb": 64,
"roll_keep": 20
},
"encoder": {
"format": "json",
"time_format": "iso8601"
}
},
"log0": {
"writer": {
"output": "file",
"filename": "/var/log/caddy/access.log",
"roll": true,
"roll_size_mb": 64,
"roll_keep": 20
},
"encoder": {
"format": "json",
"time_format": "iso8601"
},
"include": [
"http.log.access"
]
}
}
},
"storage": {
"module": "s3",
"host": "s3.amazonaws.com",
"bucket": "[redacted_bucket_name]",
"prefix": "[redacted]",
"insecure": false
}
}
Is it possible that a reload doesn’t trigger Caddy to fetch the changed file from S3? We trigger a Caddy reload every 10 minutes, and sometimes this issue persists for hours on one of the servers until we restart Caddy on that server.
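One way to check this directly on an affected server would be to compare the config Caddy is actually running with the file on S3. The sketch below assumes the admin API is reachable at Caddy's default address (`localhost:2019`) inside the container, that `GET /config/` returns the config loaded from S3 rather than the local bootstrap file, and that the S3 object can be fetched with a plain GET. The helper names are hypothetical, and only top-level route matchers are inspected (nested subroutes are not traversed).

```python
import json
import urllib.request

# Assumption: the admin endpoint listens on Caddy's default address.
ADMIN_CONFIG_URL = "http://localhost:2019/config/"


def fetch_json(url: str, timeout: float = 5.0):
    """Fetch a URL and parse the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def configured_hosts(config) -> set:
    """Collect every host matcher from the top-level routes of all HTTP servers."""
    hosts = set()
    servers = (config or {}).get("apps", {}).get("http", {}).get("servers", {})
    for server in servers.values():
        for route in server.get("routes", []):
            for matcher in route.get("match", []):
                hosts.update(matcher.get("host", []))
    return hosts


def missing_hosts(running, expected) -> set:
    """Hosts present in the expected (S3) config but absent from the running one."""
    return configured_hosts(expected) - configured_hosts(running)
```

Calling `missing_hosts(fetch_json(ADMIN_CONFIG_URL), fetch_json(s3_url))`, where `s3_url` is the same URL used in the `load` block, would return a non-empty set on a server that is still serving a stale config.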
If it helps, I’m seeing a lot of these error lines in my caddy logs:
{"level":"error","ts":"2024-01-15T14:12:24.934Z","logger":"http.log","msg":"setting HTTP/3 Alt-Svc header","error":"no port can be announced, specify it explicitly using Server.Port or Server.Addr"}
I see someone else had a similar issue and they correlated it with this error line: