K8s + ingress-nginx + caddy = 502 bad gateway (sometimes)

1. The problem I’m having:

I’m trying to serve my React app from a k8s cluster (GCP GKE), which results in a 502 Bad Gateway.
I am using ingress-nginx (installed with Terraform + Helm) as my LoadBalancer controller.
My app has the standard k8s files for deploying it: Deployment, Service, Ingress. They 100% work and are configured as per my requirements.

Whenever I try to access my app from the browser, I get the 502 Bad Gateway error message from nginx, but not always. Here is what I mean.

If I declare the site block in my Caddyfile like this, I get the 502:

test.bildu.lt {
   ...
}

But if I declare my site block like this, it works perfectly:

:80 {
   ...
}

Obviously I don’t want to leave :80 as my site address. I could, because my network is private and the only way to the app is through the LB with Ingress rules, but I want the fine-grained control that I can achieve with site blocks.

I understand that this issue might be more related to my LB (ingress-nginx) controller than to Caddy, but maybe, just maybe, I’m missing something and someone knows what that is.

2. Error messages and/or full log output:

Some nginx shenanigans:

2023/03/11 01:22:42 [error] 317#317: *385210 connect() failed (111: Connection refused) while connecting to upstream, client: [SORRY, this is personal, my IP], server: test.bildu.lt, request: "GET / HTTP/1.1", upstream: "http://10.48.0.95:80/", host: "test.bildu.lt"
2023/03/11 01:22:42 [error] 317#317: *385210 connect() failed (111: Connection refused) while connecting to upstream, client: [SORRY, this is personal, my IP], server: test.bildu.lt, request: "GET / HTTP/1.1", upstream: "http://10.48.0.95:80/", host: "test.bildu.lt"
2023/03/11 01:22:42 [error] 317#317: *385210 connect() failed (111: Connection refused) while connecting to upstream, client: [SORRY, this is personal, my IP], server: test.bildu.lt, request: "GET / HTTP/1.1", upstream: "http://10.48.0.95:80/", host: "test.bildu.lt"
[SORRY, this is personal, my IP] - - [11/Mar/2023:01:22:42 +0000] "GET / HTTP/1.1" 502 552 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36 Edg/110.0.1587.63" 478 0.004 [venteur-bildu-fe-service-http] [] 10.48.0.95:80, 10.48.0.95:80, 10.48.0.95:80 0, 0, 0 0.002, 0.001, 0.001 502, 502, 502 3effd6ab4763137451f2a5f7e20e3b0c

10.48.0.95 is the correct pod IP at this moment (reached through the Service resource).

Nothing useful in the Caddy output, even with debug enabled. Only the generic messages about loading the configuration:

{"level":"info","ts":1678498547.5579953,"msg":"using provided configuration","config_file":"/etc/caddy/Caddyfile","config_adapter":"caddyfile"}
{"level":"warn","ts":1678498547.5624495,"msg":"Caddyfile input is not formatted; run the 'caddy fmt' command to fix inconsistencies","adapter":"caddyfile","file":"/etc/caddy/Caddyfile","line":32}
{"level":"info","ts":1678498547.565772,"logger":"admin","msg":"admin endpoint started","address":"localhost:2019","enforce_origin":false,"origins":["//localhost:2019","//[::1]:2019","//127.0.0.1:2019"]}
{"level":"warn","ts":1678498547.5667331,"logger":"http","msg":"automatic HTTPS is completely disabled for server","server_name":"srv0"}
{"level":"info","ts":1678498547.5667405,"logger":"tls.cache.maintenance","msg":"started background certificate maintenance","cache":"0xc0001bb260"}
{"level":"info","ts":1678498547.6276002,"logger":"http","msg":"enabling HTTP/3 listener","addr":":443"}
{"level":"info","ts":1678498547.6284835,"msg":"failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Receive-Buffer-Size for details."}
{"level":"debug","ts":1678498547.6288266,"logger":"http","msg":"starting server loop","address":"[::]:443","tls":true,"http3":true}
{"level":"info","ts":1678498547.6296363,"logger":"http.log","msg":"server running","name":"srv0","protocols":["h1","h2","h3"]}
{"level":"info","ts":1678498547.6293275,"logger":"tls","msg":"cleaning storage unit","description":"FileStorage:/data/caddy"}
{"level":"info","ts":1678498547.6303158,"logger":"tls","msg":"finished cleaning storage units"}
{"level":"info","ts":1678498547.6308212,"msg":"autosaved config (load with --resume flag)","file":"/config/caddy/autosave.json"}
{"level":"info","ts":1678498547.6308525,"msg":"serving initial configuration"}

3. Caddy version:

v2.6.4 h1:2hwYqiRwk1tf3VruhMpLcYTg+11fCdr8S3jhNAdnPy8=

4. How I installed and ran Caddy:

a. System environment:

GCP GKE cluster
Container runtime containerd://1.6.9
Kubelet version v1.24.9-gke.3200

b. Command:

Image built with docker.

c. Service/unit/compose file:

Dockerfile:

### STAGE 1: Build ###
FROM node:latest AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

### STAGE 2: Run ###
FROM caddy:latest
COPY Caddyfile /etc/caddy/Caddyfile
COPY --from=build /app/build /var/www/html

d. My complete Caddy config:

{
	auto_https off
	debug
}

(cache) {
	@html {
		path /* #includes *.css *.js
		not path /api*
	}
	@assets {
		path *.png *.svg *.ico *.mp3 *.json
	}
	@content {
		not path *.png *.svg *.ico *.mp3 *.json
	}

	header @content {
		Cache-Control "max-age=0"
	}
	header @html {
		Cache-Control "max-age=2630000"
	}
	header @assets {
		Cache-Control "max-age=15780000"
	}
}

test.bildu.lt {
	tls {$SITE_TLS_EMAIL}

	root * /var/www/html
	try_files {path} /index.html
	file_server

	encode zstd gzip

	header {
		Strict-Transport-Security "max-age=31536000; includeSubDomains; preload; always"
		Content-Security-Policy "*****REDACTED*****"
		Permissions-Policy "geolocation=(), midi=(), sync-xhr=(), microphone=(), camera=(), magnetometer=(), gyroscope=(), fullscreen=(), payment=()"
		X-Frame-Options "SAMEORIGIN"
		X-Content-Type-Options "nosniff"
		X-XSS-Protection "1; mode=block"
		Referrer-Policy "strict-origin"
		Expect-CT "max-age=604800"
	}

	import cache
}

5. Links to relevant resources:

This might help: Making sense of auto_https and why disabling it still serves HTTPS instead of HTTP

If I understand correctly what you’re trying to do, then I think you might want to use

http://test.bildu.lt {
...

instead of

test.bildu.lt {
...
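
As far as I can tell from the logs above, the reason is a port mismatch. The Caddy startup log shows the server loop on [::]:443, while ingress-nginx connects to 10.48.0.95:80 and gets "Connection refused". A site address that is only a hostname defaults to the HTTPS port, and auto_https off disables certificate management and redirects but does not move the listener to port 80. Here is a minimal sketch of the HTTP-only variant, assuming TLS is terminated at ingress-nginx and the Service targets container port 80 (both are my assumptions, not something confirmed in the post):

```
{
	# Assumption: TLS ends at ingress-nginx, so Caddy only speaks plain HTTP.
	auto_https off
}

# The http:// scheme makes this site listen on the HTTP port (80)
# instead of the default HTTPS port (443), while still matching
# requests by the Host header test.bildu.lt.
http://test.bildu.lt {
	root * /var/www/html
	try_files {path} /index.html
	file_server
}
```

Writing the address as test.bildu.lt:80 should behave the same way. I left out the tls line in this sketch because an HTTP-only site has no use for it; the header, encode, and import cache directives from the original config can go back in unchanged.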

A man sent from heaven. Thanks.

If I may add a comment on top.
The reason I am turning auto_https off is that when it is on, I get a "too many redirects" error.

Ideally I would love to have TLS between my LB and Caddy, and that is something I’m actively looking at.