The correct way to deploy Caddy to AWS ECS (Fargate); the task keeps respawning

1. The problem I’m having:

I don't know whether this is the appropriate place to ask, but maybe someone who uses Caddy has done this before.

Continuing from what I did before (links attached below), I'm trying to run Caddy on AWS ECS (Fargate). From the logs, I can see that Caddy is running. But somehow, the task keeps respawning after a few minutes. The cause is a failed health check (Task failed container health checks).

First, because the load balancer always gets status 308 when doing a health check, I added 308 to the list of success codes; now I use 200-499 to accept almost any status code, just to rule this out as the cause. And yet, it's still happening.

The second thing I did was add a specific response for requests with a specific header (User-Agent: "ELB-HealthChecker/2.0"). Nothing changed; the task is still respawning.

The last thing I did was add a new port (8000) that just responds with 'OK' and a 200 status code. And yes, the respawn still happens after this.

Below is the health check configuration that I use in AWS.

Target group:

protocol            = "HTTP"
port                = "8000"
path                = "/"
matcher             = "200-499"
healthy_threshold   = 3
unhealthy_threshold = 3
timeout             = 5
interval            = 150
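
For context, this is roughly where those settings sit if the target group is managed with Terraform; the resource name, target group name, and vpc_id reference below are illustrative, not my actual values:

resource "aws_lb_target_group" "caddy_health" {
  name        = "caddy-health"   # illustrative
  protocol    = "HTTP"
  port        = 8000
  target_type = "ip"             # required for Fargate (awsvpc) tasks
  vpc_id      = var.vpc_id       # illustrative

  health_check {
    protocol            = "HTTP"
    port                = "8000"
    path                = "/"
    matcher             = "200-499"
    healthy_threshold   = 3
    unhealthy_threshold = 3
    timeout             = 5
    interval            = 150
  }
}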

Task definition:

"command": ["CMD-SHELL", "curl -f http://localhost:8000 || exit 1"],
"startPeriod": 30,
"retries": 3,
"timeout": 5,
"interval": 150

I used this command when I tried the third option mentioned above:

"command": ["CMD-SHELL", "curl -f http://localhost:8000 || exit 1"],

Before that, I used this:

"command": ["CMD-SHELL", "curl -f http://localhost|| exit 1"],

This is the latest task definition port mapping that I use to allow inbound traffic on ports 80, 443, and 8000:

    "portMappings" : [{
      "containerPort" : 80,
      "hostPort" : 80
    }, {
      "containerPort" : 443,
      "hostPort" : 443
    }, {
      "containerPort" : 8000,
      "hostPort" : 8000
    }],

With this kind of configuration, which endpoint/port should I use for the health check? Any help will be appreciated. Thank you.

2. Error messages and/or full log output:

February 15, 2024 at 12:36 (UTC+7:00)	{"level":"info","ts":1707975381.4023943,"msg":"serving initial configuration"}	prod-caddy-lb-task
February 15, 2024 at 12:36 (UTC+7:00)	{"level":"info","ts":1707975381.4022532,"msg":"autosaved config (load with --resume flag)","file":"/config/caddy/autosave.json"}	prod-caddy-lb-task
February 15, 2024 at 12:36 (UTC+7:00)	{"level":"warn","ts":1707975381.404415,"logger":"tls","msg":"storage cleaning happened too recently; skipping for now","storage":"{\"client_type\":\"simple\",\"address\":[\"some-ip:6379\"],\"host\":[],\"port\":[],\"db\":0,\"timeout\":\"5\",\"username\":\"\",\"password\":\"REDACTED\",\"master_name\":\"\",\"key_prefix\":\"caddy\",\"encryption_key\":\"\",\"compression\":false,\"tls_enabled\":false,\"tls_insecure\":true,\"tls_server_certs_pem\":\"\",\"tls_server_certs_path\":\"\",\"route_by_latency\":false,\"route_randomly\":false}","instance":"b7e9ab8b-f94b-4ed5-83bd-ac6743546394","try_again":1708061781.4044127,"try_again_in":86399.999999649}	prod-caddy-lb-task
February 15, 2024 at 12:36 (UTC+7:00)	{"level":"info","ts":1707975381.4053752,"logger":"tls","msg":"finished cleaning storage units"}	prod-caddy-lb-task
February 15, 2024 at 12:37 (UTC+7:00)	{"level":"info","ts":1707975428.6299021,"logger":"http.log.access.log2","msg":"handled request","request":{"remote_ip":"172.31.41.87","remote_port":"5080","client_ip":"172.31.41.87","proto":"HTTP/1.1","method":"POST","host":"52.77.150.113","uri":"/","headers":{"X-Forwarded-For":["135.125.218.67"],"X-Forwarded-Proto":["https"],"Content-Length":["20"],"Accept":["*/*"],"Content-Type":["application/x-www-form-urlencoded"],"X-Forwarded-Port":["443"],"X-Amzn-Trace-Id":["Root=1-65cda304-418702083d82abab56c539c8"],"Accept-Encoding":["gzip, deflate"],"User-Agent":["Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36"]}},"bytes_read":0,"user_id":"","duration":0.000060274,"size":0,"status":308,"resp_headers":{"Location":["https://52.77.150.113/"],"Content-Type":[],"Server":["Caddy"],"Connection":["close"]}}	prod-caddy-lb-task
February 15, 2024 at 12:49 (UTC+7:00)	{"level":"info","ts":1707976167.572865,"logger":"admin","msg":"stopped previous server","address":"0.0.0.0:2020"}	prod-caddy-lb-task
February 15, 2024 at 12:49 (UTC+7:00)	{"level":"info","ts":1707976167.5724022,"logger":"http","msg":"servers shutting down with eternal grace period"}	prod-caddy-lb-task
February 15, 2024 at 12:49 (UTC+7:00)	{"level":"warn","ts":1707976167.5723004,"msg":"exiting; byeee!! 👋","signal":"SIGTERM"}	prod-caddy-lb-task
February 15, 2024 at 12:49 (UTC+7:00)	{"level":"info","ts":1707976167.5721169,"msg":"shutting down apps, then terminating","signal":"SIGTERM"}	prod-caddy-lb-task
February 15, 2024 at 12:49 (UTC+7:00)	{"level":"info","ts":1707976167.5729537,"msg":"shutdown complete","signal":"SIGTERM","exit_code":0}	prod-caddy-lb-task

3. Caddy version:

Version 2.7.6

4. How I installed and ran Caddy:

I install it using a Dockerfile.

a. System environment:

AWS ECS (Fargate)

b. Command:

# In the Dockerfile
CMD ["/usr/bin/caddy", "run", "--config", "/etc/caddy/Caddyfile"]

c. Service/unit/compose file:

# Dockerfile

FROM caddy:2.7-builder-alpine AS builder

RUN go install github.com/caddyserver/xcaddy/cmd/xcaddy@latest
RUN xcaddy build master \
  --with github.com/caddy-dns/route53 \
  --with github.com/pberkel/caddy-storage-redis \
  --with github.com/ueffel/caddy-brotli

FROM caddy:2.7-alpine

COPY --from=builder /usr/bin/caddy /usr/bin/caddy
COPY Caddyfile /etc/caddy/Caddyfile

RUN /usr/bin/caddy fmt --overwrite /etc/caddy/Caddyfile

# Expose ports
EXPOSE 80

CMD ["/usr/bin/caddy", "run", "--config", "/etc/caddy/Caddyfile"]

d. My complete Caddy config:

{
	admin 0.0.0.0:2020

	on_demand_tls {
		ask https://api.different-domain.id/domain-checker
	}

	storage redis {
		host 127.0.0.1
		port 6379
		db 0
		timeout 5
		key_prefix "caddy"
		tls_enabled false
		tls_insecure true
	}
}

*.company-domain.id {
	tls company.email@gmail.com {
		dns route53 {
			max_retries 10
			access_key_id ...
			secret_access_key ...
		}
	}

	@awsHealthCheck {
		header User-Agent "ELB-HealthChecker/2.0"
	}
	respond @awsHealthCheck 200

	encode gzip

	reverse_proxy /storefront/* https://api.different-domain.id {
		header_up oo-api-key some-api-key
	}

	reverse_proxy /page/* some-subdomain.ap-southeast-1.elasticbeanstalk.com {
		header_up Host {host}
		header_up X-Real-IP {header.X-Forwarded-For}
		header_down Access-Control-Allow-Origin "*"
	}

	reverse_proxy /* {
		to target-ip-with-port

		lb_policy least_conn
		fail_duration 30s

		header_up Host {host}
		header_up X-Real-IP {header.X-Forwarded-For}
		header_up Access-Control-Allow-Origin *
		header_up Access-Control-Allow-Methods "GET, POST, PUT, PATCH, OPTIONS, DELETE"
		header_down Access-Control-Allow-Origin *
		header_down Access-Control-Allow-Methods "GET, POST, PUT, PATCH, OPTIONS, DELETE"
	}

	@assets path /js* /css* /favicon.ico
	header @assets Cache-Control "public, max-age=31536000;"

	log {
		output stdout
		format json
	}
}

:443 {
	tls company.email@gmail.com {
		dns route53 {
			max_retries 10
			access_key_id ...
			secret_access_key ...
		}
	}

	@awsHealthCheck {
		header User-Agent "ELB-HealthChecker/2.0"
	}
	respond @awsHealthCheck 200

	encode gzip

	reverse_proxy /storefront/* https://api.different-domain.id {
		header_up oo-api-key some-api-key
	}

	reverse_proxy /page/* some-subdomain.ap-southeast-1.elasticbeanstalk.com {
		header_up Host {host}
		header_up X-Real-IP {header.X-Forwarded-For}
		header_down Access-Control-Allow-Origin "*"
	}

	reverse_proxy /* {
		to target-ip-with-port

		lb_policy least_conn
		fail_duration 30s

		header_up Host {host}
		header_up X-Real-IP {header.X-Forwarded-For}
	}

	@assets path /js* /css* /favicon.ico
	header @assets Cache-Control "public, max-age=31536000;"

	log {
		output stdout
		format json
	}
}

# NOTE: For health checks
:8000 {
	respond "OK"
}

5. Links to relevant resources:

You don't need the RUN go install github.com/caddyserver/xcaddy/cmd/xcaddy@latest line; the builder image already comes with xcaddy.

Are you sure xcaddy build master is what you want? This'll build the latest version from Caddy's git repo, which has no stability guarantees.

Remove this, it’s a no-op (does nothing).

Remove the header_up Access-Control-Allow-* lines; those are response headers, but header_up sets request headers, so it's a mismatch.

Your upstream app should read from X-Forwarded-For instead.

Is this health check command running inside your container? It might be that curl isn't installed in the container.

You can add it with RUN apk add curl
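
For example, the final stage of your Dockerfile could look like this (just a sketch; --no-cache simply avoids keeping the apk index in the image):

FROM caddy:2.7-alpine

# The alpine-based caddy image does not ship curl, so the ECS health check
# command ("curl -f ... || exit 1") fails even when Caddy itself is healthy.
RUN apk add --no-cache curl

COPY --from=builder /usr/bin/caddy /usr/bin/caddy
COPY Caddyfile /etc/caddy/Caddyfile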

Sorry for the very late reply.

Regarding the master branch that I use: we wanted to try the storage import/export feature that exists on the master branch, but we decided not to use it and to just let Let's Encrypt/ZeroSSL do the job.

And you're correct about the last part: the issue is that we don't have curl in the container. That's why the health check always failed no matter what we changed.


It's in v2.7.0, so you don't need to use master for that anymore. It first landed in "cmd: Implement storage import/export" (caddyserver/caddy PR #5532).
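
If you do want to try it, the subcommands look roughly like this (flag names from memory, so double-check with caddy help storage):

caddy storage export --config /etc/caddy/Caddyfile --output caddy-storage.tar
caddy storage import --config /etc/caddy/Caddyfile --input caddy-storage.tar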
