1. The problem I’m having:
During automated deployments in AWS ECS Fargate, we’re experiencing DNS resolution issues with Caddy. Our setup consists of two containers in the same task definition: a Node.js API and Caddy as a reverse proxy. The workflow is:
- ECS deploys a new task
- A script updates the Route53 A record with the new task’s IP and waits for INSYNC status (a sketch of this step is shown after this list)
- The API container starts and ECS waits for its health check to pass
- The Caddy container then starts and proxies to the API at 127.0.0.1:3000
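Roughly, the Route53 step looks like this (a simplified sketch, not our exact script; the hosted zone ID is a placeholder and the real script resolves the new task’s IP before this point):

#!/bin/sh
# Illustrative sketch of the Route53 update step (not the exact deployment script).
# HOSTED_ZONE_ID and TASK_IP are placeholders; the real script looks up the new
# Fargate task's IP from the ECS/EC2 APIs before this point.
set -eu

RECORD_NAME="api-v1.domain.com"
HOSTED_ZONE_ID="Z0000000000EXAMPLE"
TASK_IP="$1"

# UPSERT the A record to point at the new task's IP and capture the change ID.
CHANGE_ID=$(aws route53 change-resource-record-sets \
  --hosted-zone-id "$HOSTED_ZONE_ID" \
  --change-batch "{
    \"Changes\": [{
      \"Action\": \"UPSERT\",
      \"ResourceRecordSet\": {
        \"Name\": \"$RECORD_NAME\",
        \"Type\": \"A\",
        \"TTL\": 60,
        \"ResourceRecords\": [{\"Value\": \"$TASK_IP\"}]
      }
    }]
  }" \
  --query 'ChangeInfo.Id' --output text)

# Block until Route53 reports the change as INSYNC.
aws route53 wait resource-record-sets-changed --id "$CHANGE_ID"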
After a CI/CD pipeline deployment, Caddy seems unable to handle DNS resolution properly and requests fail. However, if we manually stop the task and let ECS create a new one automatically (as shown below), everything works perfectly. This suggests there might be an issue with how Caddy handles DNS resolution during the initial deployment.
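The manual workaround is simply stopping the running task from the CLI (or console) and letting the ECS service schedule a replacement; the cluster name and task ID below are placeholders:

# Stop the running task; the ECS service starts a replacement automatically.
aws ecs stop-task --cluster my-cluster --task <task-id>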
2. Error messages and/or full log output:
{
  "level": "info",
  "ts": 1734084306.6575897,
  "logger": "http.handlers.reverse_proxy",
  "msg": "selected upstream",
  "dial": "127.0.0.1:3000",
  "total_upstreams": 1
}
3. Caddy version:
Caddy v2.8.4 (caddy:2.8.4-alpine Docker image)
4. How I installed and ran Caddy:
a. System environment:
- AWS ECS Fargate
- Two containers in same task definition
- Container 1: Node.js API
- Container 2: Caddy reverse proxy
- AWS Route53 for DNS management
b. Command:
caddy run --config /etc/caddy/Caddyfile --adapter caddyfile
c. Service/unit/compose file:
container_definitions = jsonencode([
  {
    name      = "caddy"
    image     = "caddy:2.8.4-alpine"
    essential = true
    command = [
      "sh", "-c", <<-EOT
        echo '{
          log {
            output stdout
            format json
            level DEBUG
          }
          storage file_system {
            root "/data/caddy"
          }
        }
        https://api-v1.domain.com {
          reverse_proxy 127.0.0.1:3000 {
            header_up Access-Control-Allow-Origin "*"
            header_up Access-Control-Allow-Methods "*"
            header_up Access-Control-Allow-Headers "*"
            header_up Access-Control-Allow-Credentials "true"
            header_up Access-Control-Expose-Headers "*"
            health_uri /api/health
            health_interval 60s
            max_fails 10
          }
        }' > /etc/caddy/Caddyfile && sleep 40 &&
        caddy run --config /etc/caddy/Caddyfile --adapter caddyfile
      EOT
    ]
    portMappings = [
      {
        containerPort = 80
      },
      {
        containerPort = 443
      }
    ]
  }
])
d. My complete Caddy config:
{
    log {
        output stdout
        format json
        level DEBUG
    }
    storage file_system {
        root "/data/caddy"
    }
}

https://api-v1.domain.com {
    reverse_proxy 127.0.0.1:3000 {
        header_up Access-Control-Allow-Origin "*"
        header_up Access-Control-Allow-Methods "*"
        header_up Access-Control-Allow-Headers "*"
        header_up Access-Control-Allow-Credentials "true"
        header_up Access-Control-Expose-Headers "*"
        health_uri /api/health
        health_interval 60s
        max_fails 10
    }
}
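For reference, the two endpoints involved can be probed like this (the health path comes from the Caddyfile above; the first command only works from inside the task):

# Hit the API's health endpoint directly, bypassing Caddy (run inside the task).
curl -sS http://127.0.0.1:3000/api/health

# Hit the same path through Caddy from outside, with verbose output for TLS/DNS details.
curl -v https://api-v1.domain.com/api/health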
5. Links to relevant resources:
The key observation is that manually stopping the task and letting ECS recreate it always fixes the issue, suggesting that something in how Caddy handles DNS resolution during the initial pipeline deployment differs from subsequent task creations.