Caddy 502 only on uploads

1. Caddy version (caddy version):

v2.4.5

2. How I run Caddy:

Just the official Docker container on an Ubuntu server. It’s paired with another Docker container that hosts my Django backend. The details are visible in my docker-compose file below.

a. System environment:

Official docker container caddy:latest
Docker version 20.10.7, build 20.10.7-0ubuntu1~20.04.2

b. Command:

docker-compose up -d

c. Service/unit/compose file:

version: "3.3"

services:
  redis:
    image: redis:6.0.9

  django:
    image: 743943630914.dkr.ecr.eu-central-1.amazonaws.com/esmerise-django:latest
    deploy:
      restart_policy:
        condition: on-failure
    build:
      context: ..
      dockerfile: ./docker/Dockerfile.django
    volumes:
      - ../esmerise:/app/esmerise

  caddy:
    image: 743943630914.dkr.ecr.eu-central-1.amazonaws.com/esmerise-caddy-dev:latest
    command: ["caddy", "run", "--config", "/etc/caddy/Caddyfile"]

    restart: on-failure

    build:
      context: ..
      dockerfile: ./docker/Dockerfile.dev.caddy

    links: ["django"]
    ports:
      - 80:80
      - 443:443
    volumes:
      - ../web-build:/www/data

d. My complete Caddyfile or JSON config:

{
    debug
    on_demand_tls {
        ask      http://django:8001/api/v1/academies/tls
        interval 2m
        burst    5
    }
}

https:// {
    root * /www/data

    tls {
        on_demand
    }

    route {
        @api path /api/*
        reverse_proxy @api http://django:8001
    }

    route {
        @ws {
            path /ws/*
            header Connection *Upgrade*
            header Upgrade websocket
        }
        reverse_proxy @ws http://django:8001
    }

    route {
        try_files {path} /index.html
        file_server /*
    }
}

3. The problem I’m having:

Hi! I’m having some problems with Caddy. The server works well in every situation but one: when I test an upload that takes more than a few seconds, I get a 502 error. The error appears between 30s and 1m after the upload starts. Uploads that take less time work fine.

It seems to be related to timing rather than file size, because if I throttle the connection while uploading small files, I get the same error. In the docker logs I always find the two Caddy entries reported below. Nothing from Django.

The PATCH request for the upload goes to Django through the reverse_proxy @api http://django:8001 line that you can see in the Caddyfile above.

4. Error messages and/or full log output:

 caddy_1   |{"level":"debug","ts":1633601769.9896336,"logger":"http.handlers.reverse_proxy","msg":"upstream roundtrip","upstream":"django:8001","request":{"remote_addr":"83.49.93.252:62435","proto":"HTTP/2.0","method":"PATCH","host":"dev.esmerise.com","uri":"/api/v1/academies/2/chapters/1/content/32","headers":{"Referer":["https://dev.esmerise.com/cooking/chapter/1/content/32/edit?category=video"],"Sec-Ch-Ua":["\"Google Chrome\";v=\"93\", \" Not;A Brand\";v=\"99\", \"Chromium\";v=\"93\""],"Sec-Ch-Ua-Mobile":["?0"],"Content-Type":["multipart/form-data; boundary=----WebKitFormBoundaryMBNVKtH02Ezx95AA"],"User-Agent":["Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36"],"Origin":["https://dev.esmerise.com"],"Accept-Language":["en-GB,en;q=0.9,es-ES;q=0.8,es;q=0.7,en-AU;q=0.6,it-IT;q=0.5,it;q=0.4,en-US;q=0.3"],"Cookie":["_fbp=fb.1.2631369334978.1223390255; __stripe_mid=88d74cc4-f42d-46ea-9dc3-1908ac218a6cf345c4; __stripe_mid=88d74cc4-f42d-46ea-9dc3-1908ac218a6cf345c4; _fbc=fb.1.2632566224470.IwAR3G3f3qCrMHzQp3bxuLS57zJfMJ94gmBu2CmvoxrX5H3hBjj3SrYbSLzcE; __stripe_sid=88f0e473-5139-43dc-b860-a6488551c7b96c8654"],"X-Forwarded-For":["83.49.93.252"],"Dnt":["1"],"Authorization":["Token 8a41e1c69568972c602ed85cf062f6f3b7859db1"],"Accept":["application/json"],"Sec-Ch-Ua-Platform":["\"macOS\""],"Sec-Fetch-Site":["same-origin"],"Content-Length":["1564096991"],"Sec-Fetch-Mode":["cors"],"Sec-Fetch-Dest":["empty"],"Accept-Encoding":["gzip, deflate, br"],"X-Forwarded-Proto":["https"]},"tls":{"resumed":false,"version":772,"cipher_suite":4865,"proto":"h2","proto_mutual":true,"server_name":"dev.esmerise.com"}},"duration":60.159858076,"error":"readfrom tcp 172.18.0.5:57596->172.18.0.2:8001: write tcp 172.18.0.5:57596->172.18.0.2:8001: use of closed network connection"}
 caddy_1   | {"level":"error","ts":1633601769.9899337,"logger":"http.log.error","msg":"readfrom tcp 172.18.0.5:57596->172.18.0.2:8001: write tcp 172.18.0.5:57596->172.18.0.2:8001: use of closed network connection","request":{"remote_addr":"83.49.93.252:62435","proto":"HTTP/2.0","method":"PATCH","host":"dev.esmerise.com","uri":"/api/v1/academies/2/chapters/1/content/32","headers":{"Sec-Fetch-Mode":["cors"],"Sec-Fetch-Dest":["empty"],"Accept-Encoding":["gzip, deflate, br"],"Content-Length":["1564096991"],"Sec-Ch-Ua-Mobile":["?0"],"Content-Type":["multipart/form-data; boundary=----WebKitFormBoundaryMBNVKtH02Ezx95AA"],"User-Agent":["Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36"],"Origin":["https://dev.esmerise.com"],"Referer":["https://dev.esmerise.com/cooking/chapter/1/content/32/edit?category=video"],"Sec-Ch-Ua":["\"Google Chrome\";v=\"93\", \" Not;A Brand\";v=\"99\", \"Chromium\";v=\"93\""],"Authorization":["Token 8a41e1c69568972c602ed85cf062f6f3b7859db1"],"Accept":["application/json"],"Sec-Ch-Ua-Platform":["\"macOS\""],"Sec-Fetch-Site":["same-origin"],"Accept-Language":["en-GB,en;q=0.9,es-ES;q=0.8,es;q=0.7,en-AU;q=0.6,it-IT;q=0.5,it;q=0.4,en-US;q=0.3"],"Cookie":["_fbp=fb.1.2631369334978.1223390255; __stripe_mid=88d74cc4-f42d-46ea-9dc3-1908ac218a6cf345c4; __stripe_mid=88d74cc4-f42d-46ea-9dc3-1908ac218a6cf345c4; _fbc=fb.1.2632566224470.IwAR3G3f3qCrMHzQp3bxuLS57zJfMJ94gmBu2CmvoxrX5H3hBjj3SrYbSLzcE; __stripe_sid=88f0e473-5139-43dc-b860-a6488551c7b96c8654"],"Dnt":["1"]},"tls":{"resumed":false,"version":772,"cipher_suite":4865,"proto":"h2","proto_mutual":true,"server_name":"dev.esmerise.com"}},"duration":60.160255159,"status":502,"err_id":"zj9a9r1j0","err_trace":"reverseproxy.statusError (reverseproxy.go:858)"}

5. What I already tried:

I tried setting all the timeout values to none and checking the server resources to see if I had memory issues, to no avail.

6. Links to relevant resources:

Your docker-compose config does a few weird things:

  • Why do you have both image and build? Typically those are mutually exclusive.
  • The command line is unnecessary, because that should already be the default.
  • You’re missing volumes for /data and /config, as described in the docs on Docker Hub. This is very important, because otherwise you may lose your certs and keys when recreating the container, forcing your server to reissue certificates. This is especially bad because you’re using on_demand, which means you’ll probably have quite a lot of certificates. You might start hitting rate limits.
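
For the missing volumes, a minimal sketch of the caddy service, assuming two named volumes that I have called caddy_data and caddy_config here (the names are arbitrary; the mount points /data and /config are the ones that matter):

    services:
      caddy:
        # image, build, ports, etc. stay as they are
        volumes:
          - ../web-build:/www/data
          - caddy_data:/data        # certificates, keys, and other persistent state
          - caddy_config:/config    # autosaved config

    # Named volumes (hypothetical names) declared at the top level
    volumes:
      caddy_data:
      caddy_config:

Named volumes survive container recreation and a plain docker-compose down, so certificates that were already issued should be reused instead of being requested again.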

Here, you probably want to use handle, not route, and it can be simplified:

    handle /api/* {
        reverse_proxy http://django:8001
    }

    handle {
        root * /www/data
        try_files {path} /index.html
        file_server
    }

The purpose of route is to override Caddy’s default directive ordering: directives inside a route run in the order they are written.

The route directive doesn’t provide mutual exclusivity; that’s the job of handle. With handle, only the first matching block gets executed, and any subsequent ones are skipped. This makes sense for fallback logic: in this case, if it’s not a request to your API, it goes to your static file server.

Also, the @ws chunk is redundant, because they’re both requests to /api/*, and are being proxied to the same place. The websocket matcher is only necessary if you need to route websocket requests to somewhere else.

Also, moving root closer to the try_files and file_server makes it a bit easier to reason about.

Seems like your django app is closing the connection before the upload is done. Then Caddy throws up its arms saying “welp, can’t upload anymore, I got nowhere to send it”.

This isn’t a problem with Caddy; it’ll be a problem with either Django or I guess maybe gunicorn (if that’s what you’re using?).

Thank you for your time Francis, your response helped me resolve the issue and also contains some really appreciated insights :slightly_smiling_face:

I was using Daphne, which is an ASGI server.

I posted here because I had a very similar config with Nginx instead of Caddy, which did not have this problem, so I assumed there was something wrong with Caddy itself.

I did find out that the problem was Daphne-related, as switching to Uvicorn solved it.

What was the problem, specifically, in case someone finds this and has the same issue?

:sweat_smile: I made the strategic choice to try switching to Uvicorn after a bit of unsuccessful Daphne-digging: a fortunate maneuver that prevented me from actually finding out what was wrong with Daphne.

Fair enough :+1: