"error": "short write" with heavy and slow pages

Hi,

(and thanks for all that amazing stuff)

1. The problem I’m having:

Caddy is (newly) setup in between legacy IIS/ARR reverse proxies and IIS application servers. Intent is to filter some requests as it becomes tricky to do so with regex and url_rewrite on IIS.

Communication scheme:
client --https–> IIS/ARR --http–> caddy --http–> IIS/App

Caddy is doing its job perfectly relaying requests and responses except in one case, with one heavy page wich is about 9MB and renders in a bit more of 20 seconds.

Bypassing Caddy, IIS/ARR is providing HTTP 200 response with 9MB in 22 seconds.
With Caddy, IIS/ARR is providing HTTP 502 response with default IIS error page for this error in ~20 seconds. Caddy is logging an HTTP 0 (see log below).
messages : “aborting with incomplete response” and “short write” are provided.

I digged into documentation for any timeout/limit available directive and tested various combinations without any success.

2. Error messages and/or full log output:

I have to redact hostname, application is not public.

{
  "level": "error",
  "ts": "2024-04-25T18:18:55.464Z",
  "logger": "http.handlers.reverse_proxy",
  "msg": "aborting with incomplete response",
  "upstream": "{upstream2}",
  "duration": 21.179585692,
  "request": {
    "remote_ip": "10.20.3.8",
    "remote_port": "65277",
    "client_ip": "10.20.3.8",
    "proto": "HTTP/1.1",
    "method": "GET",
    "host": "redacted",
    "uri": "/Default.aspx?pageId=1549&ArtId=1129",
    "headers": {
      "Accept-Language": [
        "en-US,en;q=0.9,fr;q=0.8,fr-FR;q=0.7"
      ],
      "Upgrade-Insecure-Requests": [
        "1"
      ],
      "X-Arr-Log-Id": [
        "e5330f39-9f22-4566-a046-279b64edeb53"
      ],
      "Accept-Encoding": [
        "gzip, deflate, br, zstd"
      ],
      "Sec-Fetch-Site": [
        "none"
      ],
      "X-Original-Url": [
        "/Default.aspx?pageId=1549&ArtId=1129"
      ],
      "Priority": [
        "u=0, i"
      ],
      "User-Agent": [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0"
      ],
      "X-Forwarded-For": [
        "10.20.3.8"
      ],
      "X-Forwarded-Host": [
        "redacted"
      ],
      "Sec-Fetch-Dest": [
        "document"
      ],
      "Max-Forwards": [
        "10"
      ],
      "Sec-Ch-Ua-Mobile": [
        "?0"
      ],
      "Cache-Control": [
        "max-age=0"
      ],
      "Sec-Fetch-Mode": [
        "navigate"
      ],
      "Sec-Fetch-User": [
        "?1"
      ],
      "Sec-Ch-Ua-Platform": [
        "\"Windows\""
      ],
      "X-Forwarded-Proto": [
        "http"
      ],
      "Cookie": [],
      "Sec-Ch-Ua": [
        "\"Chromium\";v=\"124\", \"Microsoft Edge\";v=\"124\", \"Not-A.Brand\";v=\"99\""
      ],
      "Accept": [
        "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7"
      ]
    }
  },
  "error": "short write"
}

3. Caddy version:

v2.7.6 h1:w0NymbG2m9PcvKWsrXO6EEkY9Ru4FJK8uQbYcev1p3A=

4. How I installed and ran Caddy:

caddy is ran as a service with systemctl

a. System environment:

ubuntu 22.04
go 1.22.2
using xcaddy with coraza

(coraza is in readonly mode)

b. Command:

sudo service caddy start

c. Service/unit/compose file:

d. My complete Caddy config:

upstream configuration is coming from a map directive.
Configuration is pretty long, shorten to reverse_proxy configuration for better readability.

reverse_proxy {
	to {upstream1}
	to {upstream2}
	lb_policy cookie
	
	@error status 403 404 500 502 503
	handle_response @error {
		# the below directive is important if we want to keep track of 
		# the respone headers
		
		copy_response_headers
		root * /reverseproxy/content/
		rewrite * /{rp.status_code}.html
		file_server
	}
}

5. Links to relevant resources:

Thanks a lot for any clue, advice,
Stephane

That means the response headers have already been written but the connection was severed and the full write couldn’t finish. Since the response headers were already written we couldn’t change the response to an error status, it was too late. So we just abort. Usually this happens because of network interruptions.

1 Like

Thanks for your response @matt . So if I understand well, the issue is the delay between response headers provided by the app server and the delivery of the response body.
Any possibility to configure Caddy to wait more for the response body to come ?

Still can’t figure out why network would be cutting off specifically when calling this page.
Joined a screenshot of tcpdump .185 is caddy, .94 is the app server .8 is the frontend reverse proxy.
I am not a network specialist, but the red RST does not seem friendly with the connection whilst the HTTP 200 has been provided two frames before.

I must be missing something …

Hmm, maybe, but I dunno about that. I think it has more to do with the connection getting severed before Caddy finishes writing the response.

So it does seem that there is a long delay by the backend before data transmission, Caddy keeps the connection alive in the meantime. This should be fine, looks normal.

It looks like the app server gives Caddy 200 OK after a bunch of data transfer from the backend to Caddy… why does the 200 status show up after all the data transfer?

1 Like

Hi,

As a follow-up to previous messages, page works fine when Coraza setting SecResponseBodyAccess is off.
So, it appears to be Coraza related.

thanks again for your help and prompt responses
stephane

1 Like