So the failing file was almost 3.6 GB in size (just noting)
(For ease of parsing I’m reposting the log.) I’m marking the timestamp with ** ** below, as it’s important to line up what the caddy and HFS logs were showing for the same window of time.
2021/11/10 **17:43:15.226** error http.log.error readfrom tcp 127.0.0.1:57165->127.0.0.1:81: client disconnected
{“request”:
{“remote_addr”: “139.xyz.xyz.45:56128”, “proto”: “HTTP/2.0”, “method”: “POST”, “host”: “web.thedomain.com”, “uri”: “/UL/”, “headers”:
{“Cookie”:
[“HFS_SID_=0.808680337155238”],
“Authorization”: [“Basic dWw6dGlnZXIhdXBsb2FkMDAuLg==”],
“Sec-Ch-Ua”: ["“Microsoft Edge”;v=“95”, “Chromium”;v=“95”, “;Not A Brand”;v=“99"”],
“Sec-Ch-Ua-Platform”: ["“Windows”"],
“Accept”: [“text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,/;q=0.8,application/signed-exchange;v=b3;q=0.9”],
“Sec-Fetch-User”: ["?1"], “Accept-Encoding”: [“gzip, deflate, br”], “Accept-Language”: [“en-US,en;q=0.9”], “Cache-Control”: [“max-age=0”],
“Sec-Ch-Ua-Mobile”: ["?0"],
“Content-Type”: [“multipart/form-data; boundary=----WebKitFormBoundaryn74EBtEG32Wq2typ”],
“User-Agent”: [“Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36 Edg/95.0.1020.44”],
“Sec-Fetch-Site”: [“same-origin”],
“Sec-Fetch-Mode”: [“navigate”],
“Content-Length”: [“3595133559”],
“Upgrade-Insecure-Requests”: [“1”],
“Origin”: [“https://web.thedomain.com”],
“Referer”: [“https://web.thedomain.com/UL/”],
“Sec-Fetch-Dest”: [“document”]},
“tls”: {“resumed”: false, “version”: 772, “cipher_suite”: 4865, “proto”: “h2”, “proto_mutual”: true, “server_name”: “web.thedomain.com”}},
“duration”: 22.2100242,
“status”: 502,
“err_id”: “qqcbcm7b4”,
“err_trace”: “reverseproxy.statusError (reverseproxy.go:886)
}
Is there really a 6 hour difference between the caddy server and the target server running HFS, or is one host logging in its local timezone while the other is configured for UTC? Or is this logging from a separate test? I’m trying to align the two log sets… The 11:42:53 timeframe from your HFS server log seems to end in a successful “Fully uploaded TheFileToUpload.exe - 3.3 G @ 12.4 MB/s”:
11/10/2021 11:42:53 AM ul 139.xyz.xyz.45 57165 Uploading TheFileToUpload.exe
11/10/2021 11:43:15 AM ul 139.xyz.xyz.45 57165 Disconnected - 952 bytes sent
11/10/2021 11:43:15 AM ul 139.xyz.xyz.45 57163 Uploading TheFileToUpload.exe
11/10/2021 11:43:25 AM ul 139.xyz.xyz.45 57156 Disconnected by server: inactivity - 4047 bytes sent
11/10/2021 11:43:25 AM ul 139.xyz.xyz.45 57158 Disconnected by server: inactivity - 937 bytes sent
11/10/2021 11:43:25 AM ul 139.xyz.xyz.45 57160 Disconnected by server: inactivity - 7843 bytes sent
11/10/2021 11:43:25 AM ul 139.xyz.xyz.45 57162 Disconnected by server: inactivity - 336 bytes sent
11/10/2021 11:43:26 AM ul 139.xyz.xyz.45 57157 Disconnected by server: inactivity - 9600 bytes sent
11/10/2021 11:43:26 AM ul 139.xyz.xyz.45 57161 Disconnected by server: inactivity - 4684 bytes sent
11/10/2021 11:43:26 AM ul 139.xyz.xyz.45 57159 Disconnected by server: inactivity - 953 bytes sent
11/10/2021 11:47:52 AM ul 139.xyz.xyz.45 57163 Fully uploaded TheFileToUpload.exe - 3.3 G @ 12.4 MB/s
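To sanity-check the alignment, here is a quick sketch that compares the two timestamps and checks whether the successful upload on port 57163 is consistent with the Content-Length in the caddy entry. It assumes the HFS log dates are month/day/year, that HFS reports sizes and rates in binary units (GiB, MiB/s), and that the upload on 57163 ran from 11:43:15 to 11:47:52; none of that is confirmed by either log, so treat it as a rough check only.

```python
from datetime import datetime

# Timestamps copied from the two logs above.
# Assumption: the HFS log uses month/day/year with a 12-hour clock.
CADDY_FMT = "%Y/%m/%d %H:%M:%S.%f"
HFS_FMT = "%m/%d/%Y %I:%M:%S %p"

caddy_error = datetime.strptime("2021/11/10 17:43:15.226", CADDY_FMT)
hfs_disconnect = datetime.strptime("11/10/2021 11:43:15 AM", HFS_FMT)

# Offset between the caddy error and the HFS disconnect on port 57165.
offset_seconds = (caddy_error - hfs_disconnect).total_seconds()
offset_hours = round(offset_seconds / 3600)
residual_seconds = offset_seconds - offset_hours * 3600

# Content-Length from the caddy request headers (bytes, includes the
# multipart framing, which is negligible at this size).
content_length = 3595133559

# Assumption: the upload on port 57163 started at 11:43:15 and finished
# at 11:47:52, as the HFS log lines suggest.
upload_start = datetime.strptime("11/10/2021 11:43:15 AM", HFS_FMT)
upload_end = datetime.strptime("11/10/2021 11:47:52 AM", HFS_FMT)
elapsed_seconds = (upload_end - upload_start).total_seconds()

size_gib = content_length / 2**30
rate_mib_per_s = content_length / 2**20 / elapsed_seconds

print(f"clock offset: {offset_hours} h + {residual_seconds:.3f} s")
print(f"size: {size_gib:.2f} GiB, rate: {rate_mib_per_s:.1f} MiB/s over {elapsed_seconds:.0f} s")
```

If those assumptions hold, the offset is a whole number of hours with well under a second left over, which looks more like a timezone difference than two separate tests, and the size and rate land close to the “3.3 G @ 12.4 MB/s” that HFS reported for the upload that completed.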
I’ll ask around whether there is a way to get deeper debugging out of caddy, because the details we have about when this fails and when it does not are not giving us a clear indication of the trigger or fault. The 502 status code seems to be pointing away from caddy:
The HyperText Transfer Protocol (HTTP) 502 Bad Gateway server error response code indicates that the server, while acting as a gateway or proxy, received an invalid response from the upstream server.
Other than a network failure in the transmission (connections being dropped or reset for some reason), I cannot account for why you would see things fail this way.
My next move, if I were trying to isolate what is going on here, would be to drop one level lower and capture the network traffic at the caddy server with tcpdump, to see whether there were actual resets or other network-layer faults in the TCP conversation between the “browser” client uploading the file and the caddy server. Is it a real user in a browser initiating the upload, or is this some form of application emulating a browser to start the transfer of the file to HFS?
Sorry I don’t have anything more useful to say about what could be causing this. I’ll try to bump this for attention (and call on the rest of the community to share whether they have had issues with file transfers through caddy in the past).