I’ve been struggling to track down enough detail on this one to report it properly, so really I’m here asking: “What can I provide to help track down this issue and get it fixed?”
The Problem
Some requests made via Caddy to a backend server stall and don’t complete. I’ve only seen it for static files so far.
Requests made directly to the backend server complete as expected.
Workaround
Restarting Caddy when the problem occurs resolves the issue - until it occurs again.
It currently recurs every week or so for a specific file (the PDF mentioned below). It should be noted, though, that after I restarted the server for a config change, the PDF could not be downloaded the very next day.
What I’m seeing
Currently I’m seeing requests for some files stall and never complete. The curl output looks like this (the IP address and location have been adjusted / masked):
λ curl --verbose https://www.example.com/wp-content/uploads/sites/99/2018/09/Website-Catalogue-18.7MB.pdf
* Trying 192.168.0.103...
* TCP_NODELAY set
* Connected to www.example.com (192.168.0.103) port 443 (#0)
* schannel: SSL/TLS connection with www.example.com port 443 (step 1/3)
* schannel: checking server certificate revocation
* schannel: sending initial handshake data: sending 189 bytes...
* schannel: sent initial handshake data: sent 189 bytes
* schannel: SSL/TLS connection with www.example.com port 443 (step 2/3)
* schannel: failed to receive handshake, need more data
* schannel: SSL/TLS connection with www.example.com port 443 (step 2/3)
* schannel: encrypted data got 3676
* schannel: encrypted data buffer: offset 3676 length 4096
* schannel: sending next handshake data: sending 93 bytes...
* schannel: SSL/TLS connection with www.example.com port 443 (step 2/3)
* schannel: encrypted data got 186
* schannel: encrypted data buffer: offset 186 length 4096
* schannel: SSL/TLS handshake complete
* schannel: SSL/TLS connection with www.example.com port 443 (step 3/3)
* schannel: stored credential handle in session cache
> GET /wp-content/uploads/sites/99/2018/09/Website-Catalogue-18.7MB.pdf HTTP/1.1
> Host: www.example.com
> User-Agent: curl/7.55.1
> Accept: */*
>
At this point the request doesn’t appear to complete, and I have to press Ctrl+C to cancel / break out.
In a web browser this is seen as the loading spinner continuing to go around.
In this particular case this problem has occurred repeatedly once every couple of weeks for this specific 18.7MB PDF file.
Previously, though, I’ve experienced cases where a WordPress page (backend or frontend) appears not to be loading, only to open the Chrome Developer Tools Network tab, hit refresh, and see that it is a JavaScript file that never finishes loading and so blocks the page from completing for the user.
It’s been occurring more consistently with this larger 18.7MB PDF file, whereas when previously spotted it might be a different .js file each time.
Obviously it may or may not be occurring for other files too, and I’m just not noticing.
Configuration - Source Server
The backend server responds to static file requests with future expiry times.
In the case of the PDF file, the headers of the response are as follows:
HTTP/1.1 200 OK
Date: Wed, 21 Nov 2018 03:10:24 GMT
Server: Apache
Last-Modified: Fri, 14 Sep 2018 15:46:05 GMT
Accept-Ranges: bytes
Content-Length: 19623339
Cache-Control: max-age=2592000
Expires: Fri, 21 Dec 2018 03:10:24 GMT
X-Content-Type-Options: nosniff
Connection: close
Content-Type: application/pdf
The backend server consistently replies with “Connection: close”, so there is no “keep-alive” to the source server.
And when Caddy works the response headers are as follows:
HTTP/1.1 200 OK
Accept-Ranges: bytes
Cache-Control: max-age=2592000
Content-Length: 19623339
Content-Type: application/pdf
Date: Wed, 21 Nov 2018 03:02:43 GMT
Expires: Fri, 21 Dec 2018 03:02:43 GMT
Last-Modified: Fri, 14 Sep 2018 15:46:05 GMT
Server: Caddy
Server: Apache
X-Cache-Status: hit
X-Content-Type-Options: nosniff
Configuration - Caddy
Config file is similar to the following:
(wordpress_default) {
    tls support@example.com
    cache {
        status_header X-Cache-Status
    }
}
example.com,
www.example.com {
    import wordpress_default
    proxy / 192.168.152.198 {
        transparent
    }
}
I don’t store the cache in a specific location, so I assume it goes to the temp directory and gets cleared on Caddy restart.
More notes
- I am currently seeing the following issue also (may be unrelated / related):
- I did note that “range” requests are not supported. Since this is happening more often / notably with a large file, perhaps range requests (or locking around them) are at fault, as they are more likely to happen with large files?
- I’ve set up pprof on one of the domains now, so maybe I can capture some detail when I spot it next.
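To test the range-request theory from the notes above, one option is to send a small `Range` request to both Caddy and the backend and compare how each answers. This is only a sketch under assumptions: the helper names `rangeSupport` and `checkRange` and the `bytes=0-1023` range are made up for illustration. A 206 with a Content-Range header means the range was honoured, while a plain 200 means it was ignored and the full body is being sent.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// rangeSupport interprets the reply to a "Range: bytes=0-1023" request.
func rangeSupport(status int, contentRange string) string {
	switch {
	case status == http.StatusPartialContent && contentRange != "":
		return "honoured"
	case status == http.StatusOK:
		return "ignored (full body sent)"
	case status == http.StatusRequestedRangeNotSatisfiable:
		return "rejected"
	default:
		return "unexpected"
	}
}

// checkRange requests the first 1 KiB of url and reports how the server replied.
func checkRange(url string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("Range", "bytes=0-1023")

	client := &http.Client{Timeout: 30 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	// Read at most 4 KiB so a server that ignores Range does not make us
	// download the whole file.
	_, _ = io.CopyN(io.Discard, resp.Body, 4096)
	return rangeSupport(resp.StatusCode, resp.Header.Get("Content-Range")), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: rangecheck URL [URL...]")
		return
	}
	for _, u := range os.Args[1:] {
		result, err := checkRange(u)
		if err != nil {
			fmt.Printf("%s: request failed: %v\n", u, err)
			continue
		}
		fmt.Printf("%s: range %s\n", u, result)
	}
}
```

If the backend honours the range but the proxied URL does not, or the proxied range request itself hangs, that would at least narrow things down to the cache / proxy layer.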
What next?
- What if anything can I do / provide when it occurs next?
- Is a stack dump from pprof helpful?
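In case a stack dump is useful, here is a rough sketch of how one could be saved at the moment a stall is spotted. It assumes pprof is reachable under `/debug/pprof` on the site (the path used by Go's standard `net/http/pprof` handlers) and fetches the goroutine profile with `debug=2`, which returns full stack traces as text. The helper names `dumpURL` and `saveDump` and the output file naming are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"path/filepath"
	"strings"
	"time"
)

// dumpURL builds the goroutine stack dump URL for a site with pprof enabled.
// Assumption: pprof is exposed under /debug/pprof on that site.
func dumpURL(base string) string {
	return strings.TrimRight(base, "/") + "/debug/pprof/goroutine?debug=2"
}

// saveDump fetches the goroutine dump and writes it to a timestamped file in dir.
func saveDump(base, dir string) (string, error) {
	client := &http.Client{Timeout: 30 * time.Second}
	resp, err := client.Get(dumpURL(base))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s", resp.Status)
	}

	stamp := time.Now().UTC().Format("20060102T150405Z")
	name := filepath.Join(dir, "goroutines-"+stamp+".txt")
	f, err := os.Create(name)
	if err != nil {
		return "", err
	}
	defer f.Close()

	if _, err := io.Copy(f, resp.Body); err != nil {
		return "", err
	}
	return name, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: dump BASE_URL   (e.g. https://www.example.com)")
		return
	}
	name, err := saveDump(os.Args[1], ".")
	if err != nil {
		fmt.Println("dump failed:", err)
		return
	}
	fmt.Println("saved", name)
}
```

The dump needs to be taken before restarting Caddy, since the restart is what clears the stuck state.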