Hello francislavoie
Sadly, there is only one basicauth
block. I removed it and tested again, but the situation is the same.
Hello matt
I have done what you said and got a profile, but I could not upload it here (only images are allowed, it seems?). I extracted a GIF with go tool pprof profile_file
and it is below.
If there is any way I can upload the profile file, I will upload it ASAP.
To add a bit of information about the profile:
I took a 30-minute profile (as shown in the GIF), but the reload finished within 20 minutes (around 18 minutes). This time varies by situation; I have seen it take more than 30 minutes when many users are accessing the Caddy web server.
You could upload the profile to pastebin or a GitHub gist, maybe. Anywhere is fine.
I'm not gleaning much from that image, but it looks like the proxy is slow to write, seemingly due to TLS encryption. Are you writing very big responses frequently? Does your machine have hardware acceleration for AES?
Either way, I don't see how that relates to the reloads being slow.
I'm only now noticing that your systemd config is non-standard. We ship a recommended one with our Debian package, with additional details here if you want to set it up manually:
It might be good to switch to using caddy run to reduce the variables at play here (most users use caddy run).
I've changed the systemd service file to the standard one, and also manually ran Caddy with caddy run, but… why… still no progress…
Possibly this is related? I do not know what else to do.
Thanks for the profile!
So these are the top CPU cycles:
File: caddy
Type: cpu
Time: Apr 8, 2021 at 10:16pm (MDT)
Duration: 30mins, Total samples = 24.43s ( 1.36%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 14.58s, 59.68% of 24.43s total
Dropped 831 nodes (cum <= 0.12s)
Showing top 10 nodes out of 273
flat flat% sum% cum cum%
3.93s 16.09% 16.09% 4.49s 18.38% syscall.Syscall
3.14s 12.85% 28.94% 3.14s 12.85% crypto/aes.gcmAesEnc
2.01s 8.23% 37.17% 2.01s 8.23% golang.org/x/crypto/blowfish.encryptBlock
2.01s 8.23% 45.40% 2.01s 8.23% runtime.epollwait
1.57s 6.43% 51.82% 1.57s 6.43% runtime.futex
0.58s 2.37% 54.20% 0.58s 2.37% vendor/golang.org/x/crypto/curve25519.ladderstep
0.42s 1.72% 55.91% 0.42s 1.72% runtime.memmove
0.40s 1.64% 57.55% 0.40s 1.64% crypto/sha256.block
0.30s 1.23% 58.78% 4.27s 17.48% runtime.findrunnable
0.22s 0.9% 59.68% 1.30s 5.32% runtime.mallocgc
So over the 30-minute profile, it gathered 24s of CPU time, most of which was spent in system calls and AES/Blowfish encryption. I also want to know the answer to Francis's questions: what kind of hardware are you running this on (specifically)?
Another significant chunk of time is in runtime.findrunnable, which appears to have a related bug here: runtime: scheduler is slow when goroutines are frequently woken · Issue #18237 · golang/go · GitHub. Not sure if that's relevant to your situation.
It still doesn't account for the full 18-20 minutes. I'm not sure why yet.
Sorry for the late answer; there was a pressing personal matter to deal with.
Anyway, the hardware Caddy is running on is a bare-metal on-premise server, an HP ProLiant DL320e Gen8, with an Intel Xeon CPU E3-1240 V2 @ 3.40GHz (4 cores / 8 threads with Hyper-Threading) and 8GB RAM.
I hope this is not related, but if it is, then… maybe I should find another way to deal with it. I was okay with nginx; maybe I have to switch back…?
No, if you switch now, all the troubleshooting effort will go to waste and I'll get mad. Let's figure this out.
Can I access the instance currently exhibiting the behavior?
OK, I will never try switching back.
Well, there are some firewall issues, so you may need to use a VPN. Can I contact you privately?
Maybe we could jump on a video chat? Not today, probably tomorrow or something, but I can troubleshoot it with you, with you driving. (I won't take over for liability reasons.)
I'd love that! Thanks for your concern; I deeply appreciate it. What time works for you? Since our timezones are different, I'd like to prepare if I can.
Looks like it does have AES-NI support, FWIW
It seems AES-NI support is already enabled on the system.
user@localhost:~$ sudo sort -u /proc/crypto | grep module
module : aes_x86_64
module : aesni_intel
module : crc32_pclmul
module : crct10dif_pclmul
module : ghash_clmulni_intel
module : kernel
module : pcbc
user@localhost:~$ sudo cpuid | grep -i aes | sort | uniq
AES instruction = true
I wonder if there's something weird with the VPN connection or something else about the network. Can you reproduce the behavior in any other environment?
As for the network, I do not have to use a VPN myself; it is a public IP address range with a firewall enabled (because of the particular environment I am in, every device on this network is given a public IP address).
I cannot reproduce the same behavior in any other environment (tested in an AWS VM, but it is fast there).
After discussing on a call, we determined that this is probably because Caddy (actually, the Go standard library) is doing a graceful shutdown of the servers: http package - net/http - Go Packages
Shutdown gracefully shuts down the server without interrupting any active connections. Shutdown works by first closing all open listeners, then closing all idle connections, and then waiting indefinitely for connections to return to idle and then shut down. If the provided context expires before the shutdown is complete, Shutdown returns the context's error, otherwise it returns any error returned from closing the Server's underlying Listener(s).
By default, no grace period is set, so there is no timeout. If a client holds onto connections for 20+ minutes (for example, downloading a big file very slowly), those connections are retained for that long. This is an ideal user experience, but maybe not so ideal for servers.
The grace period is configurable in JSON: JSON Config Structure - Caddy Documentation. I bet setting it (say, 5 to 10 seconds, maybe 30 if you really want to) will make reloads finish in roughly that amount of time.
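Concretely, the setting lives at apps.http.grace_period in the JSON config. A minimal sketch (server names and listen addresses here are placeholders; adjust to your config):

```json
{
  "apps": {
    "http": {
      "grace_period": "10s",
      "servers": {
        "srv0": {
          "listen": [":443"]
        }
      }
    }
  }
}
```

With this set, a reload forcibly closes any connections still active 10 seconds after the shutdown begins, instead of waiting indefinitely.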
It is not currently configurable in the Caddyfile; that would be a nice global option to have.
And here's a quick PR