I’m pretty sure I ran into the same issue as “Caddy 2 stops answering requests after a few hours” (#3725). Increasing NOFILE’s soft limit (the hard limit was already high) fixed it for me too. Most sites were behind Cloudflare.
But we always recommend increasing NOFILE in production anyway, so yes definitely do that. (As is the case with other web servers too.) Our official service file increases it to the maximum.
I think it needs to be LimitNOFILE=1048576:1048576 so that the soft limit is raised as well, since my hard limit was already high. The issue I linked to hit the same problem.
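For anyone who wants to check which of the two limits is actually the low one, here is a minimal sketch using Python's standard `resource` module. It only reports the limits of the process it runs in, so it is an illustration of soft vs. hard rather than a way to inspect Caddy itself; the function name `nofile_limits` is made up for this example, and it assumes a Unix-like system.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) RLIMIT_NOFILE pair for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft NOFILE limit: {soft}")
    print(f"hard NOFILE limit: {hard}")
    # A high hard limit with a low soft limit is the situation described
    # above: the process is still capped by the soft value.
    if hard != resource.RLIM_INFINITY and soft < hard:
        print("soft limit is below the hard limit")
```

If the soft value comes back much lower than the hard one, that matches what I saw: raising only the hard limit would not have changed anything.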
Edit: I also set an idle timeout, but that didn’t seem to help. However, maybe 30s is still too long, or something else was wrong. More testing would be nice, but it’s risky to do on prod systems.