An HTTPS request for a domain that hasn't been loaded yet hits the server. Caddy then tries to load the certificate from the storage backend. If there isn't a cert, it asks on-demand TLS whether it may get one (ask http://myapi.com/query). It can't get one, so the request stops there.
So because my storage backend is S3, and Caddy checks it on every request for new/declined domains, it's possible to effectively make Caddy DDoS my S3 backend.
One solution would be to have Caddy check on_demand_tls.ask first, and only if that returns 200 OK try to load the certificate, for domains that haven't been loaded yet.
I'd rather have it check my ask endpoint before my storage backend.
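For context, here's a minimal Caddyfile sketch of the setup being discussed (the ask URL is from above; the catch-all site block and response are placeholders):

```
{
	on_demand_tls {
		ask http://myapi.com/query
	}
}

https:// {
	tls {
		on_demand
	}
	respond "OK"
}
```

With this config, every TLS handshake for an unknown hostname triggers the storage lookup described above before (or alongside) the ask check, which is the ordering at issue.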
4. Error messages and/or full log output:
This happens every time I hit Caddy with a domain that isn't allowed by my ask endpoint:
That's a separate issue, but several S3 services now have strong read-after-write consistency, so I believe it should work just fine, as long as your S3 service supports that.
Hey look, Francis actually hit the nail on the head here. S3 is not a compatible storage mechanism, period; at least, not for any deployments at scale, or with any significant volume.
S3 does not provide atomic operations (that I know of). Because of that, it is not a suitable storage backend for multiple clients, since you cannot guarantee exclusivity / concurrent safety like you can with databases or regular file systems.
So to clarify, our software (Caddy / CertMagic) supports what you want to do just fine. It's either a faulty implementation of the interface methods (Load/Store/List/Delete/etc.) or a faulty storage backend (one that lacks atomic operations, is expensive to access, is slow, etc.).
For that specifically, I would suggest filing an issue with the authors of the S3 plugin(s). I just checked the CertMagic wiki and there appear to be quite a lot now:
But the Caddy project does not endorse any of these.
There is a DynamoDB implementation that does support locking, but be aware that DynamoDB is very expensive and slow, and it looks like even this implementation has reports of a locking bug:
All I can say is that it's up to plugin authors to implement reliable storage modules (even if that means not implementing modules for certain backends that are unsuitable).
This is certainly a possibility. Could you open an issue to propose this? Here in CertMagic:
I see. Apparently updates to a single key in S3 are atomic, but:
Amazon S3 does not support object locking for concurrent writers. If two PUT requests are simultaneously made to the same key, the request with the latest timestamp wins. If this is an issue, you must build an object-locking mechanism into your application.
Your point was that cost was an issue (if I understand correctly, because of lots of storage reads), but if you run your own Consul or Redis, that would be negligible.
If I used a managed solution, cost could be an issue. But with a self-hosted or even a "small" managed one, performance could definitely be an issue as well, at least compared to my ask endpoint, which I'd have full control over.