Firstly, my apologies for deleting the Help template. It seems to assume that I’ve been writing code for Caddy, whereas what I’m looking for is to ask whether Caddy is suitable for my requirements.
I need to run a proxy in a cluster of Docker containers, with the following qualities:
- forward (not reverse)
- archiving all traffic
I’ve been prototyping with Docker + nginx and so far have come up with the following: docker-compose.yaml · GitHub
Currently I’m trying to work on the MITM behaviour, such that SSL certificates are generated on-the-fly (and cached), so that the client will not error out when performing SSL validation (the client will have the self-signed certificate installed and trusted locally)
After that, I need to work on writing raw request and response bytes to disk. WARC format is preferable, but any format is acceptable if I can write a conversion to WARC. I had previously implemented a prototype in node.js where I was able to duplicate the socket stream and write raw duplex stream buffers to a file.
Last but not least, the proxy will be used by multiple “workers”, and I need a way to record alongside the archived traffic, which worker generated the traffic.
Is Caddy suitable for this kind of solution?
Thanks in advance
I believe this plugin will do what you need?
It can simply shuttle TCP/UDP packets to upstreams you define, including as a “forward” proxy configuration. You can terminate TLS, and tee reads/writes too.
Thanks Matt, this looks promising! Here is the prototype I’m working on: GitHub - mattfysh/proxy-proto
So far I’ve got the proxy operating transparently to the worker across both HTTP and HTTPS.
The next thing I’m trying to figure out is how to tee the writes to disk. Ideally this would occur at layer 4, but I’m not sure if that would be compatible with keepalive connections (each new HTTP request needs to produce a write, even if the socket was reused).
Is there a prebuilt handler for writing to disk? I couldn’t spot one in the list
Also, is the L4 app capable of writing both request bytes and response bytes to disk? (i.e. not just the request)
Nope, not yet – I just made the bare minimum layer 4 module and made sure it’s very extensible. So it should be very straightforward to write a new module for it!
I think so! You can intervene with both Read and Write calls.
Thanks Matt! Is this true for handlers inside a tee branch? I’m new to Go and happy to write the file IO module, but I just can’t see how a handler in a tee branch gets access to the response bytes from the upstream.
I’ve read the L4Tee module and can see how it branches the connection, but I’m not sure how responses written back to the main “next” fork are available in the other fork created by the tee.
I think so; however, having multiple handlers write to a connection can break things real fast if you aren’t careful, I’d imagine. I haven’t tried the writing from branches thing, but I believe it could work?
If you want one fork to write to another fork, I don’t think that’s really how it works. A write from any fork would just write to the client. You could probably wrap the connection before teeing, though, so that you can intercept read and write calls.
I don’t think I’m understanding how the tee branch works… the port 3210 upstream that I’ve added into the tee branch seems to be capable of blocking both branches/forks
Here is the config I’m using
- http: 
- handler: tee
- handler: proxy
- handler: proxy
On port 3210 I’m simply piping the socket to stdout in node.js
Then my client is calling curl http://google.com, which works without the “tee” handler, but doesn’t receive any response when I add in the tee branch.
Any thoughts on what I’m doing wrong here? Thanks in advance
I haven’t really tested it like that, so it could be an oversight or a bug or something (I don’t know of anyone who uses tee yet, I just used it once to do multiple reads in parallel).
I’m getting married in 12 hours, would you be able to dive into the code to figure out what’s happening? I will likely be unavailable for a few days.
That’s amazing Matt! I hope you have an incredible wedding and huge congrats to the newlyweds!
Hey Matt, hope you had an incredible wedding day, I’m sure it was a day to remember!
Just wanted to let you know that I’ve found the issue: the TeeReader in tee.go was connected to the reader after some bytes had already been read, so it wasn’t writing those early bytes to the PipeWriter (i.e. the initial chunk of bytes used to test if the incoming connection is HTTP).
The solution is to use cx.Wrap while overriding the Read & Write methods. I’ve created this log module that logs request & response bytes to stdout for now and, in contrast to the tee module, it does log the bytes written to the buffer during recording: https://github.com/unwebio/caddy-l4/blob/master/modules/l4log/log.go
The next thing I need to work on is storing these bytes somewhere with timestamp and remote IP address (either a file system or S3), I’ll look into certmagic to see if I can reuse its pluggable storage solution.
After that I think the trickiest part will be support for http/tcp keepalive connections… and trying to understand how to write HTTP/2.0 traffic to WARC format. Will keep you posted; but for now I will assume you are off enjoying a well-deserved honeymoon somewhere nice & warm! Cheers
Ohh; I didn’t think of that, and in fact I don’t think that method even existed when I wrote the Tee module. So it’s quite likely that kind of bug exists there. Worth a PR, do you think?
The rest of your plugin work sounds like it’s coming along nicely! Let me know if you have any more questions.
Hahaha, not quite; it was 15 degrees F and icy in the mountains