How can I debug a Caddy plugin?

1. The problem I’m having:

I am new to Go and Caddy plugin development. I recently developed a WAF plugin based on the example in the docs; its principle is shown in the picture.

  1. A user's request reaches Caddy and is forwarded to the WAF engine by the WAF plugin.
  2. The WAF engine responds with the action for this request: pass or block.
  3. If passed, the user's request is forwarded to the upstreams. If blocked, the Caddy WAF plugin returns an error page to the user.

  (Figure: the principle of the WAF plugin)

I have deployed it to my staging environment, but I found it has some new problems compared to vanilla Caddy:

  1. Sometimes it blocks requests (not for security reasons) until I restart Caddy or deactivate the WAF plugin.
  2. Sometimes it consumes a huge amount of memory, even to the point of OOM.
  3. Sometimes it consumes more memory than usual.
  4. After I reload Caddy, I noticed there are twice as many established TCP connections to the WAF engine; it seems two instances of the WAF plugin are running?

How can I debug the memory problem? I have tried the method in the docs, but I can't find the cause.

Additionally, does Caddy execute some function (like conn.Close()) after a reload? How can I debug the reload process?

2. Error messages and/or full log output:

Nothing

3. Caddy version:

v2.8.4

4. How I installed and ran Caddy:

a. System environment:

Debian 12 amd64, run with systemd

b. Command:

xcaddy run

d. My complete Caddy config:

(waf) {
	route {
		waf_chaitin {
			waf_engine_addr 192.168.1.101:8000
			pool_size 10
		}
	}
}

https://*.buct.edu.cn {
	encode zstd gzip
	import waf
	reverse_proxy * {
		to http://192.168.1.200
	}

	@site1 host site1.buct.edu.cn
	handle @site1 {
		encode zstd gzip
		import waf
		reverse_proxy * http://192.168.1.201
	}

	@site2 host site2.buct.edu.cn
	handle @site2 {
		encode zstd gzip
		import waf
		reverse_proxy * http://192.168.1.202
	}
}

5. Links to relevant resources:


There are instructions in the xcaddy README.

Yes, while reloading there are two copies of all modules loaded for a short period of time. Read Architecture — Caddy Documentation to understand why.

Make sure all your goroutines etc. are properly terminated. Ideally you don't do any async work during individual requests; you should pre-compute anything you need either in Provision (don't take too long though, otherwise it delays config reloads) and/or in the background via a goroutine/timer spawned during Provision.

I mean I don’t know what your plugin is doing so the above is just a shot in the dark.


OK, referring to caddy/modules/caddyhttp/reverseproxy/reverseproxy.go at master · caddyserver/caddy · GitHub, I realized that there is a function Cleanup() that is called when Caddy reloads, rather than Stop(); it was not implemented in my previous code.


So it seems problem 4 has been solved. Thank you. :+1:
But problems 1-3 still exist. I tried to profile the heap dump from localhost:2019/debug/pprof/heap?debug=1, but I can't see any function names in it :smiling_face_with_tear:

Right, Start/Stop is for Caddy apps (a different kind of module). For other kinds of modules, like HTTP handlers, Cleanup() is the opposite of Provision().

:man_shrugging:

You might need a debugger to trace some of the code paths to see where it's stuck. If you use VS Code, you can create a file .vscode/launch.json with this content:

{
	"version": "0.2.0",
	"configurations": [
		{
			"name": "Attach to Process",
			"type": "go",
			"request": "attach",
			"mode": "local",
			"processId": <caddy process id>
		}
	]
}

Then you’ll be able to attach a debugger to a running process.

To understand how to analyze the profile you’ve collected, read this article:


OK, for problem 1: I found that under high load there is a potential deadlock in my plugin's connection pool, and I have introduced a new connection pool to fix it. Ideally, though, I would like to reuse Caddy's connection pool code instead of writing my own, but I've had a hard time finding Caddy's outgoing connection pool; I only found the incoming pool, called listeners. :smiling_face_with_tear:

For problems 2-3, I profiled the heap dump with the go tool directly instead of downloading it via the browser, and I can finally see the function names in it. :+1:

It seems that caddyhttp.(*responseRecorder).Write consumed too much memory? :thinking:

Here are the heap dump files:


What that tells me is that you’re likely not closing the body when you’re done with it, so the response recorder buffers don’t get reclaimed/garbage collected correctly. Something like that.

I mean, we don’t know what your plugin code is, so we can only make assumptions. We don’t have much to go on.


Thanks for the inspiration. Here is my plugin code:

  1. My Caddy WAF plugin: GitHub - W0n9/caddy_waf_t1k: A WAF plugin for Caddy Server using Chaitin SafeLine as backend engine.
  2. A library that connects Caddy to the WAF engine by forwarding the incoming content: GitHub - W0n9/t1k-go

The code in my plugin that calls the WAF detect function is here: caddy_waf_t1k/waf.go at c34e10f9659efac6bb325b89d5aa3c4bbb32adad · W0n9/caddy_waf_t1k · GitHub

// ServeHTTP is the main handler function for the CaddyWAF middleware. 
// It processes incoming HTTP requests and determines if they should be blocked 
// based on the WAF (Web Application Firewall) engine's detection logic. 
// If an error occurs during detection, it logs the error and passes the request 
// to the next handler in the chain. If the request is blocked, it redirects 
// the response based on the WAF result.
//
// Parameters:
// - w: The ResponseWriter to write the HTTP response.
// - r: The incoming HTTP request.
// - next: The next handler in the middleware chain.
//
// Returns:
// - An error if one occurs during the handling of the request.
func (m CaddyWAF) ServeHTTP(w http.ResponseWriter, r *http.Request, next caddyhttp.Handler) error {

	result, err := m.wafEngine.DetectHttpRequest(r)
	if err != nil {
		m.logger.Error("DetectHttpRequest error", zap.String("request", r.Host), zap.String("path", r.URL.Path), zap.String("method", r.Method), zap.Error(err))
		return next.ServeHTTP(w, r)
	}
	if result.Blocked() {
		return m.redirectIntercept(w, result)
	}
	return next.ServeHTTP(w, r)
}

And here is the code in the connection library that processes the Body:

  1. t1k-go/detection/request.go at 90e3f5c82bc631f6a0b942efe520ce539519bed1 · W0n9/t1k-go · GitHub
  2. t1k-go/detect.go at 90e3f5c82bc631f6a0b942efe520ce539519bed1 · W0n9/t1k-go · GitHub
// Because the content of the request body needs to be checked, 
// the request body needs to be cached.
func (r *HttpRequest) Body() (uint32, io.ReadCloser, error) {
	bodyBytes, err := io.ReadAll(r.req.Body)
	if err != nil {
		return 0, nil, err
	}
	r.req.Body = io.NopCloser(bytes.NewReader(bodyBytes))
	return uint32(len(bodyBytes)), io.NopCloser(bytes.NewReader(bodyBytes)), nil
}
// Forward the Body to the WAF engine
// using the custom TCP connection
{
	bodySize, bodyReadCloser, err := req.Body()
	if err == nil {
		defer bodyReadCloser.Close()
		sec := t1k.MakeReaderSection(t1k.TAG_BODY, bodySize, bodyReadCloser)
		err = t1k.WriteSection(sec, w)
		if err != nil {
			return err
		}
	}
}

Can you give me some advice on my code? Looking forward to your reply.