Logging extension can change request headers?

Hi,

I tried to implement a log field filter, that filters the username name out of an Authorization header and logs it. The use case is, that a reverse proxied application does the basic authentication itself, but I still want to see username in the access log. Caddy sadly only fills the “user_id” field in the structured log, if it has done the authentication itself. I implemented the log filter as follows: (leaving out some of the not so interesting parts)

type BasicAuthFilter struct{}

// CaddyModule returns the Caddy module information.
func (BasicAuthFilter) CaddyModule() caddy.ModuleInfo {
	return caddy.ModuleInfo{
		ID:  "caddy.logging.encoders.filter.basic_auth_user",
		New: func() caddy.Module { return new(BasicAuthFilter) },
	}
}

func (f *BasicAuthFilter) Filter(in zapcore.Field) zapcore.Field {
	authHeader, ok := in.Interface.(caddyhttp.LoggableStringArray)
	if !ok {
		return in
	}
	fakeReq := &http.Request{Header: http.Header{"Authorization": authHeader}}
	userID, _, ok := fakeReq.BasicAuth()
	if ok {
		authHeader[0] = userID
	}
	return in
}

and in the Caddyfile

http://:80 {
	log {
		output file access.log {
			roll_size 20MiB
			roll_keep 5
		}
		format filter {
			wrap formatted "{request>host} {request>remote_addr} - {request>headers>Authorization>[0]} [{ts}] \"{request>method} {request>uri} {request>proto}\" {request>tls>version} {status} {size} \"{request>headers>User-Agent>[0]}\" \"{request>headers>X-Forwarded-For>[0]}\" {duration}" {
				time_format "02/Jan/2006:15:04:05 -0700"
			}
			# wrap json
			fields {
				request>headers>Authorization basic_auth_user
			}
		}
	}
	redir /packagecontrol /packagecontrol/ permanent
	handle /packagecontrol/* {
		reverse_proxy /packagecontrol/* 127.0.0.1:9002 {
			header_up Host {http.request.host}
			header_up X-Real-IP {http.request.remote}
			header_up X-Forwarded-For {http.request.remote}
			header_up X-Forwarded-Port {http.request.port}
			flush_interval 100ms
		}
	}
}

It actually logs the request fine in the access log

localhost [::1]:13079 - - [25/Sep/2021:18:39:25 +0000] "GET /packagecontrol/stats HTTP/1.1" - 401 92 "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.61 Safari/537.36" "-" 21.4393511
localhost [::1]:13079 - admin [25/Sep/2021:18:40:01 +0000] "GET /packagecontrol/stats HTTP/1.1" - 401 92 "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.61 Safari/537.36" "-" 28.7772383

BUT the logging extension actually changes the authorization header of the request and the reverse proxied application just gets the filtered out username instead of “Basic <Base64-string>”

If I use wrap json, the output is the following:

{
    "level": "error",
    "ts": 1632595260.4081564,
    "logger": "http.log.access.log0",
    "msg": "handled request",
    "request": {
        "remote_addr": "[::1]:4983",
        "proto": "HTTP/1.1",
        "method": "GET",
        "host": "localhost",
        "uri": "/packagecontrol/stats",
        "headers": {
            "User-Agent": [
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.61 Safari/537.36"
            ],
            "Accept-Encoding": [
                "gzip, deflate, br"
            ],
            "Connection": [
                "keep-alive"
            ],
            "Sec-Ch-Ua-Platform": [
                "\"Windows\""
            ],
            "Sec-Fetch-Site": [
                "none"
            ],
            "Authorization": [
                "admin"
            ],
            "Sec-Ch-Ua-Mobile": [
                "?0"
            ],
            "Sec-Fetch-User": [
                "?1"
            ],
            "Sec-Fetch-Dest": [
                "document"
            ],
            "Accept-Language": [
                "de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7"
            ],
            "Upgrade-Insecure-Requests": [
                "1"
            ],
            "Dnt": [
                "1"
            ],
            "Accept": [
                "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"
            ],
            "Sec-Fetch-Mode": [
                "navigate"
            ],
            "Cache-Control": [
                "max-age=0"
            ],
            "Sec-Ch-Ua": [
                "\"Chromium\";v=\"94\", \"Google Chrome\";v=\"94\", \";Not A Brand\";v=\"99\""
            ]
        }
    },
    "common_log": "::1 - - [25/Sep/2021:20:41:00 +0200] \"GET /packagecontrol/stats HTTP/1.1\" 401 92",
    "user_id": "",
    "duration": 28.3982782,
    "size": 92,
    "status": 401,
    "resp_headers": {
        "Content-Type": [
            "text/plain; charset=utf-8"
        ],
        "Www-Authenticate": [
            "Basic realm=\"Login Required\""
        ],
        "Date": [
            "Sat, 25 Sep 2021 18:41:00 GMT"
        ],
        "Content-Length": [
            "92"
        ],
        "Server": [
            "Caddy"
        ]
    }
}

I have a few questions:

  • Am I doing something wrong?
  • Why can the logging filter change the request? Logging happens after response is send to the client, right?
  • I know logging is zero-allocation, but doesn’t it seem dangerous if a logging module can interfere with actual request? (I guess the request header-values (which are []string) just get passed through to the logger)
  • Is there a way to add new fields to the structured log from third-party caddy module?
1 Like

The user_id log field is taken from the http.auth.user.id placeholder. See in code:

What you want probably is to set that placeholder value based on the response headers. For that, I recommend an HTTP handler module that reads the response headers on the way back up the middleware chain to parse it, then set the placeholder value. You can put your code after calling next.ServeHTTP() for it to be on the way back up the chain.

1 Like

Well it’s a request header, not a response header. A HTTP Handler seems overkill for an info that is already there in the structured log, just not in the right form.

Log filters don’t have access to the http.Request object, so it won’t be possible that way. You can’t add another field from a filter because you can only operate on the current field.

  • Why can the logging filter change the request? Logging happens after response is send to the client, right?
  • I know logging is zero-allocation, but doesn’t it seem dangerous if a logging module can interfere with actual request? (I guess the request header-values (which are []string) just get passed through to the logger)

Any answers to these questions?

I’m pretty sure it’s because that’s how slices work: pointers to the underlying array. You’re assigning to authHeader[0] so why wouldn’t it change the value? :thinking:

Is that what you’re referring to?

The technical reason for the modification of the request is clear.
But why is the request modified by the log filter before it is transmitted to the backend. I was under the impression that the log is written after the reverse-proxied response is sent to the client.

1 Like

I think it’s because the log message is prepared before handling happens, but additional fields may be added to the log before it gets written (like the response details). I guess Zap happens to apply the filters immediately when fields are added, rather than just before writing.

See here, accLog is created before request handling, then additional fields are added after the response was completed in the defer closure, at which point the log is actually written.

Your filter is probably running at the point when .With() is called.

1 Like

This can’t be intended behaviour, can it? Should there be something done about it, so that log filters can’t interfere with requests before they are completed?

Sure, it’s not intended, but also you shouldn’t be writing code that applies modifications to the request during filtering. That’s not something we intend to cover. Filtering should be pure, you’re kinda breaking the contract with your quoted code.

Admittedly, the docs are pretty scant on what is and isn’t expected: logging package - github.com/caddyserver/caddy/v2/modules/logging - pkg.go.dev

But I do think it’s not a good idea to modify the underlying request values in a log filter.

So that means if I wanted to log the username from the basic authentication header, I would have to write a HTTP handler module? There is no way to do this with a log filter by, I don’t know, changing something in the zapcore.Field?

1 Like

I think so, yes. Zap is unfortunately very rigid. It’s that way for optimization reasons though.

1 Like

Alright thanks for your patience and answers :grinning_face_with_smiling_eyes:

2 Likes

I would love to have a filter module that can manipulate logs as a whole, but unfortunately it’s set up to be per-field which means you can’t really do things like “exclude all headers except this one”.

See this issue for context:

1 Like