Errors Directive - Ambiguous Behavior

LewPayne · March 20, 2019, 11:46pm

I’m using Caddy v0.11.5 and I’m noticing that the errors directive exhibits ambiguous behavior.

The docs clearly state that the errors directive will record errors (defined as HTTP status codes of 400 or greater) to the designated handler:

errors allows you to [...] and enable error logging. Without this middleware, error responses (HTTP 400+) are not logged and the client receives a plaintext error message. Using an error log, *the text of each error will be recorded so you can determine what is going wrong without exposing those details to the clients.*

However, the following configuration does not record HTTP 400+ errors:

errors /www/weblogs/errors.log {
    rotate_age 30   # Days to keep
    rotate_keep 7   # 7 files max
    rotate_size 100 # Max size (MB)
    rotate_compress
}

Strangely enough (and hence my claim of ambiguous behavior), the following configuration will:

errors /www/weblogs/errors.log {
    # custom error handler (with invalid filename)
    *  filename_that_does_not_actually_exist.html
    #
    rotate_age 30   # Days to keep
    rotate_keep 7   # 7 files max
    rotate_size 100 # Max size (MB)
    rotate_compress
}

With the above misconfiguration, errors.log records three lines for each 404: one because the page the client requested is missing, one because the custom error page is missing, and one for the 404 itself.

Is there a way to simply have all 400+ HTTP status errors record in the errors file, without having to misconfigure the errors directive to do so? Yes, I can grep through the access log to find them, but I should be able to see them in the errors file as well, without all the associated noise. It doesn’t matter to me if they still continue to log in the access log, as long as I can log them separately as well. In some cases, I don’t have the access logs enabled - I only care about logging errors, so that they can be investigated.

Whitestrake · March 21, 2019, 12:26am

Unfortunately, the errors directive does two things:

1. Serves custom error pages for the client on 4xx and 5xx status
2. Logs panics / server errors (i.e. 5xx)

It does not log 4xx status results; these are considered normal operation of the server (as these are client errors) and are only printed to access logs.

Misconfiguring errors to output its own errors whenever it tries and fails to serve a 4xx error page is a pretty… interesting… workaround that might be the only way to get 4xx results in the error log.

matt · March 21, 2019, 2:38am

@whitestrake Yep, that’s mostly right, but technically any error value returned from an HTTP handler in the middleware chain (regardless of the HTTP status code) will be logged. So it’s up to the handlers, which are written in Go, to decide whether and which errors are reported.
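The point about handler return values can be sketched in Go. This is a simplified illustration, not Caddy's actual code: the `Handler` interface below is a local stand-in that copies the `(int, error)` return shape of Caddy v0.11 middleware, and `errorLogger`, `silent404`, `noisy404`, and `logged` are hypothetical names. It assumes the errors middleware writes a log line only when the returned error is non-nil.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// Handler is a local stand-in for the middleware handler shape:
// a status code plus an error value.
type Handler interface {
	ServeHTTP(w http.ResponseWriter, r *http.Request) (int, error)
}

// HandlerFunc adapts a plain function to Handler.
type HandlerFunc func(w http.ResponseWriter, r *http.Request) (int, error)

func (f HandlerFunc) ServeHTTP(w http.ResponseWriter, r *http.Request) (int, error) {
	return f(w, r)
}

// errorLogger is a hypothetical, simplified errors middleware.
// Assumption: it logs only when the next handler returns a non-nil error,
// whatever the status code is.
type errorLogger struct {
	next Handler
	log  *log.Logger
}

func (e errorLogger) ServeHTTP(w http.ResponseWriter, r *http.Request) (int, error) {
	status, err := e.next.ServeHTTP(w, r)
	if err != nil {
		e.log.Printf("[ERROR %d %s] %v", status, r.URL.Path, err)
	}
	return status, err
}

// silent404 reports a 404 status but no error value.
var silent404 = HandlerFunc(func(w http.ResponseWriter, r *http.Request) (int, error) {
	return http.StatusNotFound, nil
})

// noisy404 reports the same status together with an error value.
var noisy404 = HandlerFunc(func(w http.ResponseWriter, r *http.Request) (int, error) {
	return http.StatusNotFound, errors.New("asset not found")
})

// logged sends one request through errorLogger and returns the log output.
func logged(h Handler) string {
	var buf bytes.Buffer
	mw := errorLogger{next: h, log: log.New(&buf, "", 0)}
	req := httptest.NewRequest("GET", "/missing.png", nil)
	mw.ServeHTTP(httptest.NewRecorder(), req)
	return buf.String()
}

func main() {
	fmt.Printf("%q\n", logged(silent404)) // nothing was logged
	fmt.Printf("%q\n", logged(noisy404))  // one error line was logged
}
```

Both handlers answer with 404, yet only the one that also returns an error value leaves a trace in the log.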

Generally, logging 4xx error details isn’t particularly useful to the server because they are client errors. The status codes are still logged in the access logs (the log directive).

But I can see how this particular situation you describe is a bit tricky. Could you open an issue to request improvements to the errors directive?

LewPayne · March 21, 2019, 3:47pm

Thank you, Matt. Note there are two separate issues here:

1. The documentation does not clearly describe the current behavior (which your post greatly clarifies); and
2. The errors directive is inconsistent in its logging, depending on setup.

From a philosophical (or purist’s) standpoint: If an HTTP handler were to return a “404” for any reason (e.g., geo-country restriction), the request would then be logged via the errors directive. However, if the main caddy process (consider it the “default” handler) detects a 404 error, it would not be logged. Therein lies the inconsistency.

@MariusWiencke Errored requests will be logged to both the access log and the error log.

As for 4xx errors not being particularly useful because they’re client (user agent) errors, that’s not entirely true. The easiest way to check for anomalies on a customer’s site, over time, is by observing the error log. In many cases, 404 errors are the result of the customer mistyping an asset name (if allowed to edit portions of their html via a tool such as SurrealCMS), forgetting to upload a referenced asset (e.g., an image or video), uploading it with the wrong name, or similar.

A bit of humor… the above can all be considered client (as in customer) errors that result in server errors, conditions which can and should be mitigated via human interaction. But they are not entirely client (user-agent) errors, since the missing assets referenced by the html were intended to exist and be reachable on the server. In a way, they are server-side errors (i.e., a missing asset) that happen to be discovered by the client (agent).

Anyway, I know this is lengthy, and my point is to add clarity by example. On a personal note, I’m now a huge fan of software distributed via self-contained golang executables (caddy, gnatsd, etc). You’ve created a great product, which is a breeze to install, straightforward to configure, and integrated to the extent that it should be (e.g., acme, optional plugins, etc), without bloat. Thank you; I will be having my customers pay for the product.
