Logging to Splunk

I created a TA for Splunk to interpret Caddy access logs: Add-on for Caddy | Splunkbase.

I also included a quick and dirty dashboard:

Awesome! Thanks for sharing. I’ve tweeted it from the Caddy account.

Hey man, thanks for making this. I’m using it myself, but I already had a Caddy sourcetype with some janky field extractions of my own. The sourcetype name I use is caddy:caddy, so I tried changing eventtypes.conf to

[caddy_access]
search = (sourcetype=caddy:caddy)

but I can’t seem to get it to extract fields.

I even tried making a local folder within the app directory and changing everything that referenced the caddy_access sourcetype to caddy:caddy, but still couldn’t get the fields to extract. It might be because rsyslogd adds a timestamp and syslog tag to the logs before they’re sent. Not sure, but I would love some advice.

Hi Jake.

Is Caddy logging in json? This Splunk TA is designed for that scenario, rather than extracting key value pairs from CEF/similar.

Can you share a screenshot of what your raw Caddy logs look like in Splunk?

Greg

Here’s an example of the logs that are sent. They’re sent by rsyslog, which prepends the timestamp, hostname, and syslog tag to the log data. Let me know if you can think of any solutions to fix this. It seems that this prefix is stopping Splunk from extracting the JSON fields.

I would use a SEDCMD directive in props.conf for the caddy_access sourcetype to strip everything up to the first open curly brace, allowing Splunk to interpret the rest as JSON.

Before:

After:

props.conf

[caddy_access]
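# The class name after "SEDCMD-" is arbitrary; strip_syslog_prefix is just an example.
SEDCMD-strip_syslog_prefix = s/^[^\{]+//g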
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
MAX_TIMESTAMP_LOOKAHEAD = 14
NO_BINARY_CHECK = true
TIME_FORMAT = %s.%3N
TIME_PREFIX = "ts":
TZ = UTC
KV_MODE = json
TRUNCATE = 4096
EVENT_BREAKER_ENABLE = true
EVENT_BREAKER = ([\r\n]+)
...

That SEDCMD in props.conf must be present on the server that first parses your data, i.e. if you’re tailing the log file using a universal forwarder and there’s a heavy forwarder between that UF and your indexers, it’ll need to be present on the HF.
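To see what that sed expression does to a single event, here is a minimal Python sketch. The sample line is made up (the real rsyslog prefix depends on your template), and Python's re module stands in for Splunk's SEDCMD, so treat it as an illustration of the pattern rather than of Splunk itself.

```python
import json
import re

# Hypothetical line as rsyslog might forward it:
# timestamp, hostname and syslog tag first, then Caddy's JSON.
raw = 'Mar 30 19:38:34 web01 caddy: {"level":"info","ts":1585597114.768,"msg":"handled request"}'

# Same pattern as the SEDCMD: drop everything before the first "{".
stripped = re.sub(r'^[^\{]+', '', raw)

# What is left parses as plain JSON, which is what KV_MODE = json relies on.
event = json.loads(stripped)
print(stripped)
print(event["msg"])
```

If the line already starts with "{", the pattern matches nothing and the event is left unchanged.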

This worked! Thanks, and sorry, I’m more of an analyst with Splunk than an engineer, so I appreciate the pointer. Trying to learn more every day.

All good - glad it worked and happy to help.

I added some of the above content to the Troubleshooting tab on the app’s Splunkbase page.

This scenario - JSON wrapped in a syslog prefix - isn’t uncommon and isn’t exactly intuitive to deal with.

Something more to add: a change needs to be made to the default props.conf. As it’s written now, the fields uri_path, uri_query, and domain are not being extracted.

Before:

EXTRACT-uri_path = "uri": "(?<uri_path>[^\?"]+)(?<uri_query>\?[^"]+)*"
EXTRACT-uri_query = "uri": "[^"]+(?<uri_query>\?[^"]+)"
EXTRACT-domain = "host": "(?<domain>[^\:"]+)

After:

EXTRACT-uri_path = \"uri\":\"(?<uri_path>[^\?\"]+)(?<uri_query>\?[^\"]+)*\"
EXTRACT-uri_query = \"uri\":\"[^\"]+(?<uri_query>\?[^\"]+)\"
EXTRACT-domain = \"host\":\"(?<domain>[^\:\"]+)

Without escaping the quotes, the fields will not be extracted. Note that the updated patterns also drop the space after each colon.
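For anyone comparing the two versions, here is a rough Python sketch of the uri_path pattern run against a made-up compact JSON fragment. The sample event is hypothetical, and Python needs (?P<name>...) where Splunk accepts (?<name>...), so this only shows how the regex itself behaves, not how Splunk parses props.conf. In this sketch, the space after the colon is what decides whether the pattern matches.

```python
import re

# Hypothetical compact JSON fragment with no space after the colons.
sample = '{"request":{"host":"example.com:443","uri":"/search?q=caddy"}}'

# "Before" pattern: expects a space between the colon and the value.
before = re.compile(r'"uri": "(?P<uri_path>[^\?"]+)(?P<uri_query>\?[^"]+)*"')

# "After" pattern: no space after the colon.
after = re.compile(r'"uri":"(?P<uri_path>[^\?"]+)(?P<uri_query>\?[^"]+)*"')

before_match = before.search(sample)
after_match = after.search(sample)
fields = after_match.groupdict() if after_match else {}

print(before_match)
print(fields)
```

Checking the spacing in your own raw events is a quick way to tell which form of the pattern you need.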

Good spot - will update the TA on Splunkbase.

@matt - any chance of getting the protocol logged in the access logs? i.e. http or https.

Isn’t the protocol already known? After all that’s up to the server configuration. If it’s on port 80, it’s HTTP, port 443 is HTTPS, that kind of thing.

If it’s 80 or 443 we can make a fair assumption. If it has 443 somewhere in the port number (8443 etc.) we could probably also take a reasonable guess.

Beyond that it’s not really possible to know.

Just trying to log the full url for compliance with Splunk’s web data model, which expects proxy data to include the requested URL rather than just the host, uri and query parameters.

We log tls in the access logs which details the TLS handshake. That’ll be empty for HTTP. See How Logging Works — Caddy Documentation

Oh good point! I forgot about that.

Thanks all - I’ve updated the props.conf file to add https:// or http:// to the calculated url field based on whether request.tls.server_name is defined:

EVAL-url = if(isnotnull('request.tls.server_name'),"https","http") . "://" . 'request.host'.'request.uri'
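The same logic can be sketched outside Splunk. This is a minimal Python version, assuming the request object has the host, uri and optional tls.server_name fields used in the EVAL above; build_url is a hypothetical helper name, not part of the TA.

```python
from typing import Any, Mapping


def build_url(request: Mapping[str, Any]) -> str:
    """Rebuild the full URL from a Caddy access log "request" object."""
    # A missing or empty tls object is treated as plain HTTP.
    tls = request.get("tls") or {}
    scheme = "https" if tls.get("server_name") is not None else "http"
    return f"{scheme}://{request['host']}{request['uri']}"


# Hypothetical request objects, one with a TLS handshake and one without.
secure = {"host": "example.com", "uri": "/a?b=1", "tls": {"server_name": "example.com"}}
plain = {"host": "example.com", "uri": "/a?b=1"}

print(build_url(secure))
print(build_url(plain))
```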

Resulting values at search time:

