Development: Using Caddy to deter brute force attacks in WordPress

That one’s basically a header_regexp matcher that just looks for the Referer header (and yes, there’s a spelling mistake there and it’s been in HTTP for decades) to check that it starts with the given domain; if the header isn’t there then the request likely didn’t come from a real browser and can be ignored. The regexp itself will probably be the same in Caddy as it would be in nginx. Remember to use https://regex101.com with its Go mode for testing your regexp.

Also btw, consider using abort if you’re blocking bots, it’s more efficient because Caddy will just drop the connection immediately instead of sending back a response.


I really want to get my head around headers and be comfortable working with them, so, back to first principles. I refer back to a query I raised in the thread The mysterious header.

The Referer header

I go to the WP login page on the site xxx.udance.com.au/wp-login.php and see a whole bunch of stuff in the inspect screen.

I click on the first entry under the Name column and the inspect screen changes.

Under the Name column, I see a list of names, though I’m not sure what they are. On the title row, I see six headings, Headers ... Cookies. Within the body, I see several sections: General, Response Headers and Request Headers. Within each of these sections, I see what look like fields and values.

My first question is ‘Where do I find the Referer header field?’

header_regexp syntax

I refer to the documentation on header_regexp

The syntax is as follows:

header_regexp [<name>] <field> <regexp>

The documentation says that <name> is optional, but recommended. Why?
The <field> I figure I’ll get from somewhere on the inspect screen.

The <regexp> I think I use in this instance is http(|s)://xxx\.udance\.com\.au/(wp-comments-posts|wp-login)\.php$ after reverse engineering the NginX code sample.

Putting it all together

# I think I need the 'not' version of the regexp, but I'm not sure how to get that.
@noreferrer header_regexp <name> <field>  http(|s)://xxx\.udance\.com\.au/(wp-comments-posts|wp-login)\.php$
abort @noreferrer

There are plenty of blanks to be filled in. Help!

Those are all the individual HTTP requests the browser made to load that page. First the HTML page is loaded, and the HTML page has a bunch of <script src="javascript.js"> and <link rel="stylesheet" href="styles.css"> elements in the <head> which are references to JS and CSS. The browser sees those, pulls those down as well (they’re necessary to render the page with pretty styling and with reactive behaviour). Also you have some images (.svg, .jpg, etc) that are from <img> tags. Most of these are versioned with ?ver= queries, so that if the version is bumped on them they won’t be cached by the browser (if the URL isn’t exactly the same, the browser won’t load those files from its cache and will instead fetch them freshly from the server).

Headers is a list of all the request headers (which the browser sent to your server) and response headers (which your server sent back to the browser), and the top few lines are the basic properties of the HTTP request (method, url, status, remote address/destination of the request).

The Preview tab will just show you a rendering of the response content. It won’t always be useful, but it’s handy if for example some image is loaded but rendered off-screen and you want to see it, or to see some JSON data fetched by the browser in full, etc.

The Response tab is usually the “raw” body of the response, not rendered.

The Initiator tab is debugging information to see what exactly triggered the HTTP request to happen. It might be some click event if you clicked on something, or it could just be because it was a <script> tag, etc. Mainly useful for developers of the website, not so useful as a user.

The Timing tab will show you how long each part of the request lifecycle took. Useful for tracking down performance issues in loading pages or whatever.

The Cookies tab will show you all the cookies that this website has told your browser to hold onto and send back on every request.

Yep, headers are key-value pairs. A header may be specified more than once as well (you may have seen multiple Server: response headers for example).

Referer is a request header. (Whenever you don’t know what a specific header means, MDN is definitely the best resource to find out.)

It will only show up on pages where you transitioned from one page to another. So if you just open up a tab and go to a page, you won’t have that header sent. If you click on a link on the page, then you will have that header in the request for that link.

The name field is necessary if you want to extract a result from the regexp. That’s called a “capture group”. Capture groups are anything within ( ) parentheses (regex101 will point out what parts are capture groups and what was captured in the right sidebar). Caddy will write the capture group results to placeholders like {re.<name>.<capture group>}.

You need to have defined a name on your matcher to be able to use the placeholder. If you don’t care about grabbing the output, then you don’t need to set a name.

Capture groups can either be numeric, i.e. assigned numbers in the order that the groups of parentheses appear in the regexp string, or they can be named, which involves syntax like (?P<group_name>/the-match) (in this case the < > are necessary and part of the syntax, not a “placeholder”, and Go only allows letters, digits and underscores in group names). This would let you grab the value with a placeholder like {re.foo.group_name} (given the regexp name was set to foo). The Go regexp syntax documentation is worth a read (it includes Go code, hopefully you can follow along anyway).

The field in this case would literally be Referer since that’s the header you want to match on.

Yeah that’s probably pretty close. I’d suggest changing (|s) to simply s? because ? means “character appears 0 or 1 time” making the s optional.

So it might look like this:

@noreferrer header_regexp Referer https?://xxx\.udance\.com\.au/(wp-comments-posts|wp-login)\.php$
abort @noreferrer

I omitted the name because you aren’t using the result. But in this case the matching group 1 would have either wp-comments-posts or wp-login depending on which page the request came from. If you had set the name to referer, say, then you could use {re.referer.1} to get that page name.
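To make the placeholder mechanics a bit more concrete, here is a minimal sketch rather than anything taken from this thread. The matcher names, the regexp names referer and refnamed, the group name page, the X-Referring-Page response header and the example.com domain are all made up for illustration. The named variant uses the (?P<name>...) spelling, which Go's regexp package accepts.

```caddyfile
# Numbered group: the regexp is named "referer", so group 1 is {re.referer.1}
@fromlogin header_regexp referer Referer https?://example\.com/(wp-comments-posts|wp-login)\.php$
header @fromlogin X-Referring-Page {re.referer.1}

# Alternative with a named group: the value is read as {re.refnamed.page}
@fromlogin_named header_regexp refnamed Referer https?://example\.com/(?P<page>wp-comments-posts|wp-login)\.php$
header @fromlogin_named X-Referring-Page {re.refnamed.page}
```

Either variant should add a response header containing wp-login or wp-comments-posts whenever the Referer matches. Only one of the two would be needed in a real config.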

Thank you for the crash course on headers and related content. :ok_hand:

Don’t I need the not version of the regexp? I wasn’t sure how to achieve that. Atm, I believe the abort happens for valid referrals.

It seems to me that Referer matches the Request URL field on the inspect screen. Are they one and the same?

Thanks for the tip!

Thank you. That’s super useful!

Ah, that clears that up.

Yes, that’s much better!

Yeah, I guess so. Just use the not matcher in front of header_regexp :joy:

“inverting” a regexp is actually pretty difficult to do properly, cause it sometimes ends up involving negative lookaheads which can be pretty bad for performance and aren’t supported in all regexp engines (I’m not sure if Go supports them, I’d need to look into it)
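For what it's worth, Go's regexp package follows RE2 syntax, which leaves out lookaheads, so inverting inside the pattern is not really an option in Caddy anyway. A rough sketch of doing the inversion at the matcher level instead, where example.com is a stand-in domain and @noreferrer is just a label:

```caddyfile
@noreferrer {
	# True when the Referer header is absent or does not start with the site URL
	not header_regexp Referer ^https?://example\.com/
}
abort @noreferrer
```

Applied site-wide like this, the matcher would also drop direct visits that legitimately carry no Referer, so in practice it would likely need to be combined with a path matcher for the pages being protected.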

If you clicked on a link or whatever that brought you right back to the same page, then yeah it will be the same. But if your last request was a navigation away from elsewhere, then it’ll be different.


D’oh! :open_mouth:

I’m just about ready to pull it all together. Just one question before I do. Is there an easy way to test the last bit of code that denies access to no referrer requests?

Yeah, just leave out not first: if that blocks requests that should normally work, and adding not stops them being blocked, then you’re good to go. It’s boolean logic, so it can only be one or the other :sweat_smile:

Or try making requests with curl, cause that won’t add the Referer header.


Thank you @francislavoie I have learnt so much from making this WP support doc Caddy ready.


This is a draft proposal for Caddy equivalent constructs to be added to the end of the stated sections in the WP support doc Brute Force Attacks. I’ve attempted to align these to examples provided for other webservers, in particular NginX. Please review and provide any feedback before I issue a WP doc update request.

Protect Your Server #

For Caddy, you can use the error directive to protect your site. In the example below, wp-admin has been locked down.

    # Trigger a 401 error for wp-admin
    error /wp-admin* "Unauthorized" 401

    # Handle the error by serving an HTML page
    handle_errors {
        rewrite * /401.html
        file_server
    }

Password Protect wp-login.php #

For Caddy, you can password protect your wp-login.php file using the basicauth directive.

    basicauth /wp-login.php {
        # Add separate lines for each additional user
        user1 password-hash1
    }

Caddy configuration does not accept plaintext passwords; you MUST hash them before putting them into the configuration. The caddy hash-password command can help with this.

Limit Access to wp-login.php by IP #

For Caddy, use the remote_ip request matcher to limit access to wp-login.php by IP address.

    @blacklist {
        # All except the specified addresses
        not remote_ip forwarded 203.0.113.15 203.0.113.16 203.0.113.17
        # or for the entire network
        # not remote_ip forwarded 203.0.113.0/24
        path /wp-login.php
    }

    # Block access to wp-login.php for blacklisted addresses
    respond @blacklist "Forbidden" 403 {
        close
    }

Deny Access to No Referrer Requests #

For Caddy, use the header_regexp request matcher to deny access to no referrer requests.

    # Stop spam attack logins and comments
    @noreferrer not header_regexp Referer https?://example\.com/(wp-comments-posts|wp-login)\.php$
    abort @noreferrer

Using abort for blocking bots is more efficient because Caddy will just drop the connection immediately instead of sending back a response.

A couple of notes:

  1. While I’ve tried to match existing examples closely, I’ve deviated somewhat for the section Limit Access to wp-login.php by IP. Rather than use the same error/handle_errors technique used in the section Protect Your Server, which would have aligned me with existing examples for other webservers, I opted to use respond. My reason for doing this is that I wanted to show the use of error, respond and abort across different examples.
  2. The Protect Your Server example is valid for Caddy 2.4.4 and later. Earlier versions of Caddy will require an order directive in the global section. I decided to leave this out of the proposal.
{
    order error before respond
}

A WordPress doc update request has been submitted here Using Caddy to deter brute force attacks in WordPress · Issue #23 · WordPress/Documentation-Issue-Tracker · GitHub


Damn it! I forgot to test this. Now that I have, removing not makes no difference to the result.

If abort ran I would have seen a blank screen. I guess this would suggest that the regexp matcher isn’t correct.

    @noreferrer header_regexp Referer https?://xxx\.udance\.com\.au/(wp-comments-posts|wp-login)\.php$
    abort @noreferrer

Aargh! :scream:

EDIT: Looking at the examples in Deny Access to No Referrer Requests, I notice they all use http_referer rather than Referer and it seems they treat the path separately from the domain… Is there any significance in this?

Nginx treats the headers differently and requires a http_ prefix for headers. It’s because they throw all their variables in the same bucket. PHP does the same sort of thing. Caddy doesn’t need to do that because things are “namespaced” with placeholders etc.
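As an illustration of that namespacing, the request header nginx exposes as $http_referer is reachable in a Caddyfile through the {header.Referer} placeholder (shorthand for {http.request.header.Referer}). A small throwaway sketch that echoes back what Caddy received, which can help while debugging a matcher; the /debug-referer path is hypothetical and should be removed after testing:

```caddyfile
# Temporary debugging route: shows the Referer value as Caddy sees it
respond /debug-referer "Referer seen by Caddy: {header.Referer}"
```

If the request carries no Referer header, the placeholder should simply come out empty.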

Hard to say why it’s not working for you. I’m not too sure.

I’ll do a deeper dive tonight when I get home and see if I can spot anything. If I debug, will I be able to see the value of Referer in the process log?

Access logs in general should have the values (under the request headers). Or look at the requests in your browser for it.

I checked the access log and the Referer appears to be correct.

{"level":"info","ts":"2021-08-28T23:16:19.905+0800","logger":"http.log.access.log0","msg":"handled request","request":{"remote_addr":"10.1.1.4:55589","proto":"HTTP/1.1","method":"GET","host":"xxx.udance.com.au","uri":"/favicon.ico","headers":{"X-Forwarded-Proto":["https"],"User-Agent":["Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36"],"Pragma":["no-cache"],"Sec-Ch-Ua-Mobile":["?0"],"Sec-Gpc":["1"],"Sec-Fetch-Dest":["image"],"Accept-Language":["en-AU,en-GB;q=0.9,en-US;q=0.8,en;q=0.7"],"Cookie":["wordpress_test_cookie=WP%20Cookie%20check"],"Sec-Ch-Ua":["\"Chromium\";v=\"92\", \" Not A;Brand\";v=\"99\", \"Google Chrome\";v=\"92\""],"Sec-Fetch-Site":["same-origin"],"Accept":["image/avif,image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8"],"Accept-Encoding":["gzip, deflate, br"],"Cache-Control":["no-cache"],"Sec-Fetch-Mode":["no-cors"],"Referer":["https://xxx.udance.com.au/wp-login.php"],"X-Forwarded-For":["10.1.1.222"]}},"common_log":"10.1.1.4 - - [28/Aug/2021:23:16:19 +0800] \"GET /favicon.ico HTTP/1.1\" 302 0","duration":0.259694065,"size":0,"status":302,"resp_headers":{"Status":["302 Found"],"X-Powered-By":["PHP/7.4.21"],"Content-Type":["text/html; charset=UTF-8"],"Link":["<https://xxx.udance.com.au/wp-json/>; rel=\"https://api.w.org/\""],"X-Redirect-By":["WordPress"],"Location":["https://xxx.udance.com.au/wp-includes/images/w-logo-blue-white-bg.png"],"Server":["Caddy"]}}

What is interesting is when I do a caddy adapt --pretty, I see extra backslashes in the regexp pattern. I’m not sure if this is expected?

                                                {
                                                        "match": [
                                                                {
                                                                        "header_regexp": {
                                                                                "Referer": {
                                                                                        "pattern": "https?://xxx\\.udance\\.com\\.au/(wp-comments-posts|wp-login)\\.php$"
                                                                                }
                                                                        }
                                                                }
                                                        ],
                                                        "handle": [
                                                                {
                                                                        "abort": true,
                                                                        "handler": "static_response"
                                                                }
                                                        ]
                                                }

For comparison, this is what’s in the Caddyfile

    @noreferrer header_regexp Referer https?://xxx\.udance\.com\.au/(wp-comments-posts|wp-login)\.php$
    abort @noreferrer

That should be fine. That’s just JSON escaping the backslashes for itself, but when the JSON is deserialized into the Go structs, it loses the extra backslashes.

Okay. I believe there is an issue somewhere. I’ll try to explain as best I can.

I set up a minimal Caddyfile for the WP test site.

:80 {
    log {
        output file /var/log/caddy/access.log
    }

    root * /usr/local/www/wordpress
    php_fastcgi 127.0.0.1:9000
    file_server
}

I access the script wp-login.php in a browser window. This is what I see in the browser:

This is what I see in the access log:

{"level":"info","ts":1630290363.6400852,"logger":"http.log.access.log0","msg":"handled request","request":{"remote_addr":"10.1.1.4:17262","proto":"HTTP/1.1","method":"GET","host":"xxx.udance.com.au","uri":"/wp-login.php","headers":{"Upgrade-Insecure-Requests":["1"],"X-Forwarded-Proto":["https"],"Sec-Fetch-Site":["none"],"Sec-Ch-Ua":["\"Chromium\";v=\"92\", \" Not A;Brand\";v=\"99\", \"Microsoft Edge\";v=\"92\""],"Sec-Ch-Ua-Mobile":["?0"],"Sec-Fetch-User":["?1"],"X-Forwarded-For":["10.1.1.222"],"Accept-Encoding":["gzip, deflate, br"],"Accept-Language":["en-US,en;q=0.9"],"Cookie":["wordpress_test_cookie=WP%20Cookie%20check"],"Sec-Fetch-Dest":["document"],"Cache-Control":["max-age=0"],"Sec-Fetch-Mode":["navigate"],"User-Agent":["Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36 Edg/92.0.902.84"],"Accept":["text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"]}},"common_log":"10.1.1.4 - - [30/Aug/2021:10:26:03 +0800] \"GET /wp-login.php HTTP/1.1\" 200 6126","duration":0.264896908,"size":6126,"status":200,"resp_headers":{"X-Powered-By":["PHP/7.4.21"],"Expires":["Wed, 11 Jan 1984 05:00:00 GMT"],"Cache-Control":["no-cache, must-revalidate, max-age=0"],"Content-Type":["text/html; charset=UTF-8"],"Set-Cookie":["wordpress_test_cookie=WP%20Cookie%20check; path=/; secure"],"Server":["Caddy"],"X-Frame-Options":["SAMEORIGIN"]}}
{"level":"info","ts":1630290364.0914872,"logger":"http.log.access.log0","msg":"handled request","request":{"remote_addr":"10.1.1.4:17262","proto":"HTTP/1.1","method":"GET","host":"xxx.udance.com.au","uri":"/favicon.ico","headers":{"Accept-Encoding":["gzip, deflate, br"],"Sec-Fetch-Dest":["image"],"Accept-Language":["en-US,en;q=0.9"],"Sec-Ch-Ua":["\"Chromium\";v=\"92\", \" Not A;Brand\";v=\"99\", \"Microsoft Edge\";v=\"92\""],"Sec-Fetch-Mode":["no-cors"],"X-Forwarded-For":["10.1.1.222"],"User-Agent":["Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36 Edg/92.0.902.84"],"Accept":["image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8"],"Cookie":["wordpress_test_cookie=WP%20Cookie%20check"],"Referer":["https://xxx.udance.com.au/wp-login.php"],"Sec-Ch-Ua-Mobile":["?0"],"Sec-Fetch-Site":["same-origin"],"X-Forwarded-Proto":["https"]}},"common_log":"10.1.1.4 - - [30/Aug/2021:10:26:04 +0800] \"GET /favicon.ico HTTP/1.1\" 302 0","duration":0.258119519,"size":0,"status":302,"resp_headers":{"Content-Type":["text/html; charset=UTF-8"],"Link":["<https://xxx.udance.com.au/wp-json/>; rel=\"https://api.w.org/\""],"X-Redirect-By":["WordPress"],"Location":["https://xxx.udance.com.au/wp-includes/images/w-logo-blue-white-bg.png"],"Status":["302 Found"],"X-Powered-By":["PHP/7.4.21"],"Server":["Caddy"]}}

In the first line in the headers, I see fields and values like "Cookie":["wordpress_test_cookie=WP%20Cookie%20check"] and "Cache-Control":["max-age=0"] and a "status":200. In the second line, I see "Referer":["https://xxx.udance.com.au/wp-login.php"], but I also see "status":302.

I append some code for testing header_regexp in the Caddyfile.

:80 {
    log {
        output file /var/log/caddy/access.log
    }

    root * /usr/local/www/wordpress
    php_fastcgi 127.0.0.1:9000
    file_server

    @test header_regexp <field> <regexp>
    handle @test {
        respond @test "Match"
    }
    handle {
        respond "No match"
    }
}

Cookie

I set the matcher to look for the word cookie in the field Cookie.

    @test header_regexp Cookie cookie

After reloading the Caddyfile, I access the login script again and get a match.

This time, I set the matcher to look for the word biscuit in the field Cookie.

    @test header_regexp Cookie biscuit

I access the script again after reloading Caddyfile and I get no match. So far, it all works as expected.

Cache-Control

I repeat the exercise for the field Cache-Control looking for the word max-age and then maxage and get a match and no match as expected.

Referer

I repeat the exercise, but this time look for the word login in Referer

    @test header_regexp Referer login

This time I get a no match. I repeat for the words udance and http and even just the letter a and get the same result.

The main difference I’m seeing is that Referer doesn’t appear in the browser inspect screen and appears in the second line rather than the first line of the access log. I think this has something to do with why I’m not getting a match and why I’m not able to deny access to no referrer requests using the code below. The matcher is never activated.

    @noreferrer header_regexp Referer https?://xxx\.udance\.com\.au/(wp-comments-posts|wp-login)\.php$
    abort @noreferrer

That’s just a request to the favicon.ico, but WordPress redirected it (see the Location header, and a 302 status is a redirect) to /wp-includes/images/w-logo-blue-white-bg.png instead. Since the /favicon.ico always gets loaded as a result of an HTML page loading, those requests will always have the Referer header.


The point of these “brute force attacks” checks is to prevent POST requests on wp-login.php or wp-comments-posts.php, without having previously loaded the actual login/comments page with a GET request beforehand.

Also looking back at the Nginx example you gave in Using Caddy to deter brute force attacks in WordPress - #20 by basil (I’ll be honest I wasn’t 100% paying attention cause I don’t super care about WordPress :joy:) this would actually be more like this in Caddy:

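# Only look at requests for the two protected scripts
@protected path_regexp (wp-comments-posts|wp-login)\.php$
handle @protected {
	# Abort when the Referer is missing or does not start with this site's scheme+host
	@no-referer not header Referer {scheme}://{host}*
	abort @no-referer
}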

So, if the current request is to one of those two paths, then check that the Referer header exists and starts with the current scheme+host, using placeholders (the trailing * makes it a fast prefix match, so the remainder of the URL can be anything).

To actually test this, make sure that something like curl -v https://xxx.udance.com.au/wp-login.php fails (drops the connection, doesn’t return HTML), while the page still works in the browser.
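Given that the stated goal is to stop POST requests that were not preceded by a page load, one possible refinement is to narrow the outer matcher to POST. This is a sketch under assumptions, not something confirmed in this thread. In particular, it presumes Caddy itself terminates TLS, so that {scheme} agrees with the scheme in the Referer value.

```caddyfile
@protected {
	# Only POSTs to the login and comment scripts are checked
	method POST
	path_regexp (wp-comments-posts|wp-login)\.php$
}
handle @protected {
	# Assumption: Caddy terminates TLS, so {scheme} matches the Referer's scheme
	@no-referer not header Referer {scheme}://{host}*
	abort @no-referer
}
```

With this variant a plain curl GET would go through as usual. Only a POST without a matching Referer would be dropped.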

Hah! I was actually tinkering around with path_regexp and handle. I just couldn’t make the leap to two matchers and figure out the source for Referer. Can you explain this bit to me :pleading_face:? I’d like to understand it more.

{scheme}://{host}*