Basicauth and file_server

1. Caddy version (caddy version):

v2.4.6 h1:HGkGICFGvyrodcqOOclHKfvJC0qTU7vny/7FhYp9hNw=

2. How I run Caddy:

/usr/bin/caddy run --environ --config /etc/caddy/Caddyfile

a. System environment:

systemd, not inside docker, ubuntu 18.04

b. Command:

caddy reload --config /etc/caddy/Caddyfile

c. Service/unit/compose file:

no

d. My complete Caddyfile or JSON config:

adomain.tld {
    root * /opt/www/html
    header / Strict-Transport-Security "max-age=15768000;"
    templates
    encode zstd gzip
    file_server
    log {
        output file /var/log/caddy/weblogs.log
    }   
}
sub.adomain.tld {
    route /restricted/* {
        # import file_with_several_basicauth_entries like so:
        basicauth /restricted/item1.pdf {
            a JDJhJDE0.......
        }
        basicauth /restricted/item2.pdf {
            a JDJhJDE0.......
        }
        basicauth /restricted/item3.pdf {
            a JDJhJDE0.......
        }
        # this above is everything inside  file_with_several_basicauth_entries there is nothing else there
        # reason it is separated, is because i could easily append / or delete from that file
 
        file_server /restricted/* {
            root /opt/www/sub/restricted/
            hide /opt
        }
    }
    file_server  {
        root /opt/www/sub
        hide /opt
    }
    # handle /restricted/* { respond "please provide correct path" }
}

3. The problem I’m having:

First of all, a correction to the above: this is not literally my Caddyfile. My actual Caddyfile has just two import lines, one importing one_file (which contains only the top-level domain site) and one importing other_file. I have merged the two files and pasted them here in the same order. I hope this doesn’t change anything for you guys.

My intended usage:

  • whenever a user navigates to sub.adomain.tld/restricted, they should be redirected to “please provide correct path”
  • whenever a user navigates to sub.adomain.tld/restricted/no_such_file.pdf, they should be redirected to “please provide correct path”
  • whenever a user navigates to sub.adomain.tld/restricted/correct_file.pdf, they should first be asked to do basic auth, then the file should be served
  • whenever a user navigates to any path on sub.adomain.tld other than /restricted, I want file_server to serve the normal www2
  • whenever a user navigates to adomain.tld, I want Caddy to serve the normal www1

Optionally, i would also want to be able to have another reverse_proxy for one sub.adomain.tld/something path. But perhaps this is a separate issue and i don’t want to create confusion here.

4. Error messages and/or full log output:

There aren’t any errors. Caddy loads, but I don’t have any “handler” for when a wrong URI is typed.
If I remove the second file_server from the route and replace it with handle /restricted/* { respond }, then I get the notification, but when I authenticate properly the files are not served. However, when a wrong /restricted/badfile is given, I just get a blank screen, as no 404 handler is set up.

5. What I already tried:

I have also tried to have

route /restricted/* {
    import  file_with_several_basicauth_entries
    respond "please provide proper /restricted/ path" # or same with handle
}

But in this case, when I provided proper credentials, instead of the PDF files I would get 0 bytes, or (on /restricted/bad.pdf) I would get a PDF with the contents “please provide …”.

Now I have some (wrong) ideas why that is so. Perhaps when I write them down here, this could be used to improve the documentation for others?

  • I have read in another thread [1] that basicauth is “propagated” down. So whenever it “worked”, it applies to everything that comes further down in the Caddyfile.
  • I probably should set up handle_errors, however I can’t have that for a custom matcher.
  • I think this is because when the route ends, it needs to have a file server. But in that case perhaps this should be a route inside a route? I have also tried adding handle_errors inside the route:
route /restricted/* {
    import files_with_basic_auths
    file_server {
    }
    handle_errors {
        @4xx expression `{http.error.status_code} >= 400 && {http.error.status_code} < 500`
        respond @4xx "It's a 4xx error!"
        respond "It's not a 4xx error."
    }
}

However in this case Caddy fails to reload.

6. Links to relevant resources:

[1] Caddy basicauth using query - #2 by Whitestrake

Okay, there’s a lot going on here, and I’m frankly not really sure where to start. So I’m just gonna start quoting random parts of your post as I go and comment on what I see. Sorry if it’s a bit scattered/unorganized.

The thread you linked [1] is from 2018, for Caddy v1. Anything you read relating to Caddy v1 should essentially be ignored, because Caddy v2 was a complete rewrite.

When you use basicauth, you need to make sure 401 Unauthorized responses get written as-is back to the client, so they find out that the authentication failed.

It’s a tricky interaction between basicauth and handle_errors, since if you handle the errors and write a different response, it’ll break the behaviour of basicauth. You need to make sure you at least keep the same HTTP status code.

In your example you just used respond without specifying a status code, so it defaults to 200.

Also, handle_errors is not an HTTP handler directive, because it defines a separate set of routes. It can only be defined once per site, and it cannot be nested inside of route or handle blocks.

A few comments on the file_server block inside your route.

The request matcher on file_server is redundant here, because you’ve already matched for the same path in route. So you can remove that.

Also, keep in mind that file_server takes the defined root and appends the current request path to it, so you’ll likely have it look for files on disk at /opt/www/sub/restricted/restricted/example.txt. Note that /restricted is doubled up there. To handle this, you’ll need to strip the path prefix before passing it onto file_server. You can do this either by using handle_path instead of route, or using uri strip_prefix.
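
As a rough sketch of those two options (paths are taken from your config; the surrounding site block and the basicauth lines are left out), either of these should make file_server look in /opt/www/sub/restricted without the doubled prefix:

```caddyfile
# Option 1: handle_path strips the matched /restricted prefix itself
handle_path /restricted/* {
	root * /opt/www/sub/restricted
	file_server
}

# Option 2: keep route, and strip the prefix explicitly before file_server
route /restricted/* {
	uri strip_prefix /restricted
	root * /opt/www/sub/restricted
	file_server
}
```

Use one or the other, not both. Leaving the prefix in place and pointing the root at /opt/www/sub instead would resolve to the same files on disk.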

Finally, I’m not sure that hide /opt will do anything for you, because /opt is not inside of /opt/www/sub/restricted :thinking:

The commented-out handle /restricted/* { respond "please provide correct path" } line isn’t valid syntax: you must have directives on a new line, and the } on its own line as well. Whitespace and newlines are significant in the Caddyfile.

FYI your header line has a / matcher, which will only make it match requests to exactly / and no other paths. You probably want to remove the / to match all requests.
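
To illustrate the difference, here are the two forms side by side, using the header value from your config:

```caddyfile
# With the / matcher: only requests to exactly "/" get the header
header / Strict-Transport-Security "max-age=15768000;"

# Without a matcher: every request gets the header
header Strict-Transport-Security "max-age=15768000;"
```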

Responding with “please provide correct path” wouldn’t be a redirection, but rather just “writing a response body”. A redirect in the context of HTTP/web is specifically writing a response with the Location HTTP header which tells the client “sorry, try again at this URL instead”.

To respond to missing files, you can either use a file matcher to catch them before the request reaches file_server and handle it however you need there, or you can let it fall through to file_server, which would trigger a 404 error, and then handle that error via handle_errors.
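
A minimal sketch of the file matcher option, assuming the restricted files live in /opt/www/sub/restricted and that a plain-text 404 is acceptable. The @missing name is just an illustrative label, and the basicauth lines are left out:

```caddyfile
handle_path /restricted/* {
	root * /opt/www/sub/restricted

	# matches when the (prefix-stripped) request path does not exist under root
	@missing {
		not file
	}
	respond @missing "please provide correct path" 404

	# basicauth lines would go here

	file_server
}
```

The file matcher looks at the root set by the root directive, which is why the root is set with root here rather than inside file_server.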

I would probably recommend using a structure like this to your config:

example.com {
	handle_path /restricted/* {
		basicauth ...
		file_server ...
	}

	# everything else
	handle {
		root * /opt/www/sub
		file_server
	}
}

As for the extra reverse_proxy path: sure, just add another handle block with the path you need.

As for the 0 bytes you saw: typically, writing a 200 status with 0 bytes happens because a request went unhandled (i.e. no HTTP handler directives ran, or did anything). Caddy’s default is to do nothing if it wasn’t configured to do anything specific.

Thanks a lot for a very detailed and helpful answer!

Just to clarify some things :slight_smile:

Me: handle /restricted/* { respond "..." }
This isn’t valid syntax, you must have directives on a new line, and the } on its own line as well. Whitespace and newlines are significant in the Caddyfile.

That was added by me at the last minute. It would of course be on separate lines; I just wanted to show my intent, the idea of what I was trying to do :slight_smile: I just wanted to present all the ideas I had tried in a compact way.

Me: whenever user navigates to sub.adomain.tld/restricted it should be redirected to “please provide correct path”
That wouldn’t be a redirection, but rather just “writing a response body”.

You are right. That was actually the idea initially: to reply with 200 and a body saying “there is no such file”. But after your reply I realized that it is not reasonable to reply 200 here. I mean, I didn’t care, but now I realize that I probably should :slight_smile:

I would probably recommend using a structure like this to your config

That is a very nice suggestion, I will tweak it a little bit (restricted is actually on subdomain), test it and post it. I might even risk to add another handle_path like you suggested :slight_smile:

So anyway, this was super helpful for me. Let me quickly summarize what I have learned:

  • double-check whether the threads are for the old Caddy or not
  • don’t cut corners when writing a thread: a more verbose post is better than a compact one that could bring more confusion :wink:
  • one has to be careful about basicauth
  • handle_errors is not an HTTP handler directive
  • hide actually works relative to the root :slight_smile: which is nice. I didn’t fully understand before how I can make Caddy more secure, or rather prevent it from exposing my mistakes in configuring it :smiley:
  • recap of what I thought I remembered but didn’t: when you have a matcher, it matches exactly, and if you don’t have a matcher, it is sort of like matching everything. I guess it takes some time to get used to; people are sometimes used to using a * glob.
  • recap that order in the Caddyfile is important :slight_smile:

I wonder:

  • would it help to retroactively add a tag such as [v1] to the titles of old threads? There weren’t many threads about basicauth tbh, and I didn’t notice the version.
  • perhaps it might be helpful for people less experienced with servers, like me, to have a table in the documentation: these are directives, these are not; after these, nothing else will get sent back to the client, etc.

Just thinking out loud, it doesn’t have to be necessarily something people would benefit from :slight_smile:

So at the moment i have

example.com {
    root * /opt/www/html
    header Strict-Transport-Security "max-age=15768000;"
    templates
    encode zstd gzip
    file_server
    log {
        output file /var/log/caddy/weblogs.log
    }   
}

sub.example.com {
    handle_path /restricted/* {
        import file_with_several_basic_auth_commands_only
        file_server {
            root /opt/www/sub/restricted/
            # initially after change i had just   /opt/www/sub/ but then i got 404s
        }
    }

    handle_path /specific {
        respond "here will be an app"
        # reverse_proxy localhost:8888
    }

    # everything else
    handle {
        root * /opt/www/sub
        file_server {
            root /opt/www/sub/
        }
    }

    handle_errors { # copy paste from example:
        @4xx expression `{http_status_code} >= 400 && .......`
        respond @4xx "its an 4xx error!"
        respond "its not a 4xx error"
    }
}

It seems now that files are served from sub.example.com/restricted/* without any basic auth.
Just for the sake of completeness, the file file_with_several_basic_auth_commands_only contains exactly this:

# some comments
# and

basicauth /restricted/item2.pdf {
        user2 JDJhJDE0JDRO                                          
}

basicauth /restricted/item1.pdf {
        user1 ssssssssssss
}

# ... and so on

Yeah, path matchers are exact, but you can append a * to match everything under that path. Keep in mind /foo* will also match /foobar, so you may want to do /foo/* and maybe add a redir /foo /foo/ to handle the base case.
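
As a small illustration of that pattern (the /foo path and the response text are placeholders, not something from your config):

```caddyfile
# /foo* would also match /foobar, so match only the subtree
# and send the bare /foo to /foo/
redir /foo /foo/
handle /foo/* {
	respond "inside /foo/"
}
```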

And yes, omitting the matcher is the same as having specified *. This is explained in the request matching docs.

Caddy sorts directives according to a predetermined order (see the directive order docs). Using route overrides the sorting that is done by the Caddyfile adapter, and executes them in the order they appear within the route.

Tagging old threads: yeah, maybe… But we’d want to avoid updating the thread timestamps when we do that, otherwise it would bump every topic we touch up to the most recent. Maybe there’s a Discourse forum plugin/script we can run to do this. Hmm. I’ll look into it. Thanks for the suggestion!

They are all directives, but some are HTTP handlers while others are not (e.g. tls, handle_errors, bind, import, log). The main differentiating factor is that those don’t support request matchers because they apply site-wide. But you’re right, we could augment the table on the Directives page to clarify that.

Your basicauth lines still have /restricted/ in their matchers. The handle_path directive strips /restricted from the front of the path before passing it onto handlers within, so you’ll need to remove /restricted from those matchers as well.
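
For example, assuming the imported file keeps its current users (the hashes below are the truncated placeholders from your post, not usable values), the entries inside handle_path /restricted/* would become:

```caddyfile
# inside handle_path /restricted/*, the path no longer starts with /restricted
basicauth /item2.pdf {
	user2 JDJhJDE0JDRO
}

basicauth /item1.pdf {
	user1 ssssssssssss
}
```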

Also, keep in mind that handle_errors as you have it will still break basicauth because it will prevent auth errors (401) from being written back to the client.

Hmm, right, so how could I improve this some more, so as not to break basicauth but still reply with some 4xx code when someone tries to access files which are not there? Or perhaps it is not good practice to do that, as someone could try to brute-force guess filenames, whose basicauth could then be brute-forced too?

Sidenote: indeed, handle_path strips the /restricted prefix, but if so, how come the files actually load at all? Ah, I guess they do because of file_server.
I need to read up also on what exactly happens when we have

handle {
    root * /path
    file_server {
        root /someotherpath
    }
}

because it is not yet clear to me. But I’ll start with reading the handle docs again, until it clicks :wink:

Maybe something like this:

handle_errors {
	@401 expression `{http.error.status_code} == 401`
	handle @401 {
		respond 401
	}

	@404 expression `{http.error.status_code} == 404`
	handle @404 {
		# or you could rewrite to a 404.html and use 
		# file_server, if you want it to look pretty
		respond "That file could not be found" 404
	}

	handle {
		# fallback for anything else
		respond "Some other unknown error" {http.error.status_code}
	}
}

In general, it’s better to check auth before checking if a file exists, because yeah, otherwise you’re revealing information about files on disk. But since you’re setting different auth for different files, it’s not really possible to do that in general in your case.

That’s right, your file_server didn’t have its own matcher, so it would still work.

If you define root inside of file_server, then that will take precedence over the root directive. Generally we recommend using the root directive because it makes it possible to use other directives/matchers that also look at the set root, like php_fastcgi, the file matcher, try_files, etc.
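
Roughly, the two ways of setting the root look like this (the path is taken from your config; pick one form, not both):

```caddyfile
# Preferred: the root directive, which other directives and matchers can see
root * /opt/www/sub
file_server

# Alternative: a root that only this file_server knows about
file_server {
	root /opt/www/sub
}
```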

That’s mentioned in the docs for root on the file_server page.

Ah, so the second root is, say, like an attribute specific to file_server only, while the root directive can be used by several directives. In that case I actually don’t need two “root” keywords: I could have one root at the top of the sub-site definition and remove the root inside the file_servers… but then I would need to re-add /restricted/ before each file in the imported file… Edit: or maybe I wouldn’t need to, because handle_path has already stripped it :slight_smile: Seems quite reasonable.

I see now how expressions work, and I guess one could also leave the general @4xx catch like it was, but prepend it with a “specific” catch for 401. I guess that would work, but I’ll double-check. I do want it to look pretty, with a 404.html for the web page itself, but for this specific dir I want it to be just: either you download after some basic auth, or you get the info about the failure, nothing fancy.

One last thing I need to check is root * /opt/www/ (with and without trailing slash) vs root /opt/www/ (with or without slash). Would the * matcher make sense, or is it redundant?
And there is my answer in the logs: it has to have *. I wonder if I had it without * inside file_server and it worked :thinking:

Oh, interesting (perhaps, for some) fact:

basicauth item.pdf { 
    user   passwordhash
}

actually fails, so the path has to have a leading /, otherwise Caddy thinks it is a hashing algorithm and not a path.

I think this support i’ve got here is really exceptional, I am immensely grateful.
I’ll select one as solution, after i do one last check

The * matcher for root is necessary, if the first argument passed to root happens to start with a /. Because otherwise, the Caddyfile parser would take it as being a path matcher (see request matcher syntax docs).

The root directive happens to be a case where the first argument is almost always an absolute path, so that implicit first argument behaviour becomes troublesome there.

The trailing slash (on the right side) is not necessary. /opt/www and /opt/www/ would be the same.

Yeah, root inside file_server is a subdirective to the file_server directive, and subdirectives don’t use request matchers. So you don’t need (and can’t use) the * there.

That’s right, it’s not seen as an (inline) path matcher if it doesn’t have a leading /.

You can still use a named matcher with the path matcher if you want to write a rule that doesn’t start with /, like *.pdf for example.
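
Something along these lines, as a sketch: the @pdf name and the user are made up for illustration, and the hash is a truncated placeholder that would need to be replaced with real caddy hash-password output:

```caddyfile
# named matcher wrapping a path pattern that does not start with /
@pdf path *.pdf
basicauth @pdf {
	# hypothetical user, placeholder hash
	user JDJhJDE0.......
}
```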

Soo, the “global” root can optionally have * or not, and the subdirective inside file_server actually cannot have it, or it will fail :slight_smile: Slightly confusing in the beginning :wink: but luckily Caddy states quite clearly what’s wrong too, if you try to reload it like that.

Anyways, works like charm now :heart_eyes_cat:

Not quite.

The * is necessary if your argument to it is an absolute path. e.g. root * /opt/www

If it’s a relative path, like if you ran caddy run while inside of /opt/www, then you could omit it and just do root . (because . does not start with /, obviously).
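
Side by side, the two cases look like this (use whichever one applies):

```caddyfile
# absolute path: the * is needed, otherwise /opt/www is read as a path matcher
root * /opt/www

# relative path: no leading /, so the matcher can be omitted
root .
```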

Out of curiosity, something else. So now this 401 error is returned properly, but some browsers load their PDF viewers anyway and read, well, the 17 bytes of the 401 message. While I could prepare a “bad.pdf” with that line inside, I wonder if there is something smarter? For example, Firefox opens the PDF viewer but does not render that line, as it is not in PDF format.

Edit. No, actually i must have made a mistake again.

    handle_errors {
        @401 expression `{http.error.status_code} == 401`
        handle @401 {
                respond 401 
        }
        @4xx expression `{http.error.status_code} >= 400 && {http.error.status_code} < 500`
        respond @4xx "It's a 4xx error!"

        respond "It's not a 4xx error."
    }   

I thought that this would cover one specific error (because it is “first”) and evaluate the others later, but apparently this would need to be divided into ranges, as @4xx sadly “hijacks” the response,

and baditem.pdf ends up with It's a 4xx error! inside :slight_smile: No, wait, that is actually OK. That wasn’t a 401. So yeah, actually all is well; I just need to find a way to prevent the user’s browser from running the PDF viewer, or reply with a %pdf1.4 string or something.

I guess I will research what I can do with {http.request.uri.path.file} to check if the request is inside /restricted/*.pdf (maybe one can do “contains” or .endswith inside an expression), and have a different 404 error for this type…
Indeed it can.

Side note, perhaps I will figure this out eventually: what if I wanted to have some sort of dynamic redirection based on the user who authenticated via basicauth?

handle_path /restricted {
    import file_with_basicauths_checks
    redir {http.some_variable.which_user_authenticated}
}

But today is too late for that, i can’t focus anymore
