Live Helper Chat conversion from Nginx

1. Caddy version (caddy version):

v2.1.1

2. How I run Caddy:

docker-compose

a. System environment:

docker (as mentioned above)

b. Command:

does not apply

c. Service/unit/compose file:

does not apply

d. My complete Caddyfile or JSON config:

does not apply

3. The problem I’m having:

I am trying to run Live Helper Chat on Caddy. I found this Nginx config which is a little complicated but that should be fine if I play with it patiently. I am struggling with this regex however:

(^(?!(?:(?!(php)).)*/(albums|bin|var|lib|cache|doc|settings|pos|modules)/).*?(index\.php|upgrade\.php)$)

This line is insane. I found out that Go does not support lookaheads, which is what is used here. The first part, though, is something I cannot decipher, and I have no idea what those three nested groups are supposed to do. I used a regex debugger, which told me what each piece does, but I still have no idea what the expected outcome is. It’s just too complicated for me.
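To make the lookahead limitation concrete, here is a minimal sketch that hands the pattern to Go's standard regexp package, which is the engine behind Caddy's path_regexp. The names lhcPattern and compiles are made up for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// lhcPattern is the Nginx regex quoted above, copied unchanged.
const lhcPattern = `(^(?!(?:(?!(php)).)*/(albums|bin|var|lib|cache|doc|settings|pos|modules)/).*?(index\.php|upgrade\.php)$)`

// compiles (hypothetical helper) reports whether Go's regexp package,
// which implements RE2 syntax, accepts the given pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// The (?! ... ) lookahead groups are not part of RE2 syntax.
	fmt.Println(compiles(lhcPattern)) // false
	// A lookahead-free pattern is accepted.
	fmt.Println(compiles(`/(index|upgrade)\.php$`)) // true
}
```

So the pattern cannot be pasted into a path_regexp matcher as-is; the logic has to be split into lookahead-free pieces.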

Could someone help me translate this to Caddy, please?

4. Error messages and/or full log output:

does not apply

5. What I already tried:

Regex debuggers. I also tried these matchers, but they are obviously not correct:

@restrict_files {
  path /albums/* /bin/* /var/* /lib/* /cache/* /doc/* /settings/* /pos/* /modules/*
}

@allow_php {
  path_regexp ^/((index|upgrade)\.php)?
}

6. Links to relevant resources:

This is ludicrously messy, but it helps to get an idea of exactly what’s grouped by those negative lookaheads.

The exterior brackets can be dropped, they just add to the visual noise. The non-capturing group (?:(?!(php)).) also massively adds to the visual noise, but it seems to be necessary as whoever wrote this qualifies the whole non-capturing group with a * quantifier… For some scenario I can’t quite conceive of, I suppose.

The first negative lookahead starts immediately:

(?!(?:(?!(php)).)*\/(albums|bin|var|lib|cache|doc|settings|pos|modules)\/)

It contains a second negative lookahead:

(?!(php))

This is probably mid-to-high level mindbending stuff, but as far as I grasp it, the inner lookahead lets the non-capturing group consume characters one at a time only while php does not begin at the current position. So the outer lookahead says “starting from the beginning of the string, look for one of /albums/, /bin/, /var/, /lib/, /cache/, /doc/, /settings/, /pos/, /modules/ with no php anywhere before it”, and because it is a negative lookahead, the match fails whenever such a directory is found.

Then it follows up with “and then any number of any character - maybe? but as few as possible - and then we should see either index.php or upgrade.php and then end of line.”

This seems like a really, really goofy way to go about things.

Some things that match:

/foo/bar/index.php

/index.php/albums/upgrade.php

Some things that don’t match:

/albums/upgrade.php

/settings/index.php

Sometimes it helps to think it through backwards, as well… So to try to narrow it down to a set of human readable rules - in order to MATCH this regex, the string needs to:

  1. End with index.php or upgrade.php
  2. Not contain /albums/, /bin/, /var/, /lib/, /cache/, /doc/, /settings/, /pos/, or /modules/
    2a. Unless php appears somewhere before that directory
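The rules above can be sketched without any lookaheads. This is only an illustration of my reading of the regex, not something taken from Live Helper Chat or Caddy; matchesOriginal and restricted are hypothetical names, and the sketch assumes paths never contain newlines.

```go
package main

import (
	"fmt"
	"strings"
)

// restricted lists the directory names from the original regex.
var restricted = []string{"albums", "bin", "var", "lib", "cache", "doc", "settings", "pos", "modules"}

// matchesOriginal (hypothetical helper) mirrors the lookahead regex
// using plain string operations.
func matchesOriginal(path string) bool {
	// Rule 1: the path must end with index.php or upgrade.php.
	if !strings.HasSuffix(path, "index.php") && !strings.HasSuffix(path, "upgrade.php") {
		return false
	}
	// Rule 2a: only the part before the first "php" is inspected.
	firstPHP := strings.Index(path, "php")
	// Rule 2: reject if a restricted directory starts before that first "php".
	for _, dir := range restricted {
		at := strings.Index(path, "/"+dir+"/")
		if at >= 0 && (firstPHP < 0 || at < firstPHP) {
			return false
		}
	}
	return true
}

func main() {
	samples := []string{
		"/foo/bar/index.php",
		"/index.php/albums/upgrade.php",
		"/albums/upgrade.php",
		"/settings/index.php",
	}
	for _, p := range samples {
		fmt.Printf("%-32s %v\n", p, matchesOriginal(p))
	}
}
```

Run against the examples listed earlier, the first two paths report true and the last two report false, which agrees with the regex.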

I can say with confidence that this regex EITHER solves some arcane application requirement I can’t currently conceive of with a pure stroke of genius, OR it’s a sad, poorly crafted piece of unnecessary complexity that “works, so why fix it?”.

Lookahead, non-capturing group, lookahead … I was able to identify that much, but on the whole that “php” part makes it impossible for me to understand. Anyway, maybe I could do with matchers that rule out anything not possible and then allow the rest?

Something like:

  • if there is no index.php OR upgrade.php, then deny (ultimate rule)
  • if there is no php AND there are albums, bin, …, then deny
  • if there is php somewhere AND index.php OR upgrade.php, then allow
  • allow everything else because I denied all that should be denied already

Would this make sense? All matchers should go one by one as they are written in the Caddyfile so this could work.

I found these dirs so this php is needed because of some external libraries.

image

Could this be a way to go? I have not tested the code yet, just trying to get a general idea whether this would make any sense at all.

@allow_php_dirs {
  path_regexp /php.*/.*/(index|upgrade)\.php$
}

@deny_dirs {
  path_regexp /(albums|bin|var|lib|cache|doc|settings|pos|modules)/
}

@allow_php_self {
  path_regexp /(index|upgrade)\.php$
}

try_files {path} {path}/ /index.php?{path}&{query}

php_fastcgi @allow_php_dirs php73:9000
respond @deny_dirs 404
php_fastcgi @allow_php_self php73:9000

I first allow any path containing php and index/upgrade.php. Then I deny everything containing any of the forbidden subdirs, unless they were already allowed to pass to fastcgi in the first rule. In the end I allow the index/upgrade.php if it stands alone.
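To see what this three-step idea would decide for a few paths, here is a rough simulation using the same three regexes. It assumes the rules are tried strictly in the order described, which is an assumption of this sketch and not a statement about how Caddy orders directives. The function decide and its return values are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// The three patterns from the matchers above, unchanged.
var (
	allowPHPDirs = regexp.MustCompile(`/php.*/.*/(index|upgrade)\.php$`)
	denyDirs     = regexp.MustCompile(`/(albums|bin|var|lib|cache|doc|settings|pos|modules)/`)
	allowPHPSelf = regexp.MustCompile(`/(index|upgrade)\.php$`)
)

// decide (hypothetical helper) applies the matchers in the described
// order; the first one that matches wins.
func decide(path string) string {
	switch {
	case allowPHPDirs.MatchString(path):
		return "fastcgi"
	case denyDirs.MatchString(path):
		return "404"
	case allowPHPSelf.MatchString(path):
		return "fastcgi"
	default:
		return "unhandled"
	}
}

func main() {
	samples := []string{
		"/foo/bar/index.php",
		"/phpsomething/albums/index.php",
		"/albums/upgrade.php",
		"/index.php/albums/upgrade.php",
	}
	for _, p := range samples {
		fmt.Printf("%-32s %s\n", p, decide(p))
	}
}
```

Under that assumption the first two paths go to fastcgi and the last two get a 404. Note that /index.php/albums/upgrade.php matches the original regex but is denied here, because /php.*/ needs a path segment that starts with php. Whether that edge case matters for Live Helper Chat is unclear.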

I’m pretty sure you won’t need that try_files line, because php_fastcgi includes that logic already.

Without commenting on whether it’ll work (I have no idea), I might write it like this:

@deny_dirs path /albums/* /bin/* /var/* /lib/* /cache/* /doc/* /settings/* /pos/* /modules/*
respond @deny_dirs 404

@allow_php {
	path_regexp /php.*/.*/(index|upgrade)\.php$
	path_regexp /(index|upgrade)\.php$
}
php_fastcgi @allow_php php73:9000

I will certainly test the functionality and share the working config once I get it working.

Sorry, but to my understanding this matcher

@deny_dirs path /albums/* /bin/* /var/* /lib/* /cache/* /doc/* /settings/* /pos/* /modules/*

does not satisfy the need to allow for cases like /phpsomething/albums/index.php, which, as I understand it, should be allowed.

Secondly, doesn’t this matcher say

@allow_php {
	path_regexp /php.*/.*/(index|upgrade)\.php$
	path_regexp /(index|upgrade)\.php$
}

that both path_regexp should apply at the same time? There is an AND logic for all the lines, right? So if either one does not validate, this matcher will fail.

Well /albums/* will not match /phpsomething/albums so that should be fine.

You’re right about that regexp matcher not being right - turns out the Caddyfile adapter will just take the last defined regexp matcher, because having multiple path_regexp matchers in the same matcher set is not supported.

I feel like those two could be combined though, feels like you could use ? for a zero or one matches check. Maybe like this:

@allow_php path_regexp (/php.*/.*)?/(index|upgrade)\.php$
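One thing worth checking before relying on the combined pattern: since it is not anchored at the start, the optional group may not change which paths match. The sketch below compares it with the plain /(index|upgrade)\.php$ pattern on a handful of made-up paths; sameVerdict is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// combined is the single pattern suggested above.
	combined = regexp.MustCompile(`(/php.*/.*)?/(index|upgrade)\.php$`)
	// simple is the second of the two original patterns on its own.
	simple = regexp.MustCompile(`/(index|upgrade)\.php$`)
)

// sameVerdict (hypothetical helper) reports whether both patterns
// agree on whether the path matches.
func sameVerdict(path string) bool {
	return combined.MatchString(path) == simple.MatchString(path)
}

func main() {
	samples := []string{
		"/index.php",
		"/albums/index.php",
		"/phpsomething/albums/upgrade.php",
		"/style.css",
	}
	for _, p := range samples {
		fmt.Printf("%-34s combined=%v same=%v\n", p, combined.MatchString(p), sameVerdict(p))
	}
}
```

On these samples both patterns give the same verdict, and /albums/index.php matches either way. So with this approach, blocking the restricted directories rests entirely on the separate @deny_dirs rule.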
