Getting fairly frequent 502 Bad Gateway Errors

Hi all -

Caddy had been running fine for me until I recently added Organizr to the ‘www’ folder. Now, about once every 12 hours, I start getting 502 Bad Gateway errors that require a Caddy Server restart to fix.

I’ve already created a github issue on both Caddy Server and Organizr.

The Organizr dev indicated that he believes it is an issue on Caddy’s end. In searching similar Caddy GitHub issues, I’ve read that it might have something to do with needing Go 1.8.X. However, I’m using the latest version of Caddy (v0.10.3), which I thought already incorporates that.

Any help is appreciated!

Are you able to curl the Organizr endpoint manually during these 502 outage periods? (Invoke-WebRequest being the go-to Powershell equivalent if you don’t want to grab a cURL executable).

Also, the templates you’ve linked to on pastebin don’t seem to have anything regarding Organizr - are you able to post the Caddyfile that’s being used?
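
If grabbing a cURL executable is a hassle, the same check can be scripted. Below is a minimal Python sketch, not anything from the original setup: the function name `probe` is a placeholder, and it deliberately bypasses any system proxy so the request goes straight to the server.

```python
import urllib.error
import urllib.request


def probe(url, timeout=10):
    """Return the HTTP status code for a GET to url, or None if unreachable."""
    # An empty ProxyHandler skips any configured system proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses (including 502) are raised as HTTPError.
        return err.code
    except (urllib.error.URLError, OSError):
        # Nothing answered at all (connection refused, timeout, DNS failure).
        return None
```

Calling `probe("http://localhost/organizr/")` (a hypothetical address) during an outage should return 502 if Caddy is still answering with Bad Gateway, or `None` if nothing is listening.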

Are any of your certificates expiring within 30 days, perchance?

I’ll take a shot at Invoke-WebRequest the next time the 502 appears. As for the templates, the links I put in the Caddy github issue are the exact Caddyfile and Common.conf file I use (with some personal details scrubbed out).

But you’re right in that nothing in the files directly mentions Organizr. Organizr is just a site written in PHP, so I downloaded its files from git and put them all in c:\caddy\www. I also downloaded PHP and put those files in c:\caddy\php, and it has loaded fine ever since. (I also roughly followed Organizr’s install instructions for Windows, but they were written with Nginx in mind.)

Are any of your certificates expiring within 30 days, perchance?

They shouldn’t be. I just set up my entire domain/dns/caddy reverse proxy setup about 3 weeks ago (with the awesome help of Whitestrake in my previous thread.)

Ohh! Man, I saw 502 and for some reason assumed it must be a proxy issue… I recall a description of similar behaviour a while back but for the life of me can’t find the thread; I want to say it also mentioned fastcgi… Invoke-WebRequest isn’t going to get us much here in that case. The intent was to test the upstream directly without Caddy involved, but that’s less relevant when we’re talking PHP.

Ah, gotcha. Yea, the only things in the logs I can see that seem to mention a 502 error keep mentioning ajax and http2. Also, the problem only seems to happen when I already have a tab open with Organizr, and then open a new one (either in a new tab on my computer, or on another device). Is it possible there is something that is just making too many calls by having multiple tabs open? Not sure if that helps you diagnose the issue.
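
One rough way to test the "too many simultaneous calls" theory is to fire a batch of requests in parallel and count the status codes that come back. The sketch below is only an illustration: `fetch_status` and `tally_statuses` are made-up names, and the request and worker counts are arbitrary.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request failed."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 502
    except (urllib.error.URLError, OSError):
        return None


def tally_statuses(url, requests=20, workers=5):
    """Send `requests` GETs using `workers` threads; count each status code."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return Counter(pool.map(fetch_status, [url] * requests))
```

If a run against the Organizr URL starts returning 502s only once `workers` goes above some number, that would point toward PHP running out of processes to handle requests rather than a plain crash.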

This was the github issue I was reading that seemed like it could’ve been similar - but they were mentioning something with Go 1.8.X. I guess as a starter question, do you know if Go 1.8.X is built into Caddy 0.10.3? However, it’s very likely that that could’ve been an unrelated problem.

The good news (for my sanity, but not necessarily for the situation) is that I’m not the only one experiencing the 502 errors with Organizr in Caddy. This user on reddit is having the same problem, it seems.

I am running on Windows Server and have exactly the same issue. The only thing I can add is that I’m also running Monsta FTP, and the 502 happens for it at the same time as Organizr. So it seems to be something to do with the sites served from subdirectories: the proxies all still work while these 502s are coming up.

Edit: regarding the Reddit thread mentioned above, that was me. The initial error appeared to be with Ajax.php, but that went away when I pulled the Plex portion of their homepage. The error then went from occurring at the 10 minute mark to every 12ish hours, as this thread describes. I don’t know if any of that is relevant or helpful, but it’s something to throw into the pot of knowledge.

Does this happen through the proxy directive or the fastcgi directive, do we know? It’s quite possible there is something wrong with fastcgi.

I am new to all of this from the technical side, so tell me what you need and I will provide it to the best of my ability.

Mftp and Organizr do not have any lines in the Caddyfile. When I visit them I simply go to /organizr and /mftp, and I have the organizr and mftp folders in my Caddy www directory. I do not know if that answers your question at all, though.

This essentially implicates FastCGI.

@Magic815 out of curiosity, where’d you pick up your PHP binaries for Windows?

I got them from: PHP For Windows: Binaries and sources Releases
(I downloaded the VC14 x64 Non Thread Safe version, which at the time happened to be version 7.1.5).

Then, per the Organizr instructions, I made a few modifications to the php.ini file. I ended up uncommenting the following lines:

extension=php_openssl.dll
extension=php_pdo_sqlite.dll
extension=php_sqlite3.dll
sqlite3.extension_dir = ext
extension_dir = "ext"

I’m suspicious of a potential bug in the fastcgi middleware introduced by https://github.com/mholt/caddy/pull/1087 that could possibly (?) exhibit symptoms like this. I want to see if someone would be willing to revert that change in a branch and have others try it and see if these problems go away… but the revert is not as simple as clicking a button because a lot has changed since then. But I think we need to try gutting the feature implemented there and see if that helps. (Anyone want to give it a shot?)

I’m trying to set this up and I’m a total noob. Can you please show me your Caddyfile so I can see how you got it running?

So one update on my end. User ‘LastStarFighter’ over on the Organizr GitHub issue I created gave some further info. They had me create the following two environment variables:

setx PHP_FCGI_CHILDREN 3
setx PHP_FCGI_MAX_REQUESTS 128

After doing so, I haven’t had Caddy/PHP crash on me yet. I plan to keep stress testing over the next day or two, but so far it seems like it fixed it. Does that seem to indicate one way or another that your fastcgi middleware could still be involved in the issue?
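
A note for anyone copying this: `setx` stores the variables for future processes, so windows and services that are already running won’t see them until they are restarted. If you would rather scope the two variables to the PHP process alone, something like the Python sketch below could work. It is only an illustration under assumptions: the helper names, the `php-cgi.exe` path under c:\caddy\php, and the `127.0.0.1:9000` bind address are all guesses, not taken from the actual setup.

```python
import os
import subprocess


def php_cgi_env(children=3, max_requests=128, base=None):
    """Copy an environment and add the two PHP FastCGI variables to it."""
    env = dict(os.environ if base is None else base)
    env["PHP_FCGI_CHILDREN"] = str(children)
    env["PHP_FCGI_MAX_REQUESTS"] = str(max_requests)
    return env


def start_php_cgi(php_cgi=r"c:\caddy\php\php-cgi.exe", bind="127.0.0.1:9000"):
    """Launch php-cgi in FastCGI mode with the variables set only for it."""
    # Path and bind address are assumptions; adjust them to the real install.
    return subprocess.Popen([php_cgi, "-b", bind], env=php_cgi_env())
```

The values 3 and 128 are simply the ones from the commands above; nothing here says they are optimal.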

@jbravo2uk - Take a look at the Caddy Server and Organizr github issue links I put in my first post. On those github issues, I have hyperlinks to the caddyfile and common.conf file that I use.

Sorry, still being a total noob: could I see your Caddyfile?

@jbravo2uk, see this part of the original post:

Both links above have further links to a Caddyfile and an accompanying imported common.conf file.

Each link leads to a different set of files, though the ones linked on the Caddy Server GitHub issue were uploaded more recently.

I have set the above two commands in my environment settings and have been running for a solid day now with no issues. Seems like it did the trick. Thanks again @Magic815 for running it down. I can also put the Plex stuff back up on the homepage, which is where my issue started; it must have been the same root cause.

@jbravo2uk check the below links to pastebin for Caddyfile and common.conf files that are a good starting point and well commented. It’s how I got rolling. They might be linked elsewhere as well, but it can’t hurt to have more links.

I just wanted to follow up. Since putting the below environment variables into Windows, I haven’t had a single 502 Bad Gateway error! Coming up on 5 days without issue, and counting.

setx PHP_FCGI_CHILDREN 3
setx PHP_FCGI_MAX_REQUESTS 128

That’s good news. We’ll improve on this in the future so that config won’t be needed in this situation.

@magic815 I found in my use of PHP on Windows that the only way I could run it reliably was to set PHP_FCGI_MAX_REQUESTS=0 (unlimited) and then restart the process on a regular basis. PHP FastCGI seems to stop when it hits the request limit. I get the impression that on Linux it restarts, but it does not seem to on Windows. There is frustratingly little official info about this for such an important feature of PHP FastCGI. I have included a link here to scripts I have been using very successfully on my low-traffic Windows site.
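
For readers who can’t reach the linked scripts, the general idea of restarting on a regular basis can be sketched in a few lines of Python. This is not the linked scripts, just a rough illustration: the function name, the `cycles` parameter, and the interval are made up, and the command would be whatever launches php-cgi on your machine.

```python
import subprocess


def run_with_periodic_restart(cmd, interval, env=None, cycles=None):
    """Start cmd, replace it every `interval` seconds, return the launch count.

    cycles=None keeps going forever; a number stops after that many launches.
    """
    launches = 0
    while cycles is None or launches < cycles:
        proc = subprocess.Popen(cmd, env=env)
        launches += 1
        try:
            # Returns early if the process exits on its own.
            proc.wait(timeout=interval)
        except subprocess.TimeoutExpired:
            # Interval elapsed: stop the old process so a fresh one can start.
            proc.terminate()
            proc.wait()
    return launches
```

Passing an `env` with `PHP_FCGI_MAX_REQUESTS` set to `0` would match the setup described above. Note that if the command exits immediately every time, this loop relaunches it without any pause, so a real script would probably want a short delay or a failure limit.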