Readiness checks

1. Caddy version (caddy version):

xcaddy build v2.4.6 --with github.com/delfick/caddy-supervisor --with github.com/leodido/caddy-jsonselect-encoder

2. How I run Caddy:

a. System environment:

Caddyfile in a caddy:2.4.6 Docker container on Google Cloud Run.

b. Command:

caddy run

c. Service/unit/compose file:

N/A

d. My complete Caddyfile or JSON config:

{
	admin off
	auto_https off
	supervisor {
		php-fpm -p . -c . -y fpm.conf -F {
			redirect_stdout stderr
			redirect_stderr stderr
		}
	}

	log {
		format json {
			time_key "time"
			level_key "severity"
			message_key "message"
			level_format "upper"
			time_format "rfc3339_nano"
		}
	}
}

:8087 {
	@trailing-slash {
		path_regexp dir (.+)/$
	}
	rewrite @trailing-slash {re.dir.1}

	root * ../../webserver/portal

	try_files {path} {path}.php {path}/index.php =404
	php_fastcgi unix/../../runtime/fpm.sock {
		# php-fpm takes a few seconds the first time
		lb_try_duration 5s
	}

	log {
		output {$CADDY_LOG_OUTPUT}
		format jsonselect "{severity:level} {timestamp:ts} {logName:logger} {httpRequest>requestMethod:request>method} {httpRequest>protocol:request>proto} {httpRequest>status:status} {httpRequest>responseSize:size} {httpRequest>userAgent:request>headers>User-Agent>[0]} {httpRequest>requestUrl:request>uri}" "{httpRequest>referrer:request>headers>Referer>[0]}" {
			level_format "upper"
			time_format "rfc3339_nano"
		}
	}

	encode zstd gzip

	handle_errors {
		@404 {
			expression {http.error.status_code} == 404
		}
		rewrite @404 /not-found.php
		reverse_proxy @404 unix/../../runtime/fpm.sock {
			transport fastcgi {
				split .php
			}
		}
	}
}

3. The problem I’m having:

4. Error messages and/or full log output:

Caddy starts listening on :8087 before php-fpm is ready, and Google Cloud Run doesn’t have its own readiness checks, so it’s possible to get these errors because Google Cloud Run thinks that my web server is up.

5. What I already tried:

I already have lb_try_duration set, but it seems that sometimes it isn’t long enough, and I’d rather not make it much larger than it is.

If I instead make the Caddyfile use :8086 and then use the admin endpoint to change it to :8087 once php-fpm is up and running, php-fpm gets restarted, because I’m using the supervisor module to run php-fpm so that I don’t have to manage it separately from Caddy in the Docker container.

Is it possible to have readiness checks so that it doesn’t start listening to :8087 until the fpm.sock file exists?

I think it would make more sense for supervisor to have health checks for services it runs and optionally block Caddy’s startup from completing until ready, or something like that.


I had another idea but it won’t work (because php_fastcgi expects the upstream to speak fastcgi but Caddy doesn’t have a fastcgi server you could use for this). The idea is that you could run a 2nd server in Caddy and make it a secondary upstream, and as long as the primary is unhealthy, it would get hit. Active health checks would be necessary for this to work.

# Main
:8087 {
	php_fastcgi unix/../../runtime/fpm.sock localhost:8888 {
		lb_policy first
		health_uri /health
		health_interval 2s
		health_timeout 250ms
	}
}

# Imagine this is a fastcgi server (not http)
:8888 {
	php_fastcgi unix/../../runtime/fpm.sock {
		lb_try_duration 60s
	}
}

The lb policy would make requests to the first upstream get skipped until health checks say the first upstream is ok, but the fallback server would have a pretty long try duration so the original requests would still work (maybe).
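
To make the fallback behaviour concrete, here is a rough Go model of what the first policy does with the two upstreams above. It is only a sketch for illustration: the upstream type and selectFirst function are made-up names, not Caddy's actual code, and the healthy flag stands in for whatever the active health check last reported.

```go
package main

import "fmt"

// upstream is a hypothetical, simplified stand-in for a proxy upstream.
type upstream struct {
	addr    string
	healthy bool // what the last active health check reported
}

// selectFirst returns the first upstream marked healthy, in config order.
// The second return value is false when no upstream is available.
func selectFirst(pool []upstream) (upstream, bool) {
	for _, u := range pool {
		if u.healthy {
			return u, true
		}
	}
	return upstream{}, false
}

func main() {
	pool := []upstream{
		{addr: "unix/../../runtime/fpm.sock", healthy: false}, // php-fpm still starting
		{addr: "localhost:8888", healthy: true},               // fallback server
	}

	u, _ := selectFirst(pool)
	fmt.Println("during startup:", u.addr)

	// Once the health check passes, the primary is chosen again.
	pool[0].healthy = true
	u, _ = selectFirst(pool)
	fmt.Println("after health check passes:", u.addr)
}
```

So while the socket is unhealthy every request lands on the fallback, and traffic moves back to the primary as soon as a health check succeeds.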

Another idea: we could potentially add a placeholder like {config.seconds_since_startup} (name TBD), which could be used with the expression matcher to get different routing behaviour in, say, the first minute or so of the config running:

@early expression {config.seconds_since_startup} < 60
vars @early try_duration 60s

@late expression {config.seconds_since_startup} >= 60
vars @late try_duration 5s

php_fastcgi unix/../../runtime/fpm.sock {
	lb_try_duration {vars.try_duration}
}

At this point this is just theory, I’m not sure if this is viable right now.

After I made the post I started looking at how I could fix it through the supervisor module.

Ended up making this: delfick/caddy-php-fpm on GitHub, a module to run php-fpm in Caddy.

Seems that if I start and stop php-fpm in the module's Provisioner and CleanerUpper hooks instead of in Start() and Stop(), I can prevent Caddy from creating the :8087 socket before PHP is ready :slight_smile:

{
	admin off
	auto_https off
	php-fpm {
		cmd php-fpm -p . -c . -y fpm.conf -F
		sock_location ../../runtime/fpm.sock
	}

	log {
		format json {
			time_key "time"
			level_key "severity"
			message_key "message"
			level_format "upper"
			time_format "rfc3339_nano"
		}
	}
}

:8087 {
	@trailing-slash {
		path_regexp dir (.+)/$
	}
	rewrite @trailing-slash {re.dir.1}

	root * ../../webserver/portal

	try_files {path} {path}.php {path}/index.php =404
	php_fastcgi unix/../../runtime/fpm.sock

	log {
		output {$CADDY_LOG_OUTPUT}
		format jsonselect "{severity:level} {timestamp:ts} {logName:logger} {httpRequest>requestMethod:request>method} {httpRequest>protocol:request>proto} {httpRequest>status:status} {httpRequest>responseSize:size} {httpRequest>userAgent:request>headers>User-Agent>[0]} {httpRequest>requestUrl:request>uri}" "{httpRequest>referrer:request>headers>Referer>[0]}" {
			level_format "upper"
			time_format "rfc3339_nano"
		}
	}

	encode zstd gzip

	handle_errors {
		@404 {
			expression {http.error.status_code} == 404
		}
		rewrite @404 /not-found.php
		reverse_proxy @404 unix/../../runtime/fpm.sock {
			transport fastcgi {
				split .php
			}
		}
	}
}

And now when I load test I don’t get those pesky 502s!

Nice! Good idea!

I didn’t take a deep look at the code, but I like this; would make for a good off-the-shelf way to run Caddy + PHP-FPM in a single container for pretty much anyone.

One thing I noticed – by convention, directives/app names should use underscores, not dashes, so it should be php_fpm, for consistency.

"but I like this"

:slight_smile:

"not dashes, so it should be php_fpm, for consistency."

ah yeah, easy fix, will do that :slight_smile:

Also, I think it would be nice to include a sample Dockerfile in that github repo as well, for those who might want a quick place to get started with it (example of using caddy:builder + installing php-fpm etc)

Decent idea, added an example folder: caddy-php-fpm/example/tools at main in delfick/caddy-php-fpm on GitHub.

Wow that’s complicated :joy: I was just expecting a single Dockerfile, but sure I guess – I don’t think that’s a super approachable example ultimately :man_shrugging:

life is often complicated :stuck_out_tongue:

but the example does give a lot of nice things that otherwise take a while to figure out
