Challenge with reverse_proxy while backend container renewing (restarting)

1. Caddy version (caddy version):

v2.4.6

2. How I run Caddy:

Podman rootless container

a. System environment:

Ubuntu server 20.04

3. The problem I’m having:

I’m running my backend and frontend stack as several rootless Podman containers on a VPS with two cores. reverse_proxy points to a container running a Python aiohttp server, and everything works very well. But when I renew (stop, remove, run) the backend container under live load, requests to the reverse_proxy backend are rejected. I know about load balancing, and I have read about ‘handle_errors’…
Is there a way to handle this situation, so that while the backend container is being renewed (say 30 seconds) Caddy holds each request and keeps retrying it for a given time limit, even though the backend container has been removed?

And what about renewing the frontend Caddy container itself? It bundles all the frontend files for a particular release version (there is no bind-mounted ‘/srv’ volume).
What is the easiest way you see to refresh a Caddy container under load without interrupting user requests in a resource-constrained environment?

Yes, you can do that with lb_try_duration. Make sure to also configure dial_timeout on the http transport to something low enough (1-3s is probably fine) so that connection attempts fail fast. Currently there’s no way to limit the number of retries, but this should cover what you need.
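
A minimal Caddyfile sketch of that suggestion, assuming a 30-second renewal window as in the question. The upstream address `backend:8080`, the 2s dial timeout, and the explicit lb_try_interval are illustrative assumptions, not values from this thread:

```caddyfile
mysite.cool {
	reverse_proxy backend:8080 {
		# Keep retrying the upstream for up to 30s before giving up on a request.
		lb_try_duration 30s
		# Pause between attempts (assumed value; 250ms is Caddy's default).
		lb_try_interval 250ms

		transport http {
			# Fail fast when the backend container is gone.
			dial_timeout 2s
		}
	}
}
```

With something like this in place, a request that arrives while the backend is down should be held and retried until the backend accepts connections again or the 30 seconds run out.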

If by “refresh” you mean swap containers, then you should probably deploy your files separately from Caddy, so you can keep Caddy running.

If you shut down Caddy then there’s no way to prevent downtime, unless you have some other kind of TCP proxy in front of Caddy which can load balance to more than one Caddy instance while you swap them out.

It’s working as expected.
While I update the backend container, client requests are not rejected; they wait and then resume. Cool!
Thank you.

Suppose I’m OK with keeping the served files in a bind-mounted directory. But I found that when I reload the config with the ‘caddy reload’ command, all incoming sessions are broken. I need to pause requests while I update the files and config.
Is there a way to temporarily accumulate requests, as if pausing and not breaking sessions with clients?

Otherwise, it turns out I have to put another reverse proxy :astonished:

I’m not sure I understand what’s breaking. What do you mean by “sessions”?

The way reloading works in Caddy means that there are technically two servers running for a short period of time while the reload is happening. First the new config is started, then the old config’s shutdown begins, where Caddy will wait for any requests that used the old config to complete before shutdown can end. If grace_period is configured, it sets a time limit on that wait, and requests still running when it expires get cancelled.

The new config will immediately start handling the incoming requests instead of the old config, meaning there’s zero downtime. But old requests still need to be closed off so state in the old config can be cleaned up, which is important to avoid memory leaks.
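
For illustration, the optional time limit on the old config's shutdown is set with the grace_period global option. A minimal sketch, where the 10s value is an arbitrary assumption:

```caddyfile
{
	# Wait at most 10s for in-flight requests on the old config during a reload;
	# without this option Caddy waits for them indefinitely.
	grace_period 10s
}
```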

I mean established TCP connections.

When I reload the config, it is clear that the TCP connections are broken and new connections are created. This is logical behavior.

Interestingly, when the backend restarts, the TCP connections are not broken (it seems the default timeout for a TCP connection is 1 hour). The requests simply wait, and after the backend starts I do not see any interrupted requests.

If Caddy implements this behavior for reverse_proxy (freezing TCP connections), maybe it could be applied to file_server as well. I have not tested this yet, but what happens to requests while files are being updated in a mounted (‘/srv’) directory?

If you reload the backend (i.e. upstream behind reverse_proxy) then of course TCP connections are not broken, because the TCP connection is with Caddy. The reverse_proxy module makes a new connection between Caddy and the upstream, and that’s what “breaks” on a reload of the backend.

This doesn’t make sense. “Freeze TCP connections” isn’t a thing. I don’t see how TCP connections relate to file_server at all. All file_server does is look at the request path, check whether a file matches that path on disk (by concatenating the root with the path), then serve that file if it exists.

Ok, I decided to put Caddy behind Caddy. :sunglasses: The first one acts as a proxy. But there was a problem: I cannot get all the logs. The second Caddy, which is configured as a file_server, has the following config:

{
	debug
	admin off
	auto_https off
	log {
		output file /var/log/caddy.log
		format json
	}
}

http://mysite.cool {
	encode zstd gzip
	root * /srv
	file_server
}

I only see these logs (the second entry should be an error):

{"level":"debug","ts":1645616530.3187547,"logger":"http.handlers.file_server","msg":"opening file","filename":"/srv/index.html"}
{"level":"debug","ts":1645616532.1304195,"logger":"http.handlers.file_server","msg":"sanitized path join","site_root":"/srv","request_path":"/ups","result":"/srv/ups"}

I want to see it like this:

{
	"level": "info",
	"ts": 1585597114.7687502,
	"logger": "http.log.access",
	"msg": "handled request",
	"request": {
		"method": "GET",
		"uri": "/",
		"proto": "HTTP/2.0",
		"remote_addr": "127.0.0.1:50876",
		"host": "example.com",
		"headers": {
			"User-Agent": [
				"curl/7.64.1"
			],
			"Accept": [
				"*/*"
			]
		},
		"tls": {
			"resumed": false,
			"version": 771,
			"ciphersuite": 49196,
			"proto": "h2",
			"proto_mutual": true,
			"server_name": "example.com"
		}
	},
	"user_id": "",
	"duration": 0.000014711,
	"size": 2326,
	"status": 200,
	"resp_headers": {
		"Server": [
			"Caddy"
		],
		"Content-Type": ["text/html"]
	}
}

I want to see HTTP logs. How can I do it? I built Caddy with the format-encoder module. I even specified "sink": {"writer": file } in the config. Caddy doesn’t want to show me HTTP logs… :sob:

To get access logs, you configure the log directive inside of each site. You used the log global option which configures Caddy’s runtime logs.
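
As a sketch of that change, the site block from the config above could carry its own log directive. The output path `/var/log/caddy-access.log` is an assumed example, not something from this thread:

```caddyfile
http://mysite.cool {
	# Per-site access log; this is what emits "http.log.access" entries.
	log {
		output file /var/log/caddy-access.log
		format json
	}

	encode zstd gzip
	root * /srv
	file_server
}
```

The log block in the global options can stay as it is; it only controls Caddy's runtime logs.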

I wasn’t able to figure it out from the documentation. Thanks.

And there is something I still can’t understand. Please explain why, for the first (proxy) Caddy instance, I do not need to specify the protocol (just ‘mysite.cool’), but for the Caddy that receives traffic from the proxy I do need to specify it (like ‘http://mysite.cool’). And why does it not work if, in the second case, I specify the protocol together with the IP address (like ‘http://10.88.5.99’)? In the receiving Caddy’s logs I see the source IP address (‘10.88.5.99’), which is the IP address of the proxy Caddy.

Do you mean “was able”?

Because Caddy defaults to HTTPS and attempts to enable Automatic HTTPS if your site address qualifies. See the Automatic HTTPS docs.

So specifying http:// explicitly tells Caddy “no, I want HTTP, not HTTPS” so it won’t attempt to turn on Automatic HTTPS for that site.

Caddy matches the Host header when making routing decisions, and reverse_proxy passes through the Host header to the upstream.

You could just do http:// instead of http://mysite.cool and that would also work, because it would match all Host headers instead of just that domain (if that’s what you want). But nothing ever sets the Host header to the IP address, so trying to match the IP address won’t work.
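
A minimal sketch of the catch-all variant for the receiving Caddy, reusing the file_server setup from earlier in the thread:

```caddyfile
# "http://" with no hostname matches every Host header on the HTTP port,
# so requests forwarded by the front proxy are served whatever Host they carry.
http:// {
	encode zstd gzip
	root * /srv
	file_server
}
```

Keeping `http://mysite.cool` instead restricts the site to requests whose Host header is that domain, which is what the front proxy passes through.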

I will try to explain what could have misled me.
My thinking automatically goes like this: if we specify some options globally, they should affect all other levels. If logging is disabled by default everywhere, then enabling logging globally should enable it everywhere, and then we can override or completely disable it where necessary.

From the documentation (Global Options; very vague wording for me):

It is used to set options that apply globally, or not to any one site in particular.

It is a similar situation with HTTPS. I read in the documentation:

Any of the following will prevent automatic HTTPS from being activated, either in whole or in part:

 - Explicitly disabling it
 - Not providing any hostnames or IP addresses in the config
 - Listening exclusively on the HTTP port
 - Manually loading certificates (unless this config property is true)

Well, I thought that if the ‘auto_https off’ option is specified in the global options, then since Caddy is still a web server, with HTTPS disabled it would automatically use HTTP unless another protocol is intentionally specified.


But these are just my thoughts.
And thank you for the clarification.

That’s accurate though. Caddy’s access logging is per-site, not global.

See the docs for the log global option, and the log directive for access logs.

Caddy is still HTTPS by default in the sense that it’ll default to the HTTPS port if no port or scheme is specified.

“Automatic HTTPS” is the name of a feature in Caddy which has to do with automatically enabling certificate automation and HTTP->HTTPS redirects.
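
To make the distinction concrete, here is a hedged sketch of the receiving Caddy's config. The site contents are taken from earlier in the thread; the comments describe the behavior explained above:

```caddyfile
{
	# Turns off certificate automation and HTTP->HTTPS redirects only.
	auto_https off
}

# A bare "mysite.cool" would still default to the HTTPS port even with
# auto_https off, so the scheme is spelled out to serve plain HTTP.
http://mysite.cool {
	root * /srv
	file_server
}
```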
