Caddy Service won't stay running or start with boot (Alpine / OpenRC)

1. The problem I’m having:

I’m running Caddy on Alpine Linux, primarily but not exclusively as a reverse proxy. The configuration validates and works without any issues. However, the daemon does not stay running. The init system is OpenRC, and there appears to be a disconnect between the init system and the actual daemon, for the following reasons:

  • even though the service is enabled, it does not start on a reboot
  • I can start the service with rc-service caddy start and it will start, but running rc-service caddy status about 30 seconds later shows the caddy service as stopped, even though it is still serving a static site and working as a reverse proxy.
  • I can start Caddy with caddy start and it starts just fine, but this does not update the status in OpenRC: rc-service caddy status still shows it as stopped.
  • some hours after starting Caddy, I am not sure how long, it seems to stop on its own, and I have to SSH to the server and run caddy start to get it going again.
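
The mismatch described in the list above can be made visible by comparing what OpenRC reports with what is in the process table. A process launched with caddy start is not tracked by OpenRC, so rc-service can report "stopped" while Caddy is serving traffic. The following is a minimal sketch; explain_state is a hypothetical helper name, and the commented commands assume pgrep and sed are available on the system.

```sh
#!/bin/sh
# Sketch: compare OpenRC's view of the service with the process table.
# explain_state is a hypothetical helper, not part of OpenRC or Caddy.

explain_state() {
    # $1: state word reported by OpenRC ("started", "stopped", "crashed", ...)
    # $2: number of caddy processes found
    if [ "$1" = "started" ] && [ "$2" -gt 0 ]; then
        echo "consistent: OpenRC reports started and caddy is running"
    elif [ "$2" -gt 0 ]; then
        echo "mismatch: caddy is running but OpenRC does not report it as started"
    else
        echo "down: no caddy process found"
    fi
}

# On the server (assumption: pgrep and sed are installed):
#   state=$(rc-service caddy status 2>&1 | sed -n 's/.*status: //p')
#   count=$(pgrep -x caddy | wc -l)
#   explain_state "$state" "$count"
```

A "mismatch" result points at a Caddy process that was started by hand, which is what the caddy start transcript further down would produce.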

The primary purpose of this server is to allow me access to Apache Guacamole, so it needs to stay running or I may not have access to SSH to restart the service, depending on where I am.

caddy:/etc/caddy# cat /etc/os-release 
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.18.3
PRETTY_NAME="Alpine Linux v3.18"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://gitlab.alpinelinux.org/alpine/aports/-/issues"
caddy:/etc/caddy# rc-update show --all
             bootmisc | boot                                   
                caddy |      default                           
                crond |      default                           
                devfs | boot                                   
             hostname | boot                                   
            killprocs |                        shutdown        
           networking | boot default                           
            savecache |                        shutdown        
                 sshd |      default                           
               syslog | boot

2. Error messages and/or full log output:

Not getting any error messages anywhere, unless you consider the output of rc-service caddy status showing as “stopped” when it is in fact running.

caddy:/etc/caddy# rc-service caddy status
 * status: stopped
caddy:/etc/caddy# caddy stop
caddy:/etc/caddy# rc-service caddy status
 * status: stopped
caddy:/etc/caddy# caddy start
2023/08/22 21:47:46.972	INFO	using adjacent Caddyfile
2023/08/22 21:47:46.974	INFO	admin	admin endpoint started	{"address": "localhost:2019", "enforce_origin": false, "origins": ["//localhost:2019", "//[::1]:2019", "//127.0.0.1:2019"]}
2023/08/22 21:47:46.974	INFO	http.auto_https	server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS	{"server_name": "srv0", "https_port": 443}
2023/08/22 21:47:46.974	INFO	http.auto_https	enabling automatic HTTP->HTTPS redirects	{"server_name": "srv0"}
2023/08/22 21:47:46.974	INFO	http.auto_https	enabling automatic HTTP->HTTPS redirects	{"server_name": "srv1"}
2023/08/22 21:47:46.974	INFO	tls.cache.maintenance	started background certificate maintenance	{"cache": "0xc0003f8400"}
2023/08/22 21:47:46.975	INFO	http	enabling HTTP/3 listener	{"addr": ":443"}
2023/08/22 21:47:46.975	INFO	failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Buffer-Sizes for details.
2023/08/22 21:47:46.975	INFO	http.log	server running	{"name": "srv0", "protocols": ["h1", "h2", "h3"]}
2023/08/22 21:47:46.975	INFO	http	enabling HTTP/3 listener	{"addr": ":8000"}
2023/08/22 21:47:46.975	INFO	http.log	server running	{"name": "srv1", "protocols": ["h1", "h2", "h3"]}
2023/08/22 21:47:46.976	INFO	http.log	server running	{"name": "remaining_auto_https_redirects", "protocols": ["h1", "h2", "h3"]}
2023/08/22 21:47:46.976	INFO	http	enabling automatic TLS certificate management	{"domains": ["access.reid.li", "port.reid.li", "guac.reid.li", "pve.reid.li", "gw.reid.li", "guac.prime42.wtf"]}
2023/08/22 21:47:46.978	INFO	tls	cleaning storage unit	{"description": "FileStorage:/root/.local/share/caddy"}
2023/08/22 21:47:46.978	INFO	autosaved config (load with --resume flag)	{"file": "/root/.config/caddy/autosave.json"}
2023/08/22 21:47:46.978	INFO	serving initial configuration
Successfully started Caddy (pid=732) - Caddy is running in the background
2023/08/22 21:47:46.979	INFO	tls	finished cleaning storage units
caddy:/etc/caddy# rc-service caddy status
 * status: stopped

3. Caddy version:

caddy:/etc/caddy# caddy version
v2.7.3 h1:eMCNjOyMgB5A1KgOzT2dXKR4I0Va+YHCJYC8HHu+DP0=

4. How I installed and ran Caddy:

apk add caddy

a. System environment:

Running in a Linux Container (LXC) on Proxmox. The only issue I am having is the service not starting at boot and not remaining running after a time. I am not sure what part of the running environment is relevant to include.

b. Command:

caddy start

The Caddyfile is in the default location of /etc/caddy, and I have cd /etc/caddy in the .profile to make it easier to work with, i.e. I don’t need to specify the full path when I edit, validate, etc. the Caddyfile. This is a single-purpose container, so nothing is running on it that does not need to.

c. Service/unit/compose file:

d. My complete Caddy config:

caddy:/etc/caddy# cat Caddyfile
access.reid.li {
	basicauth {
		john [Hashed Password REDACTED]
	}
	root * /var/www
	encode gzip
	file_server {
		hide .git
	}
	log {
		output file /var/log/caddy/access.log
	}
	header {
		?Cache-Control "max-age=1800"
	}
}

pve.reid.li {
	reverse_proxy 172.20.17.50:8006 {
		transport http {
			tls
			tls_insecure_skip_verify
		}
	}
}

port.reid.li:443 {
	reverse_proxy 172.20.17.51:9443 {
		transport http {
			tls
			tls_insecure_skip_verify
		}
	}
}

tcp://port.reid.li:8000 {
	reverse_proxy 172.20.17.51:8000 {
		transport http {
			tls
			tls_insecure_skip_verify
		}
	}
}

guac.reid.li {
	basicauth {
		john [Hashed Password REDACTED]
		admin [Hashed Password REDACTED]
	}
	reverse_proxy 172.20.17.53:8080
}

guac.prime42.wtf {
	basicauth {
		marc [Hashed Password REDACTED]
	}
	reverse_proxy 172.20.17.53:8080
}

gw.reid.li {
	basicauth {
		john [Hashed Password REDACTED]
	}
	reverse_proxy 172.20.17.1:443 {
		transport http {
			tls
			tls_insecure_skip_verify
		}
	}
}

5. Links to relevant resources:

https://wiki.alpinelinux.org/wiki/OpenRC

.EOF

I found another service command I could run. Note that Caddy is serving pages at the moment, despite the init system thinking the service is failed.

Also, here are the contents of the /etc/init.d/caddy file:

supervisor=supervise-daemon

name="Caddy web server"
description="Fast, multi-platform web server with automatic HTTPS"
description_checkconfig="Check configuration"
description_reload="Reload configuration without downtime"

: ${caddy_opts:="--config /etc/caddy/Caddyfile --adapter caddyfile"}

command=/usr/sbin/caddy
command_args="run $caddy_opts"
command_user=caddy:caddy
extra_commands="checkconfig"
extra_started_commands="reload"

depend() {
        need net localmount
        after firewall
}

checkconfig() {
        ebegin "Checking configuration for $name"
        su ${command_user%:*} -s /bin/sh -c "$command validate $caddy_opts"
        eend $?
}

reload() {
        ebegin "Reloading $name"
        su ${command_user%:*} -s /bin/sh -c "$command reload $caddy_opts"
        eend $?
}

stop_pre() {
        if [ "$RC_CMD" = restart ]; then
                checkconfig || return $?
        fi
}

I found my solution I think, even though it doesn’t feel right.

To start, the /etc/init.d/caddy file is not something I created. I’m guessing it got created when I installed the package using the apk package manager. So the line for user and group, using the username and group name of caddy, was already populated. The user caddy also existed, so that was probably created by the install as well.

That said my research into OpenRC user services, or running a service as a particular user led me to quickly believe OpenRC doesn’t do that. It wants services to run as root. As bad as it sounds, changing this line command_user= in /etc/init.d/caddy to reflect root rather than caddy fixed all my issues.

So the web server is running as root, which is not great. That said:

  • SSH is key based login only and I use ed25519 keys
  • I think as a target Caddy has a lot less surface area to attack than Apache or Nginx, and it is less well known as well
  • I have no ports exposed to the internet beyond 80 and 443. Services that use other ports are translated by Caddy in the reverse proxy. Even SSH is only available on my LAN, which is why it was so important for Caddy to start at boot and remain running. To SSH to the Caddy server I have to use Apache Guacamole, which uses Caddy as the reverse proxy to access it.

If anyone knows of a better way to solve this issue, please let me know. Otherwise I hope this thread can assist someone in the future should they run into this issue.

The Alpine package isn’t maintained by the Caddy team, it’s maintained by Michał Polański. I’ve reached out to him to get his attention on this thread.

I don’t think I can answer questions related to OpenRC, I’ve never used it before. I only use Alpine as a base image in Docker.

Thank you for bringing this to Michał’s attention.

As I see it, one of a couple of things might need to happen with the package. If running as root is the correct way, then the package should be modified to use root rather than create the caddy user. If it is not the correct way, then something in the setup of the caddy user and its permissions to run Caddy as a service appears to be missing. It didn’t happen with the package install, and I cannot find any documentation or instructions on further required setup.

i don’t have a solution for you. but i’m running caddy in alpine as well and it works out of the box for me, no special configuration.

i noticed you don’t have the log global option. add it to your caddyfile to get server logs (not just access logs). maybe it’ll unearth something.

Logs are written to stdout/stderr by default. Does OpenRC not capture them by default? That seems silly.
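
On the question of capturing stdout/stderr: OpenRC service scripts can set output_log and error_log, which are handed to the supervisor as the files to redirect stdout and stderr into. The fragment below is a sketch only. The file names are hypothetical, and it assumes that /etc/conf.d/caddy is read for this service and that the log directory is writable in whatever context the supervisor opens the files, which is not verified here.

```sh
# /etc/conf.d/caddy (sketch; file names are hypothetical)
# Caddy writes its server logs to stderr by default, so error_log is the
# one that would receive them. output_log covers anything sent to stdout.
output_log="/var/log/caddy/caddy.stdout.log"
error_log="/var/log/caddy/caddy.stderr.log"
```

After a change like this, rc-service caddy restart would be needed for the supervisor to pick up the new settings.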

This appears to be a permissions and file attributes/privileged ports issue.

I also installed via apk, then used the upgrade command and added the DNS module I needed. Any changes to the binary will remove the setcap attribute, so mine also broke.

The following applies to this setup, i.e. with Michał’s community caddy APK installed (caddy-2.6.4-r4 & caddy-openrc-2.6.4-r4). Please note, I am not familiar with *nix in general.

As per documentation, general options are:

  1. modify /etc/init.d/caddy to run as root, or;
  2. change user and fix permissions/privileges:
    2a. setcap cap_net_bind_service (file level, may not persist)
    2b. Use higher ports and port forward (e.g. iptables)
    2c. Authbind (unavailable on Alpine)

2a. setcap
Whenever the caddy binary is modified outside of an APK upgrade, e.g. by ‘caddy upgrade’ or ‘caddy add-package’, setcap needs to be run again.

Install the libcap package:

apk add libcap

As root, set the capability on the binary:

setcap cap_net_bind_service=+ep $(which caddy)

Note that Michał’s /etc/init.d/caddy cannot run this, as the service runs as the caddy user. You can write your own script, run the command manually, schedule it with cron, or similar.
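
Since the capability has to be re-applied each time the binary changes, a small root-run script can check for it first and only call setcap when it is missing. This is a minimal sketch: has_bind_cap and ensure_bind_cap are hypothetical helper names, and it assumes getcap and setcap from the libcap package are installed.

```sh
#!/bin/sh
# Sketch: re-apply cap_net_bind_service to the caddy binary only when missing.
# has_bind_cap and ensure_bind_cap are hypothetical helpers for illustration.

has_bind_cap() {
    # $1: text as printed by `getcap /path/to/caddy`
    #     (empty when the file carries no capabilities)
    case "$1" in
        *cap_net_bind_service*) return 0 ;;
        *) return 1 ;;
    esac
}

ensure_bind_cap() {
    # $1: path to the caddy binary; must be run as root
    if has_bind_cap "$(getcap "$1")"; then
        echo "capability already present on $1"
    else
        setcap cap_net_bind_service=+ep "$1" && echo "capability set on $1"
    fi
}

# Example, as root (for instance from cron or after `caddy upgrade`):
#   ensure_bind_cap "$(command -v caddy)"
```

Matching on the capability name rather than the full getcap line keeps the check independent of the exact output layout, which differs between libcap versions.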

2b. Use higher ports
Install iptables

apk add iptables

Update your Caddy configuration. If using a Caddyfile, add the following inside the global options block (the unnamed { } block at the very top of the file), or substitute your own ports:

http_port 8080
https_port 8443

Add the following NAT redirect rules (substitute your own ports/interface):

iptables -A PREROUTING -t nat -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 8080
iptables -A PREROUTING -t nat -i eth0 -p tcp --dport 443 -j REDIRECT --to-port 8443
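
To substitute other ports or interfaces without retyping the rule, the two commands above can be generated from parameters. This is a sketch: redirect_rule is a hypothetical helper that only prints the command so it can be reviewed before being run as root. The udp line is an assumption for setups that want HTTP/3, since the startup log earlier in the thread shows an HTTP/3 listener on :443 and HTTP/3 runs over UDP.

```sh
#!/bin/sh
# Sketch: print iptables REDIRECT rules for a port mapping.
# redirect_rule is a hypothetical helper; it prints the command, it does not run it.

redirect_rule() {
    # $1: inbound interface, $2: protocol (tcp or udp)
    # $3: public port, $4: port Caddy actually listens on
    printf 'iptables -t nat -A PREROUTING -i %s -p %s --dport %s -j REDIRECT --to-port %s\n' \
        "$1" "$2" "$3" "$4"
}

redirect_rule eth0 tcp 80 8080
redirect_rule eth0 tcp 443 8443
# Assumption: only needed if HTTP/3 should keep working through the redirect.
redirect_rule eth0 udp 443 8443
```

The printed lines can be pasted into a root shell once they look right.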

Test (without rebooting):

rc-service caddy restart
ps aux | grep caddy
2598 caddy     0:00 /usr/sbin/caddy run --config /etc/caddy/Caddyfile --adapter caddyfile

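Then save the rules. Note that iptables-save on its own only prints the current rules to stdout:

iptables-save

To persist them across reboots, the output has to be written somewhere that is restored at boot. On Alpine this is commonly done through the iptables OpenRC service (rc-service iptables save, then rc-update add iptables), assuming that service is installed.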

If any permissions issues remain, fix permissions (assuming user caddy / group caddy):

chown caddy:caddy /usr/sbin/caddy
chown -R caddy:caddy /etc/caddy

You might need to modify /etc/init.d/caddy to replace ‘command_user’ with just the “caddy” user. If you do this, you will need to remove or lock the caddy-openrc APK.
Original:

command_user=caddy:caddy

Fixed:

command_user=caddy

I’m not sure what happens during an APK upgrade (whether plugins persist) when using Michał’s version, so I preferred removing the packages and then fixing things up. I’m happy to keep using his leftover init.d script, albeit slightly modified to automatically format the Caddyfile before reloading.

I’m not using any installed modules, and I don’t believe that I ran the upgrade command. However, you are right about it being related to permissions as changing the user appears to have fixed the issue.

I will accept the idea that I must have done something that broke the install as created by Michał, and that a setcap is needed to repair it. That said, it appears to be working for me now as it is, and I am happy with that.

I feel the install is isolated enough that I don’t need to fear too much that it is running as root in this case. As a professional shared hosting provider I fully understand the potential consequences of this, as I remember the days when we didn’t have much choice and I was around before we had PHP Process Managers that let you SU to the account user. I have Shrek levels of layers for security in place, and I am a very tiny target.

Thank you everyone for your assistance. I consider this solved for my use case.

Hi! Late reply, but I can add some context to this issue. Looks like the caddy binary was replaced by John. The output of caddy version should be “unknown”, because Caddy distributed by Alpine is currently built without using ldflags. So the problem is indeed missing cap_net_bind_service capability as Eric noticed.

If you want to use your own build of Caddy and start it using OpenRC, one way is to install only caddy-openrc. This way the caddy aport won’t overwrite your binary.

Also, with the release of Alpine 3.19, setcap won’t be needed anymore, because capabilities are now defined in the service script (relevant change: aports commit fee6d1a9, “community/caddy: use openrc cap instead of setcap”).
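
For readers on an older release, the idea of declaring the capability in the service script can be sketched as follows. This is not the packaged script verbatim. The capabilities line is an assumption based on OpenRC's capabilities setting for supervised services, so compare it with the /etc/init.d/caddy shipped by caddy-openrc on Alpine 3.19 or later before relying on it.

```sh
# /etc/init.d/caddy (excerpt, sketch only)
supervisor=supervise-daemon

command=/usr/sbin/caddy
command_args="run $caddy_opts"
command_user=caddy:caddy

# Assumed syntax: ask the supervisor to grant the unprivileged caddy
# process the right to bind ports below 1024, so the binary itself no
# longer needs a file capability set with setcap.
capabilities="^cap_net_bind_service"
```

With the capability granted at start time, replacing the binary (for example via caddy upgrade) would no longer drop the ability to bind ports 80 and 443.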
