Websites sometimes not responding in v2.4.0-beta2; Caddy service status shows failed after caddy upgrade

1. Caddy version (caddy version):

v2.4.0-beta.2 h1:DUaK4qtL3T0/gAm0fVVkHgcMN04r4zGpfPUZWHRR8QU=

2. How I run Caddy:

a. System environment:

Ubuntu 20.04.1 LTS

b. Command:

sudo systemctl restart caddy

c. Service/unit/compose file:

none

d. My complete Caddyfile or JSON config:

{
    on_demand_tls {
        ask https://my.hypershapes.com/validate
        interval 2m
        burst 10
    }
}
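For context on how this global block behaves: Caddy sends a GET request to the `ask` URL with the candidate hostname in the `domain` query parameter, and only obtains a certificate if the endpoint answers 200; `interval 2m` with `burst 10` additionally rate-limits issuance to at most 10 certificates per 2 minutes. A minimal sketch of the decision the `/validate` endpoint makes, in Python (the `ALLOWED` set and function name are illustrative; the real endpoint here is served by the PHP application):

```python
# Hypothetical allow-list behind the `ask` endpoint. Caddy requests
# GET /validate?domain=<hostname> and issues a certificate only when
# the endpoint responds with HTTP 200; any other status denies it.
ALLOWED = {"shop.customer-one.com", "shop.customer-two.com"}

def ask_status(domain: str) -> int:
    """HTTP status the validate endpoint should return for `domain`."""
    return 200 if domain in ALLOWED else 404

print(ask_status("shop.customer-one.com"))  # 200 -> issue certificate
print(ask_status("random.attacker.net"))    # 404 -> refuse
```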

(root) {
    root * /var/www/{args.0}
}

(baseSetup) {
    file_server
    php_fastcgi unix//run/php/php7.4-fpm.sock
    encode gzip zstd

    header {
        Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
        X-Frame-Options "SAMEORIGIN"
        X-XSS-Protection "1; mode=block"
        X-Content-Type-Options "nosniff"
    }

    @static {
        file
        path *.ico *.css *.js *.gif *.jpg *.jpeg *.png *.svg *.woff *.woff2 *.json
    }
    header @static Cache-Control max-age=5184000

    log {
        output file /var/log/caddy/access.log {
            roll_size 2MiB
            roll_keep 100
            roll_keep_for 1440h
        }
    }
}
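As a side note, the retention values in this snippet line up: `Cache-Control max-age=5184000` is 60 days expressed in seconds, and `roll_keep_for 1440h` is the same 60 days expressed in hours. A quick check:

```python
# max-age is expressed in seconds, roll_keep_for in hours;
# both work out to the same 60-day retention window.
max_age_days = 5184000 / (24 * 3600)
roll_keep_days = 1440 / 24
print(max_age_days, roll_keep_days)  # 60.0 60.0
```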

my.hypershapes.com {
    import root hypershapes/public
    import baseSetup 
}

affiliate.hypershapes.com {
    import root hypershapes/public
    import baseSetup
}

admin.hypershapes.com {
    import root hypershapes-master-admin/dist
    import baseSetup
}

*.hypershapes.com {
    import root hypershapes/public
    import baseSetup

    tls {
        dns cloudflare {REDACTED}    
    }
}

https:// {
    import root hypershapes/public
    import baseSetup

    tls {
        on_demand
    }
}

3. The problem I’m having:

All the websites hosted with my Caddy web server stop responding, even though the status of the caddy service shows active (running).

Status of the caddy service when the websites are not responding:

The result of curl -v -I against one of my websites:


No response is returned even after a long wait.

This issue happens only on my production server and never on my staging server, even though their Caddyfile configurations are identical. It usually happens anywhere from once per week to a few times a day, and so far it can only be resolved by running

sudo systemctl restart caddy

4. Error messages and/or full log output:

Journalctl log:

I found that this error message keeps appearing in my log; is it related to the problem I'm facing?

May 12 07:15:56 hyper-prod-ubuntu-sgp1-01 caddy[25352]: {"level":"error","ts":1620803756.6630805,"logger":"http.handlers.reverse_proxy","msg":"aborting with incomplete response","error":"http: wrote more than the declared Content-Length"}
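For what it's worth, this error means the upstream (here, PHP-FPM behind `php_fastcgi`) sent a `Content-Length` header and then wrote more body bytes than it declared, so Go's HTTP layer aborts the response mid-stream. The consistency rule being violated, sketched in Python (the values are hypothetical):

```python
# A response is only valid if the declared Content-Length matches the
# number of body bytes actually written; writing more than declared is
# what triggers "wrote more than the declared Content-Length".
declared_length = 5                  # hypothetical wrong header value
body = b"Hello, world!"              # 13 bytes actually written

def length_consistent(declared: int, payload: bytes) -> bool:
    return declared == len(payload)

print(length_consistent(declared_length, body))  # False -> aborted
```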

5. What I already tried:

sudo systemctl restart caddy

6. Links to relevant resources:

  1. full journalctl of that day:
    https://www.notion.so/journalctl-u-caddy-since-2021-05-13-13734ed935df4419896244a11f9c8c29

  2. htop output
    https://drive.google.com/file/d/1En239Tg0t3Ox0q2p-WIS0QZ8MBs7qPzt/view?usp=sharing

Please upgrade to v2.4.0 stable! It was just released earlier this week, with plenty of fixes.

That’s pretty weird. That suggests a bug in your upstream server possibly :thinking:

Please try again with v2.4.0 to see if it works any better, and if not, @matt might need to chime in with ideas for debugging this.

Hi.

When I tried to upgrade Caddy to v2.4.0 stable with caddy upgrade, errors occurred, and the caddy service's status is now failed. (I ran it as the root user.)

Status of caddy service

Journalctl logs

Any idea what has happened?

Yeah, there’s a known issue with caddy upgrade that’s already been fixed and will ship with v2.4.1: it doesn’t preserve the permissions on the binary correctly when it swaps it out.

You’ll need to change the permissions for /usr/bin/caddy back to 755 by hand for now.
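For reference, mode 755 means read/write/execute for the owner and read/execute for group and others, which can be double-checked with Python's stdlib:

```python
import stat

# 0o755 = rwx for the owner, r-x for group and others; S_IFREG marks
# it as a regular file so filemode() renders the full mode string.
print(stat.filemode(0o755 | stat.S_IFREG))  # -rwxr-xr-x
```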

After manually changing the permissions of /usr/bin/caddy to 755, the caddy service still fails, eh.

Here are all the steps I ran:

  1. caddy upgrade
  2. sudo chmod 755 /usr/bin/caddy
  3. sudo systemctl daemon-reload

The status of the caddy service now:

Did I execute anything wrongly here? Any help is appreciated ^____^

Update:
My websites are working even though the caddy service status is failed.

That means you must have a 2nd instance of Caddy running also using port 2019. Make sure to kill off any of them, then restart the systemd service.
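One quick way to confirm whether something is already holding Caddy's default admin port (2019 on localhost) is a plain TCP connect check; a stdlib-only sketch in Python (tools like `ss -ltnp` or `lsof -i :2019` will additionally show the owning PID):

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        return s.connect_ex((host, port)) == 0

# Caddy's admin API binds localhost:2019 by default; a stray instance
# holding this port prevents the systemd-managed one from starting.
print(port_in_use(2019))
```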

As far as I can tell, only one instance of Caddy is using port 2019.

What can I do to get the status of the Caddy service back to active (running)? Do I need to kill this caddy instance and restart the service?

Yes, exactly. Kill that one, because it’s not managed by systemd, then restart the systemd service.

Hi.

After killing the process, the status is now active (running).

Thank you for your help.

