Got 502 with an error message `readLoopPeekFailLocked: <nil>`

1. Caddy version (caddy version):

$ caddy version
v2.3.0 h1:fnrqJLa3G5vfxcxmOH/+kJOcunPLhSBnjgIvjXV/QTA=

2. How I run Caddy:

a. System environment:

$ cat /proc/version
Linux version 4.19.36-12.al7.x86_64 (mockbuild@e69b16555.et15sqa) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC)) #1 SMP Wed May 15 22:19:28 CST 2019

b. Command:

c. Service/unit/compose file:

d. My complete Caddyfile or JSON config:

The whole config is very large, here is the simplified one with all essential information:

{
  "admin": {
    "config": {
      "persist": true
    }, 
    "listen": "0.0.0.0:2019"
  }, 
  "logging": {
    "logs": {
      "default": {
        "exclude": [
          "http.log.access.log0"
        ], 
        "writer": {
          "output": "file", 
          "roll_keep_days": 2, 
          "roll_size_mb": 500, 
          "roll_keep": 10, 
          "filename": "/data/logs/caddy/default.log"
        }, 
        "level": "INFO"
      }, 
      "log0": {
        "writer": {
          "output": "file", 
          "roll_keep_days": 2, 
          "roll_size_mb": 500, 
          "roll_keep": 10, 
          "filename": "/data/logs/caddy/access.log"
        }, 
        "include": [
          "http.log.access.log0"
        ], 
        "level": "INFO"
      }
    }
  }, 
  "apps": {
    "http": {
      "http_port": 80, 
      "https_port": 443, 
      "servers": {
        "srv0": {
          "routes": [
            {
              "handle": [
                {
                  "routes": [
                    {
                      "handle": [
                        {
                          "routes": [
                            {
                              "handle": [
                                {
                                  "rate": "25r/s", 
                                  "handler": "rate_limit", 
                                  "key": "{query.ent_id}"
                                }
                              ]
                            }, 
                            {
                              "handle": [
                                {
                                  "headers": {
                                    "response": {
                                      "add": {
                                        "Server": [
                                          "Backend"
                                        ]
                                      }
                                    }
                                  }, 
                                  "handler": "reverse_proxy", 
                                  "transport": {
                                    "dial_timeout": 5000000000, 
                                    "protocol": "http"
                                  }, 
                                  "upstreams": [
                                    {
                                      "dial": "10.0.7.127:80"
                                    }
                                  ]
                                }
                              ]
                            }
                          ], 
                          "handler": "subroute"
                        }
                      ], 
                      "match": [
                        {
                          "path": [
                            "/client/inputting", 
                          ], 
                          "host": [
                            "new-api.meiqia.com"
                          ]
                        }
                      ]
                    }, 
                  ], 
                  "handler": "subroute"
                }
              ]
            }
          ], 
          "automatic_https": {
            "disable": true
          }, 
          "logs": {
            "default_logger_name": "log0"
          }, 
          "listen": [
            ":6666"
          ]
        }
      }
    }
  }
}

3. The problem I’m having:

Got 502 with an error message readLoopPeekFailLocked: <nil>.

4. Error messages and/or full log output:

{"level":"error","ts":1631277306.4629822,"logger":"http.log.error.log0","msg":"readLoopPeekFailLocked: <nil>","request":{"remote_addr":"10.40.3.30:37528","proto":"HTTP/1.1","method":"POST","host":"new-api.meiqia.com","uri":"/client/inputting?ent_id=153626","headers":{"X-Log-Id":["0d46613b50fa7f220c88"],"X-Forwarded-For":["223.104.141.58, 47.111.193.136"],"Content-Length":["111"],"Accept":["application/json"],"Content-Type":["application/json"],"Origin":["https://yddkf.com"],"Referer":["https://yddkf.com/"],"Accept-Language":["zh-CN,zh;q=0.9,en-US;q=0.8,en;q=0.7"],"X-Forwarded-Proto":["https"],"Connection":["close"],"Accept-Encoding":["gzip, deflate"],"Sec-Fetch-Dest":["empty"],"Sec-Fetch-Mode":["cors"],"Sec-Fetch-Site":["cross-site"],"X-Forwarded-Famax":["famax-hz-famax-sandbox"],"X-Requested-With":["com.android.browser"],"X-Nginx-Proxy":["true"],"Wl-Proxy-Client-Ip":["223.104.141.58"],"User-Agent":["Mozilla/5.0 (Linux; Android 8.1.0; DUB-AL00 Build/HUAWEIDUB-AL00; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/81.0.4044.145 Mobile Safari/537.36 huaweioem (8.0.0.390)"]}},"duration":0.000877385,"status":502,"err_id":"dcq8gfgfs","err_trace":"reverseproxy.statusError (reverseproxy.go:783)"}

5. What I already tried:

I found that there is a related Golang issue, in which the possible reason was described as:

Specifically, there are some “hasty” HTTP servers out there that will send an HTTP response as soon as the TCP connection completes.

After capturing packets transferred between Caddy and the upstream backend, I got the packages as below:

The sequence of the requests is:

  1. The request POST /client/inputting?ent_id=292385 (package #4671)
  2. The failed request POST /client/inputting?ent_id=153626 (which is not actually captured here)
  3. The request POST /sdk/init_sdk_user (package #4683)

Obviously, the selected package #4674 is very suspicious since there is no corresponding request. By diving into this specific TCP connection (Caddy: 56902 <-> Upstream: 80), I got the full lifecycle of this keep-alive TCP session:

As we can see:

  1. At first, everything is ok with the typical TCP handshake and the subsequent request GET /conversation/.../messages_streams.
  2. The behavior becomes strange since the first TCP segment of a reassembled PDU (package #4367) and the subsequent HTTP/1.1 200 OK (package #4370).
  3. Finally, the selected package #4674 is just the HTTP response that may lead to the 502 error mentioned above.

6. Links to relevant resources:

Please upgrade to v2.4.5!

Where’s your Caddyfile?

Please complete the help topic template.

I actually think there might be enough here to go on. It also looks like it might be a Go issue? Will have to investigate more thoroughly when I’m not abroad.

Edit: lol, apparently I’ve seen that golang issue before, since I reacted with a thumbs up emoji to one of jsha’s replies. I totally don’t remember this at all… but it’s interesting that Let’s Encrypt is also experiencing this sometimes.

I think it’s likely to have been fixed since. There’s been a lot of changes to Caddy since v2.3.0. Best to upgrade and try again first.

Yes, I saw your thumbs up emoji there :wink:

One interesting thing is that we have deployed two types of backends behind Caddy:

  1. Client → Caddy → Go Backend
  2. Client → Caddy → Erlang Backend (Cowboy based)

But only HTTP requests in the 2nd case (i.e. Caddy <-> Erlang Backend) are encountering the readLoopPeekFailLocked: <nil> errors occasionally.

Thanks @francislavoie, I’ll try to make a plan for upgrading in the near future.

1 Like

Hi guys, I have fixed this issue! The reason is described here.

And this is the key packet that leads me to the solution:

Nice, good job figuring that out.

Quoting the summary here for others who search and find this:

TL;DR : An abnormal 204 response, which includes a body, will cause the readLoopPeekFailLocked: <nil> error.

The Erlang Backend is using Cowboy 2.7.0, which unfortunately allows developers to construct a > 204 response with a body. And the fact is that our Erlang developer does do this!

After correcting these abnormal 204 responses, all readLoopPeekFailLocked: <nil> errors disappear.

NOTE: Cowboy 2.8.0 has fixed this issue (commit).

2 Likes