Php fastcgi i/o timeout and too many open files

1. Caddy version (caddy version):

v2.0

2. How I run Caddy:

sudo caddy --config caddy.json

a. System environment:

ubuntu 18.04

b. Command:

sudo caddy run --config caddy.json

c. Service/unit/compose file:

d. My complete Caddyfile or JSON config:

{
  "logging": {
    "logs": {
      "default": {
        "exclude": [
          "http.log.access.log0"
        ]
      },
      "log0": {
        "writer": {
          "filename": "access_dom1.log",
          "output": "file"
        },
        "encoder": {
          "field": "common_log",
          "format": "single_field"
        },
        "include": [
          "http.log.access.log0"
        ]
      }
    }
  },
  "apps": {
    "http": {
      "servers": {
        "srv0": {
          "listen": [
            ":443"
          ],
          "idle_timeout": "200ms",
          "routes": [
            {
              "match": [
                {
                  "host": [
                    "example.net"
                  ]
                }
              ],
              "handle": [
                {
                  "handler": "subroute",
                  "routes": [
                    {
                      "handle": [
                        {
                          "handler": "vars",
                          "root": "/var/www/bench/bench/public"
                        }
                      ]
                    },
                    {
                      "handle": [
                        {
                          "handler": "static_response",
                          "headers": {
                            "Location": [
                              "{http.request.uri.path}/"
                            ]
                          },
                                                      "status_code": 308
                        }
                      ],
                      "match": [
                        {
                          "file": {
                            "try_files": [
                              "{http.request.uri.path}/index.php"
                            ]
                          },
                          "not": [
                            {
                              "path": [
                                "*/"
                              ]
                            }
                          ]
                        }
                      ]
                    },
                    {
                      "group": "group0",
                      "handle": [
                        {
                          "handler": "rewrite",
                          "uri": "{http.matchers.file.relative}"
                        }
                      ],
                      "match": [
                        {
                          "file": {
                            "try_files": [
                              "{http.request.uri.path}",
                              "{http.request.uri.path}/index.php",
                              "index.php"
                            ]
                          }
                        }
                      ]
                    },
                    {
                      "handle": [
                        {
                          "encodings": {
                            "gzip": {},
                            "zstd": {}
                          },
                          "handler": "encode"
                        }
                      ]
                    },
                    {
                      "handle": [
                        {
                          "handler": "reverse_proxy",
                          "transport": {
                            "protocol": "fastcgi",
                            "split_path": [
                              ".php"
                            ],
                            "read_timeout": "300ms",
                         "write_timeout": "300ms"
                          },
                          "upstreams": [
                            {
                              "dial": "unix//run/php/php7.3-fpm.sock"
                            }
                          ]
                        }
                      ],
                      "match": [
                        {
                          "path": [
                            "*.php"
                          ]
                        }
                      ]
                    }
                  ]
                }
              ],
              "terminal": true
            }
          ],
          "logs": {
            "logger_names": {
              "example.net": "log0"
            }
          }
        }
      }
    }
  }
}

3. The problem I’m having:

I’ve gotten everything set up to mostly resemble my nginx conf. I’m trying to do some load testing between nginx and caddy. I’m using locust for testing, nginx is able to get to 135 RPS (requests per second) with no issue. Caddy can get to about 55 RPS before throwing read unix @->/run/php/php7.3-fpm.sock: i/o timeout.

I’m assuming this is because of the timeout I set.
before i set the time out i was getting the error read unix @->/run/php/php7.3-fpm.sock: too many open files

When i do lsof | grep -e caddy i get hundreds (up to the limit i set for php) of

caddy     17476 17724             root 1011u     IPv6           13599486        0t0        TCP xxx.xxx.xxx.xx:https->xx-xxx-xxx-xxx-static.hfc.comcastbusiness.net:61893 (ESTABLISHED)

From what i’ve seen other users post caddy vs nginx shows they are fairly similar. If i can get this figured out this will be my final question :laughing:

4. Error messages and/or full log output:


2020/05/20 16:26:38.902 ERROR   http.log.error.log0     dialing backend: dial unix /run/php/php7.3-fpm.sock: socket: too many open files     {"request": {"method": "GET", "uri": "/", "proto": "HTTP/1.1", "remote_addr": "xxx.x.xxx.xx:55557", "host": "example.net", "headers": {"User-Agent": ["python-requests/2.23.0"], "Accept-Encoding": ["gzip, deflate"], "Accept": ["*/*"], "Connection": ["keep-alive"], "Cookie": ["XSRF-TOKEN=eyJpdiI6InVLUkFWV1VYZ1NLQW5oVHhEZXdCQ1E9PSIsInZhbHVlIjoianp1NzQzT21ZODRYZjVnbEkwalBRTVNQKzhrZTNDVEdKL3FPYXB5bDAvRXZrSDloSGk5dWNQTEFxS295QVFibSIsIm1hYyI6ImY4ODliYTIyNjg3MTRlODA5MmZhOWE2MjVhNThkYWUyYTFkNjM3ODhlNTVjMDFlNmZhOGRjZjZmYTYzMmRjZTMifQ%3D%3D; laravel_session=eyJpdiI6IjI4UzltdzN1ZVRPcEZHZnpUTUhYN1E9PSIsInZhbHVlIjoibDVJU2xpQlJzTXlBaEVqRmNXM1VtNWlOcUNobFc3Yk9nVUN6bHYzMEErYkZ0b3dzbDFQU2d4K3g1eHpJdjdSMCIsIm1hYyI6IjhiNTRkYThmYzZkYWM0ODExYzQ5OWNjMDgzZWYzMWZjY2VjYTlmODRkZjlkOWViMGY0N2IzNDVjYjgwNjY5N2YifQ%3D%3D"]}, "tls": {"resumed": false, "version": 772, "ciphersuite": 4865, "proto": "", "proto_mutual": true, "server_name": "example.net"}}, "duration": 0.001816113, "status": 502, "err_id": "69thix4hx", "err_trace": "reverseproxy.(*Handler).ServeHTTP (reverseproxy.go:380)"}

http.log.error.log0 read unix @->/run/php/php7.3-fpm.sock: i/o timeout {"request": {"method": "GET", "uri": "//", "proto": "HTTP/1.1", "remote_addr": "xx.xxx.xxx.xxx:52682", "host": "example.net", "headers": {"User-Agent": ["python-requests/2.23.0"], "Accept-Encoding": ["gzip, deflate"], "Accept": ["*/*"], "Connection": ["keep-alive"]}, "tls": {"resumed": false, "version": 772, "ciphersuite": 4865, "proto": "", "proto_mutual": true, "server_name": "example.net"}}, "duration": 0.300784001, "status": 502, "err_id": "r5sbn4tqe", "err_trace": "reverseproxy.(*Handler).ServeHTTP (reverseproxy.go:380)"}

5. What I already tried:

6. Links to relevant resources:

Is the system and php-fpm configured exactly the same when comparing Caddy and Nginx? Seems like the error is coming from php-fpm.

Are those logs from Caddy itself? I’ve not seen those before (I don’t see read unix in the Caddy codebase). Do the Caddy logs have anything relevant?

both nginx and caddy are running the same php-fpm. Both are installed on the same system to ensure any tweaking we did to php is the same for both web servers.

I’ve updated the error logs in my original post with what caddy output.

Thanks for the info – since you’re already pretty deep into figuring this out, would you be able to look at the source code and see why it is acting like this? That would be very helpful!

That error is coming from here, for reference:

The error is generated from the Go stdlib, i.e. net.DialContext

Any update on this?

I was hoping you would have one for us, since you’re able to reproduce it and we aren’t…

2 Likes

Quite simple

  1. Your ran up against your max open files limit for caddy user or which ever use runs PHP-FPM or both caddy and PHP-FPM services. Check your max open files limit for caddy, nginx and php-fpm process ids using below command where PROCESSID is your caddy, nginx and/or php-fpm process ids
prlimit -p PROCESSID

Example to get nginx master process id’s limits, shows NOFILE = 524288 for open files limits

prlimit -p $(pgrep nginx | head -n1)
RESOURCE   DESCRIPTION                                SOFT         HARD UNITS
AS         address space limit                   unlimited    unlimited bytes
CORE       max core file size                            0    unlimited blocks
CPU        CPU time                              unlimited    unlimited seconds
DATA       max data size                         unlimited    unlimited bytes
FSIZE      max file size                         unlimited    unlimited blocks
LOCKS      max number of file locks held         unlimited    unlimited 
MEMLOCK    max locked-in-memory address space 901943132160 901943132160 bytes
MSGQUEUE   max bytes in POSIX mqueues               819200       819200 bytes
NICE       max nice prio allowed to raise                0            0 
NOFILE     max number of open files                 524288       524288 
NPROC      max number of processes                  127803       127803 
RSS        max resident set size                 unlimited    unlimited pages
RTPRIO     max real-time priority                        0            0 
RTTIME     timeout for real-time tasks           unlimited    unlimited microsecs
SIGPENDING max number of pending signals            127803       127803 
STACK      max stack size                          8388608    unlimited bytes

for my PHP-FPM master process id, NOFILE = 262144

prlimit -p 38752
RESOURCE   DESCRIPTION                             SOFT      HARD UNITS
AS         address space limit                unlimited unlimited bytes
CORE       max core file size                         0 unlimited blocks
CPU        CPU time                           unlimited unlimited seconds
DATA       max data size                      unlimited unlimited bytes
FSIZE      max file size                      unlimited unlimited blocks
LOCKS      max number of file locks held      unlimited unlimited 
MEMLOCK    max locked-in-memory address space     65536     65536 bytes
MSGQUEUE   max bytes in POSIX mqueues            819200    819200 bytes
NICE       max nice prio allowed to raise             0         0 
NOFILE     max number of open files              262144    262144 
NPROC      max number of processes                16384     16384 
RSS        max resident set size              unlimited unlimited pages
RTPRIO     max real-time priority                     0         0 
RTTIME     timeout for real-time tasks        unlimited unlimited microsecs
SIGPENDING max number of pending signals         127803    127803 
STACK      max stack size                       8388608 unlimited bytes

If you’re on a systemd linux distro, you can follow same guide at Increase open-files-limit in MariaDB on CentOS 7 with systemd for increasing max open file limits for your caddy, nginx and php-fpm services files

  1. PHP-FPM using unix sockets does not scale for high traffic loads. Unix sockets have slightly faster latency repsonse times but doesn’t scale. If you want PHP-FPM to scale better switch back to using TCP listener i.e. PHP-FPM TCP port 9000 and then tune your Kernel TCP parameters + PHP-FPM fastcgi buffers and timeouts.
1 Like

Yes it doesn’t rule out Caddy though as to it’s default timeouts for interacting with PHP-FPM fastcgi. Nginx has relevant fastcgi buffer, buffer sizes and timeout values to tune this at Module ngx_http_fastcgi_module. Would be nice if Caddy exposed the same too.

i.e. Module ngx_http_fastcgi_module

Syntax: fastcgi_read_timeout time;
Default:
fastcgi_read_timeout 60s;
Context: http, server, location
Defines a timeout for reading a response from the FastCGI server. The timeout is set only between two successive read operations, not for the transmission of the whole response. If the FastCGI server does not transmit anything within this time, the connection is closed.

Syntax: fastcgi_send_timeout time;
Default:
fastcgi_send_timeout 60s;
Context: http, server, location
Sets a timeout for transmitting a request to the FastCGI server. The timeout is set only between two successive write operations, not for the transmission of the whole request. If the FastCGI server does not receive anything within this time, the connection is closed.

Caddy has those parameters too: Modules - Caddy Documentation

1 Like

nice is this one directly the one mentioned in above error ?

dial_timeout

The duration used to set a deadline when connecting to an upstream.

Duration can be an integer or a string. An integer is interpreted as nanoseconds. If a string, it is a Go time.Duration value such as 300ms , 1.5h , or 2h45m ; valid units are ns , us / µs , ms , s , m , and h .

When checking for open files. Caddy never closes the 1k+ open php files (for php-fpm). While when checking for open files with nginx they all closed almost instantly (as the php file was just a echo).

1 Like

Caddy does close them: caddy/modules/caddyhttp/reverseproxy/fastcgi/fastcgi.go at master · caddyserver/caddy · GitHub

If you’ve found a bug (which nobody else has reported, hmmm), then please submit a pull request to fix it! Thank you :slight_smile:

1 Like

This topic was automatically closed after 30 days. New replies are no longer allowed.