Strange gzip behavior

I’m seeing odd behavior when proxying to a service: when calling the service directly on its port, everything is hunky-dory. When accessing it through Caddy, the body is gzipped. This only happens on one proxied service.

I do have the gzip module installed. It is not, however, enabled in the config file:

slyd% grep -i gzip caddy/Caddyfile
slyd%

The service is returning HTML with the following headers:

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Vary: Accept-Encoding
Date: Fri, 13 Jan 2017 03:35:14 GMT
Content-Length: 892

Caddy is returning a gzip’d body with the following headers:

HTTP/1.1 200 OK
Content-Length: 421
Content-Type: text/html; charset=utf-8
Date: Fri, 13 Jan 2017 03:54:03 GMT
Vary: Accept-Encoding

Other services do not set the Vary header, and are not exhibiting this issue; it’s the only difference between the services I can detect. A couple of other things I notice:

  • If I call Caddy with Accept: text/html it returns an uncompressed body
  • If I call the service with Accept-Encoding: gzip it returns a compressed body
  • The service does not set the Content-Encoding: gzip header in the latter case (which I believe is wrong)

So I think there is a bug in the service. However, why is Caddy interfering with the HTTP request from the client, inserting a request for gzipped content when no such request came from the client? Going back to the start: when I curl the service directly and don’t specify gzip, I get HTML. When I curl Caddy and don’t specify gzip, it injects a request to the service for gzip. Is this acceptable behavior? Should Caddy be injecting headers that change the content encoding when it is proxying requests?

Thanks

Do you have any .gz files in your site (as in, have you pre-compressed your site’s static files)? What’s your Caddyfile?

I have not compressed any assets. Some of the proxied services may be serving compressed assets. My site is in this web directory, and there are no gz files in it:

slyd% find web -name \*.gz
slyd% 

The site that’s having the issue is the realize.slyd.space vhost (the last host in the Caddyfile); the vhost is entirely proxied to a service. The service will compress the response if asked to via Accept-Encoding, but doesn’t set the Content-Encoding: gzip header as it should. Since it only compresses when asked, I think Caddy shouldn’t be inserting an Accept-Encoding header that the client didn’t send.

Sorry if I’m missing the attachment button. Here’s a link to the Caddyfile; the link is good until 2017-01-15.

Thanks, I snagged a copy of it and took a look – it’s very helpful to have that. (Maybe new users don’t have permission to post attachments, sorry about that.)

Yeah, I agree. Nowhere in the Caddy code base does it add/set the Accept-Encoding header (except implicitly when proxy copies the headers to the upstream). Have you observed the headers of the upstream request between Caddy and the service to verify the claim that Caddy is modifying the headers by setting Accept-Encoding when the client did not originally set it?

I hadn’t, but here are the results.

$ sudo tcpflow -p -c -i lo port 6981
tcpflow[18444]: listening on lo
127.000.000.001.56445-127.000.000.001.06981: GET / HTTP/1.1
Host: realize.slyd.space
User-Agent: curl/7.26.0
Accept: */*
Authorization: Basic *****************************************
X-Forwarded-For: 198.15.119.101
X-Forwarded-Proto: https
X-Real-Ip: 198.15.119.101
Accept-Encoding: gzip


127.000.000.001.06981-127.000.000.001.56445: HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Vary: Accept-Encoding
Date: Fri, 13 Jan 2017 14:07:12 GMT
Content-Length: 421

......n.....Kk.@.......Z....Y4.!..!...!..Q..7...-.7.>H.Y.o .r.T.t.%j........|T

I truncated the gzip output, and obfuscated the credentials. This is in response to a curl from localhost, but on the domain IP (so as to trigger the vhost in Caddy):

curl https://CREDS@realize.slyd.space/

And the same thing with curl directly to the service:

127.000.000.001.56988-127.000.000.001.06981: GET / HTTP/1.1
User-Agent: curl/7.26.0
Host: localhost:6981
Accept: */*


127.000.000.001.06981-127.000.000.001.56988: HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Vary: Accept-Encoding
Date: Fri, 13 Jan 2017 14:17:08 GMT
Content-Length: 892

<!DOCTYPE html><html ng-app="reali

Output truncated again. Caddy is adding several headers, most of which are desirable for proxy. Again, the service is not correctly setting the Content-Encoding header, but that’s maybe a blessing in disguise or I wouldn’t have noticed this.

If you’re not doing it, and I’m not doing it, then who…?
looks nervously over shoulder at hockey-mask-guy

That’s weird. I’m seeing the same thing here.

But it definitely isn’t Caddy. I printed the original request headers and the headers of the request going upstream on the very line before RoundTrip():

fmt.Printf("HEADERS GOING UPSTREAM: %+v\n", outreq.Header)
res, err := transport.RoundTrip(outreq)
if err != nil {
	return err
}

and the output is (this is a fmt.Printf call):

HEADERS FROM CLIENT:    map[X-Forwarded-For:[::1] User-Agent:[curl/7.51.0] Accept:[*/*]]
HEADERS GOING UPSTREAM: map[X-Forwarded-For:[::1] User-Agent:[curl/7.51.0] Accept:[*/*]]

I just did a search on golang.org for “Accept-Encoding” and found the following result:

The net/http package documentation on pkg.go.dev (look at the DisableCompression field on http.Transport).

I had forgotten about this, and I don’t think this is a bug in either the standard library or the way Caddy is using it. Ultimately, your backend should be setting the Content-Encoding header properly. Simply setting Accept-Encoding on a backend proxy request shouldn’t hurt anything.

I don’t know if this question belongs here too, but it was the closest thread I found.
I’m poking around with a really simple setup for local development.

caddy 0.9.4 on mac.

caddyfile:

0.0.0.0
root srv
ext .html
gzip
templates
tls .certs/local.crt .certs/local.key
log stdout
errors stdout

calling https://localhost:2015/ in the browser:

  • with gzip enabled, the index.html gets downloaded to the local filesystem by the browser.

  • without, the index.html is correctly shown in the browser.

in both cases curl -k https://localhost:2015 shows the correct index.html

Of course I want to display the file, not download it.

gzip {
  ext .js .css
}

would be an option, but is this intended?

Maybe I’m just on the wrong track,
but I’m wondering

thx in advance
achim

I read the Transport documentation; interesting. So the defined behavior of the Transport should be transparent to the caller. But because of a bug in the service, the Transport isn’t performing the decompression that it should be; this is a case of a Transport trying to be helpful and instead surfacing buggy behavior.

Agreed, it is not Caddy (unless Caddy is using its own transport, or a third-party transport), and agreed that the spec allows Transports to do this (even though, IMO, it’s a bad decision by the spec because of exactly this unexpected consequence).

Exactly. That’s how I understand it, anyway.

I think it’s good. It exposes a bug in your backend.

Well, it isn’t my service; it’s a third-party service. We can agree to disagree about the wisdom of “helpful” behavior that fundamentally changes the nature of upstream requests. Case in point: I wouldn’t care about the bug, and it would not affect me, if the Transport weren’t mangling my headers.

Again, this isn’t a Caddy problem, or a problem with my code or my config; I just have to deal with the fallout.
