Replace_Response not working as expected

1. Caddy version (caddy version):

Caddy 2.4.5 built for Windows using xcaddy with replace-response and format-encoder included.

2. How I run Caddy:

Caddy runs as a Windows service set up by nssm. Configuration changes are applied using caddy reload.

a. System environment:

Windows Server 2019

b. Command:

nssm start Caddy (nssm starts Caddy with the "run" argument)

c. Service/unit/compose file:

n/a

d. My complete Caddyfile or JSON config:

(not complete - but you really don’t need to see a dozen perfectly working sites)

{
	#debug
	email pwh@cassland.org
	order replace after encode
	grace_period 10s
}

ai.cassland.org {
	rewrite / /magnoliaPublic/ai/index.html
	uri /dms/* replace /dms/ /magnoliaPublic/dms/ai/ 1
	@nopublic not path /magnoliaPublic/ai/*
	uri @nopublic replace / /magnoliaPublic/ai/ 1
	encode gzip
	replace {
		`/magnoliaPublic/dms/ai/` `/dms/`
		`/magnoliaPublic/ai/` `/`
	}
	reverse_proxy http://localhost:8080

	log {
		output file .\Logs\AIaccess.log
		format formatted
	}
}

(The back-tick quoting is there to see if it made a difference; it didn’t.)

3. The problem I’m having:

I am reverse-proxying to a site running in Tomcat using a CMS called Magnolia (in an obsolete version). For architectural reasons any site in Magnolia is not at the server root, but under “/magnoliaPublic/” (there is a corresponding “/magnoliaAuthor/” for staging development). Furthermore, I am running multiple sites in Magnolia by adding another layer, so that the site root is (in the example I’m testing) actually at “/magnoliaPublic/ai/”. (There is an alternative substitution required which needs no separate discussion.) The URLs are published without this extra path element, and it is added in the Caddyfile using uri directives - this is working perfectly.

However, Magnolia generates internal links which include these additional path elements, so I need to remove the string “/magnoliaPublic/ai” from any internal link URL returned from Magnolia.

I have already had this running successfully for several years in Caddy v1 using the directives:

	filter rule {
		content_type text/html.*
		search_pattern /magnoliaPublic/dms/ai/
		replacement /dms/
	}
	filter rule {
		content_type text/html.*
		search_pattern /magnoliaPublic/ai/
		replacement /
	}
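
To illustrate the effect I expect, here is a minimal Python sketch of the rewriting those two rules perform on a response body. The sample HTML and the function name strip_magnolia_prefixes are made up for this example; only the two search/replacement pairs come from my configuration.

```python
# Hypothetical helper: applies the two search/replace pairs from the
# filter rules above to an HTML body. Not part of Caddy or Magnolia.
REPLACEMENTS = [
    ("/magnoliaPublic/dms/ai/", "/dms/"),
    ("/magnoliaPublic/ai/", "/"),
]


def strip_magnolia_prefixes(html: str) -> str:
    """Remove the internal Magnolia path prefixes from links in html."""
    for search, replacement in REPLACEMENTS:
        html = html.replace(search, replacement)
    return html


# Made-up sample of the kind of links Magnolia generates.
sample = (
    '<a href="/magnoliaPublic/ai/about.html">About</a>'
    '<img src="/magnoliaPublic/dms/ai/logo.png">'
)
cleaned = strip_magnolia_prefixes(sample)
```

With this sample, cleaned contains the links as they should reach the public, with no trace of the internal prefixes.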

However, the Caddy 2 equivalent in the Caddyfile above has no effect. That is to say, the reverse-proxying works fine, but the responses still contain the strings which the replace directive should have removed.

The site works with the strings remaining in, but I wish to hide them from the public users as they are an unnecessary complication (and expose information about the server I prefer to keep obscure).

4. Error messages and/or full log output:

Nothing pertinent is logged.

5. What I already tried:

I haven’t tried changing anything apart from adding quotes to the strings in the replace directive, because I cannot see any error in the configuration I have created. I have found another thread in this forum in which the same problem is raised, and that thread has no useful conclusion (the last post was saying that the suggestions given did not work).

6. Links to relevant resources:

Paul

Is your backend at http://localhost:8080 returning compressed content (i.e. check the Content-Encoding header)?
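
One way to check, as a rough sketch: request a page from the backend directly while advertising gzip support, then look at the Content-Encoding response header. The Python below uses only the standard library; the function names and the list of encodings are my own choices for illustration, and the path in the usage note is an assumption based on your rewrite rule.

```python
import urllib.request

# Content codings treated as "compressed" for this check (illustrative list).
COMPRESSED_ENCODINGS = {"gzip", "deflate", "br", "zstd", "compress"}


def is_compressed(content_encoding):
    """Return True if a Content-Encoding header value names a compression coding."""
    if not content_encoding:
        return False
    codings = [c.strip().lower() for c in content_encoding.split(",")]
    return any(c in COMPRESSED_ENCODINGS for c in codings)


def upstream_content_encoding(url):
    """Fetch url while advertising gzip support; return the Content-Encoding header or None."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request) as response:
        return response.headers.get("Content-Encoding")
```

For example, is_compressed(upstream_content_encoding("http://localhost:8080/magnoliaPublic/ai/index.html")) returning True would suggest the backend is compressing its HTML.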

If it is because the response is compressed, then you can tell the upstream not to compress it (if it properly supports content negotiation) with this:

reverse_proxy http://localhost:8080 {
	header_up Accept-Encoding: identity
}

Aha! You are right - upstream is compressed with gzip.

I recall that before I used Caddy, when I was proxying this through Apache the configuration included a directive to unzip the response body before processing it. The filter available for Caddy v1 must have included that code automatically.
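
That would explain what I am seeing: if replace receives the body exactly as upstream sent it, the strings it searches for never appear literally in the gzipped bytes. A small Python sketch of the effect, on a made-up sample body (this only illustrates the principle; I am not claiming it is how the replace-response plugin is implemented):

```python
import gzip

PATTERN = b"/magnoliaPublic/ai/"

# Made-up sample body, repeated so that gzip has something to compress.
body = b'<a href="/magnoliaPublic/ai/page.html">link</a>\n' * 20

compressed = gzip.compress(body)

# A plain byte-level search/replace on the compressed stream finds nothing,
# so the stream passes through unchanged.
untouched = compressed.replace(PATTERN, b"/")

# Decompressing first makes the same replacement effective.
fixed = gzip.decompress(compressed).replace(PATTERN, b"/")
```

Here untouched is identical to compressed, whereas fixed has the prefix stripped from every link.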

However, sending the upstream Magnolia an Accept-Encoding header doesn’t help, so I presume it is not recognised or honoured properly.

It took me a while, as there is no surviving documentation on how to turn off compression in a 12-year-old version of Magnolia, but I have now succeeded in removing compression from its HTML pages. After various false starts which broke the server, I happened on a list of content types to be compressed in the config for the cache module, so I edited text/html to text/plain, which did the job instantly (well, after the cache was cleared…).

And now the replace directive in Caddy is doing its job just fine, and the final obstacle to retiring Caddy v1 has fallen.

I suspect that this could be a common issue with replace - perhaps it would be worth considering adding decompression as part of the package?

Thanks,
Paul

Decompression is kinda expensive, but yeah it was talked about in this issue:

@matt, should probably add a note in the README about compression.

Oh crap I’m sorry, I gave you the wrong code snippet earlier. I included a : in there, which probably would have broken it :man_facepalming:

reverse_proxy http://localhost:8080 {
	header_up Accept-Encoding identity
}

You could try this I guess. But if you already turned off compression in the upstream, I suppose you don’t need to care anymore :sweat_smile:

Heh! Well, I could have looked it up myself to learn more about it.

Anyway, as you say, I’ve got a solution, and as far as I can see it has no downside. I’ll never make anything new in this Magnolia installation, and indeed, I am planning to replace it entirely as soon as I find time to rebuild the site in a more modern platform (Bludit is what I’m looking at).

Paul
