Set up a reverse proxy for images to avoid sourcing images from remote hosts and to filter out SVG images, per CASA security requirements

1. The problem I’m having:

We’re trying to set up a reverse proxy for images so that images are not sourced directly from remote hosts, and to filter out SVG images in order to meet a CASA security requirement.

We want to do something like:

  • https://img.mydomain.io/domain.com/path/to/example.svg → 403
  • https://img.mydomain.io/www.gravatar.com/avatar/4462f80994d1518a3b30b324df958bd5?d=404&s=400 → reverse_proxy www.gravatar.com/avatar/4462f80994d1518a3b30b324df958bd5?d=404&s=400 → content

What we tried so far (Caddyfile below) is configuring Caddy to reverse proxy to https://www.gravatar.com, so that, for example, requesting https://img.mydomain.io/avatar/4462f80994d1518a3b30b324df958bd5?d=404&s=400 displays the image. However, the @png matcher, which checks whether Content-Type is image/png and should then block the image, didn’t work: the image was still displayed. I checked with curl -I https://img.mydomain.io/avatar/4462f80994d1518a3b30b324df958bd5?d=404&s=400 and the Content-Type header was indeed image/png. (Our actual goal is to block SVG; we are just testing with a PNG URL that is available.)

2. Caddy version:

v2.8.4

3. How I installed and ran Caddy:

Installed from the APT repository following the official Caddy installation documentation; we run it with systemd.

a. System environment:

Ubuntu 22.04 LTS, Linux kernel 5.15.0-124-generic, x86_64, systemd 249

b. Command:

systemctl start caddy

d. My complete Caddy config:

https://img.mydomain.io {
    route /avatar/* {
        @png {
            header Content-Type image/png
        }

        handle @png {
            respond "Access Denied" 403
        }

        reverse_proxy https://www.gravatar.com {
            header_up Host www.gravatar.com
            header_up X-Forwarded-Host {host}
        }
    }
}

Is the URL in your second example a mistake? It doesn’t seem to align with the rest of what you say. Do you really want to include the domain in the URL?

The header matcher in your @png block matches the request header, not the response header.

If you need to intercept the response to check its Content-Type, then you need to use handle_response inside of reverse_proxy.

img.mydomain.io {
	handle /avatar/* {
		reverse_proxy https://www.gravatar.com {
			header_up Host {upstream_hostport}

			@png header Content-Type image/png
			handle_response @png {
				error 403
			}
		}
	}

	handle {
		# Any path that isn't /avatar*
	}
}
  • Thank you @francislavoie, handle_response did indeed fix the display issue.

  • In fact, the URL https://img.mydomain.io/www.gravatar.com/avatar/4462f80994d1518a3b30b324df958bd5?d=404&s=400 → reverse_proxy www.gravatar.com/avatar/4462f80994d1518a3b30b324df958bd5?d=404&s=400 isn’t a mistake (my bad: the reverse_proxy target should be just the domain, not the full URI).

  • We retrieve images from many sources, not only Gravatar. What we want is to send a full URL, have Caddy dynamically reverse proxy to it, and block SVG images. The current configuration would force us to use only images from Gravatar.

  • Is it possible to send a full URL (https://img.mydomain.io/https://www.gravatar.com/avatar/4462f80994d1518a3b30b324df958bd5?d=404&s=400), extract the domain to reverse proxy to, and rewrite the URI from https://img.mydomain.io/https://www.gravatar.com/avatar/4462f80994d1518a3b30b324df958bd5?d=404&s=400 to https://img.mydomain.io/avatar/4462f80994d1518a3b30b324df958bd5?d=404&s=400?

This way it will be more dynamic if we use multiple image sources.

I appreciate any guidance :pray:

I would recommend not including the scheme because it’s awkward with the double slashes and such (can get collapsed to one slash), and Caddy’s reverse_proxy cannot be multi-transport (can’t do both HTTP and HTTPS at the same time) so best if you just stick to one transport always (HTTPS).

You can use the path_regexp matcher to extract the domain out, then uri strip_prefix to remove it from the path, then use the capture from the regexp as the upstream address and the remaining path is passed to the upstream. handle_response is the same.

@host path_regexp ^/([^/]*)/
handle @host {
	uri strip_prefix /{re.host.1}
	reverse_proxy {re.host.1} {
		header_up Host {upstream_hostport}
		transport http {
			tls
		}

		@svg header Content-Type image/svg+xml
		handle_response @svg {
			error 403
		}
	}
}

I’m not sure if the regexp is quite right, just typed it up quick but you can surely play around with this to get it to where you need it to be. Should match the start of the path, skip the leading /, capture everything except a slash (i.e. the domain), then ensure it’s followed by a slash.
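One way to play around with the regexp without involving any upstream is a throwaway site that only echoes what was captured. This is a rough sketch, not part of the original setup: the :8080 listener and the explicit regexp name host (the second argument to path_regexp) are assumptions made for local testing.

```caddyfile
# Throwaway debug site: echoes the captured host and the rewritten URI
# instead of proxying. Listener port and response texts are arbitrary.
:8080 {
	# Explicitly name the regexp "host" so {re.host.1} refers to it.
	@host path_regexp host ^/([^/]*)/
	handle @host {
		# Remove the leading "/<captured host>" segment from the path.
		uri strip_prefix /{re.host.1}
		respond "upstream={re.host.1} uri={uri}"
	}

	# Paths without a "/<host>/" prefix end up here.
	handle {
		respond "no host segment matched" 400
	}
}
```

Requesting a path such as /www.gravatar.com/avatar/abc against this listener with curl should echo the captured host and the remaining URI, which makes it easier to see what the regexp actually captures before wiring it into reverse_proxy.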

I’d also suggest having some kind of allow-list of domains, because users could surely still find a way to do bad things with other domains. Only blocking SVG is probably not enough, I assume. To do that, you can just change @host to something like:

@host {
	path_regexp ^/([^/]*)/
	path /www.gravatar.com/* /somewhere-else.com/* /blahblah.com/*
}

Non-matching requests would then fall through to your other handle with no matcher (where you can emit whatever error you want).
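Put together, the allow-list and the fall-through handle might look roughly like the sketch below. The 403 status and message in the fallback, the single allowed host, and the explicit regexp name host are illustrative assumptions, not something stated in the thread.

```caddyfile
img.mydomain.io {
	# Both conditions must hold: a "/<host>/" prefix exists AND
	# the path starts with one of the allowed hosts.
	@host {
		path_regexp host ^/([^/]*)/
		path /www.gravatar.com/*
	}
	handle @host {
		uri strip_prefix /{re.host.1}
		# Explicit :443, since an upstream without a port defaults to :80.
		reverse_proxy {re.host.1}:443 {
			header_up Host {upstream_hostport}
			transport http {
				tls
			}

			# Block SVG based on the upstream's response header.
			@svg header Content-Type image/svg+xml
			handle_response @svg {
				error 403
			}
		}
	}

	# Everything else (unknown host or no host segment) is rejected.
	handle {
		respond "Host not allowed" 403
	}
}
```

Because all matchers inside a named matcher block must match together, a request for a host that is not listed in path never reaches reverse_proxy, even though the regexp itself would match it.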


Thank you for your fast reply. With the regex ^/([^/]*)/ I get a 502 Bad Gateway error, so I tweaked the regex a little to capture https://www.gravatar.com, as follows: @host path_regexp ^.*\/(https:\/\/[^\/]+)(\/.*)
Full Caddyfile:

https://img.mydomain.io {
    @host path_regexp ^.*\/(https:\/\/[^\/]+)(\/.*)
    handle @host {
        uri strip_prefix /{re.host.1}
        reverse_proxy {re.host.1} {
            header_up Host {upstream_hostport}
            transport http {
                tls
            }

            @svg header Content-Type image/png
            handle_response @svg {
                error 403
            }
        }
    }
}

When I request https://img.mydomain.io/https://www.gravatar.com/avatar/4462f80994d1518a3b30b324df958bd5?d=404&s=400 I get a blank page, with no errors in the log or the console.
Log line:

{
  "level": "info",
  "timestamp": 1731059018.262781,
  "logger": "http.log.access.log3",
  "message": "NOP",
  "request": {
    "remote_ip": "172.71.186.192",
    "remote_port": "12118",
    "client_ip": "my_ip",
    "protocol": "HTTP/2.0",
    "method": "GET",
    "host": "img.mydomain.io",
    "uri": "/https://www.gravatar.io/avatar/4462f80994d1518a3b30b324df958bd5?d=404&s=400",
    "headers": {
      "Priority": ["u=0, i"],
      "Cache-Control": ["max-age=0"],
      "Sec-Fetch-Site": ["none"],
      "Accept-Language": ["en-US,en;q=0.9,fr-FR;q=0.8,fr;q=0.7,ar;q=0.6"],
      "Accept-Encoding": ["gzip, br"],
      "Sec-Fetch-Dest": ["document"],
      "Sec-Ch-Ua-Platform": ["\"Linux\""],
      "X-Forwarded-For": ["my_ip"],
      "X-Forwarded-Proto": ["https"],
      "Sec-Fetch-User": ["?1"],
      "Cf-Visitor": ["{\"scheme\":\"https\"}"],
      "Sec-Ch-Ua": ["\"Chromium\";v=\"130\", \"Google Chrome\";v=\"130\", \"Not?A_Brand\";v=\"99\""],
      "Upgrade-Insecure-Requests": ["1"],
      "User-Agent": ["Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36"],
      "Cf-Ipcountry": ["TN"],
      "Sec-Fetch-Mode": ["navigate"],
      "Cf-Connecting-Ip": ["my_ip"],
      "Cookie": ["REDACTED"],
      "Accept": ["text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7"],
      "Cdn-Loop": ["cloudflare; loops=1"],
      "Sec-Ch-Ua-Mobile": ["?0"],
      "Cf-Ray": ["8df49eaf9be90da8-MRS"]
    }
  },
  "tls": {
    "resumed": false,
    "version": 772,
    "cipher_suite": 4865,
    "protocol": "h2",
    "server_name": "img.mydomain.io"
  },
  "response": {
    "bytes_read": 0,
    "user_id": "",
    "duration": 0.000050187,
    "size": 0,
    "status": 0,
    "headers": {
      "Server": ["Caddy"],
      "Alt-Svc": ["h3=\":443\"; ma=2592000"]
    }
  }
}

I tried hardcoding some steps for debugging purposes, as follows:

    @host path /https://www.gravatar.com/*
    handle @host {
        uri strip_prefix /https://www.gravatar.com
        reverse_proxy https://www.gravatar.com {
            header_up Host {upstream_hostport}
            transport http {
                tls
            }

            @svg header Content-Type image/png
            handle_response @svg {
                error 403
            }
        }
    }

That worked and the image was displayed, so there is probably an issue with my regex.

I really suggest not keeping the scheme in the URL, like I wrote above. If you don’t have it then the regexp is simpler, and you’re not misled because no other scheme would be supported anyway. Whatever thing you have that builds these URLs should just not put the scheme. My regexp specifically only reads “up to the first /” and since https:// obviously has slashes, it doesn’t work.


We can remove the https:// scheme, but without it Caddy doesn’t seem to be able to communicate with the upstream server (www.gravatar.com).
I tried this simple configuration:

https://img.mydomain.io {
    reverse_proxy www.gravatar.com {
        header_up Host {upstream_hostport}
        transport http {
            tls
        }
    }
}

The response is a 502 Bad Gateway error :worried:.

When I force reverse_proxy https://www.gravatar.com, the content is displayed as expected.

Ah, cause you need to add :443 to the upstream address. The default is :80 if not specified. So just do {re.host.1}:443
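Applied to the simple test configuration above, that change would look roughly like the following sketch. Only the :443 suffix on the upstream address is new; everything else is taken from that configuration.

```caddyfile
https://img.mydomain.io {
	# Explicit port: an upstream given without a scheme or port
	# defaults to :80, which does not fit the TLS transport below.
	reverse_proxy www.gravatar.com:443 {
		header_up Host {upstream_hostport}
		transport http {
			tls
		}
	}
}
```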


Thank you very much, everything works perfectly now. Such an amazing community :pray:

Here is the final Caddyfile, in case someone else is trying to achieve the same thing:

https://img.mydomain.io {
    @host path_regexp ^/([^/]*)/
    handle @host {
        uri strip_prefix /{re.host.1}
        reverse_proxy {re.host.1}:443 {
            header_up Host {upstream_hostport}
            transport http {
                tls
            }

            @svg header Content-Type image/svg+xml
            handle_response @svg {
                error 403
            }
        }
    }
}
