Meta supervisor?

I have an idea for a plugin that I’ve been trying to write, but I may be missing some theoretical understanding of Caddy internals (or it may be something completely impossible or undesirable given the current architecture). Please enlighten me.

I want to write a plugin that behaves similarly to supervisor/exec: you pass a command line, it executes it and keeps the child process alive. The main difference is that it also sets an env var containing a URL. The child process must POST to that URL at least once with a JSON document that is a very simplified configuration (think of what php_fastcgi sets: index, files that should be proxied to it, etc.). The plugin reads this config and rebuilds the config for this route with the proper Caddy routing.
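To make the handshake concrete, here is a minimal stdlib-only sketch of the receiving side. The `Order` type, its JSON keys, and the `decodeOrder` / `orderEndpoint` names are all assumptions for illustration, not something Caddy provides; in a real plugin the handler would be mounted on whatever URL the env var advertises.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Order is a hypothetical shape for the simplified config the child POSTs.
type Order struct {
	Host     string   `json:"host"`      // where the child listens, e.g. "localhost:5555"
	TryFiles []string `json:"try_files"` // index candidates, e.g. "/index.md"
	Match    []string `json:"match"`     // path patterns the child wants, e.g. "*.md"
}

// decodeOrder parses one order and rejects it if the host is missing.
func decodeOrder(data []byte) (Order, error) {
	var o Order
	if err := json.Unmarshal(data, &o); err != nil {
		return Order{}, fmt.Errorf("decoding order: %w", err)
	}
	if o.Host == "" {
		return Order{}, fmt.Errorf("order is missing host")
	}
	return o, nil
}

// orderEndpoint accepts POSTed orders and hands each valid one to update.
func orderEndpoint(update func(Order)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		// Cap the body size so a misbehaving child cannot exhaust memory.
		data, err := io.ReadAll(http.MaxBytesReader(w, r.Body, 1<<20))
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		o, err := decodeOrder(data)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		update(o)
		w.WriteHeader(http.StatusNoContent)
	})
}

func main() {
	var current Order
	h := orderEndpoint(func(o Order) { current = o })

	// Simulate the child's POST without opening a socket.
	body := `{"host":"localhost:5555","try_files":["/index.md"],"match":["*.md"]}`
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/order", strings.NewReader(body)))
	fmt.Println(rec.Code, current.Host, current.Match)
}
```

Because the child may POST more than once, `update` simply replaces the previous order; a real handler would also need a mutex around the shared state.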

This would work as a sort of meta FrankenPHP / modern FastCGI with minimal configuration for the user, as the program is responsible for telling Caddy what it can/should handle. Like “I’m a generic Markdown Server that caches/executes files ending in .md, only send those requests to me”. I want those decisions to be dynamically changeable by the client process.

My initial attempt at implementing this: I created an empty HandlerDirective middleware (I tried to do it with a Directive instead, but I wasn’t able to keep the configuration around long enough to have a ctx). I set up an AdminRouter (which would serve the URL the child process sends to). I thought about using the Admin API itself to update the config.

The problem is a logical one: the initial config eventually gets replaced with the new one (as intended), which means that the only place where the configuration data would be available is on the old object (I assume I would then need a UsagePool to keep it around). This way the original Caddyfile works almost as a bootstrap step that then gets replaced by the “real configuration” that each child process built. This seems hacky and brittle.

Here’s my current code: GitHub - fserb/substrate

Is what I’m trying to do even possible? Does this direction make any sense? Is there a better way to do this?

Update: I also have been trying to do something similar to what caddy-docker-proxy does (i.e., just publish new Caddyfiles for whatever the supervised child sends), which seems a bit simpler.

One difficulty is that I’d still like the original Caddyfile to be used for defining the listen address and other Caddy rules. For example, imagine an original Caddyfile (where substrate is my plugin):

mysite.com {
   tls xxxx
   substrate /my/app --port 5555
   cache
}

and if /my/app says it’s able to handle .md files, I would create a tmp Caddyfile (like docker-proxy does) with:

mysite.com {
  tls xxxx

  @indexFiles file {
    try_files {path} {path}/index.md index.md
    split_path .md
  }
  rewrite @indexFiles {file_match.relative}

  @targetFiles path *.md
  reverse_proxy @targetFiles localhost:5555 

  cache   
}

But for that to work, I need to be able to capture the parent block when parsing the directive. Is this even possible? It seems that the Dispenser is always defined within the segment, which would make it impossible to access the parent rules. Am I missing something?
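Leaving the parent-block question aside, the generation step itself is simple string building. Below is a rough sketch that renders only the inner directives from the example above; `renderRoutes` and its two parameters are hypothetical names, and it assumes the child reports a single file extension and an upstream address.

```go
package main

import (
	"fmt"
	"strings"
)

// renderRoutes builds the Caddyfile directives for one supervised child.
// ext is the extension the child claims (e.g. ".md"); upstream is where it listens.
func renderRoutes(ext, upstream string) string {
	lines := []string{
		"@indexFiles file {",
		fmt.Sprintf("\ttry_files {path} {path}/index%s index%s", ext, ext),
		fmt.Sprintf("\tsplit_path %s", ext),
		"}",
		"rewrite @indexFiles {file_match.relative}",
		"",
		fmt.Sprintf("@targetFiles path *%s", ext),
		fmt.Sprintf("reverse_proxy @targetFiles %s", upstream),
	}
	return strings.Join(lines, "\n") + "\n"
}

func main() {
	// Print the fragment that would be spliced into the site block.
	fmt.Print(renderRoutes(".md", "localhost:5555"))
}
```

The output still has to be placed inside the right site block, which is exactly the part that needs access to the parent rules.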

Neat idea.

You’d probably want to write an app module, so that when any Caddy config loads with your app present in the config, it would spawn the processes that are configured. It could also open its own socket (maybe a unix socket would actually be best?) for the IPC.

Then you would also register a HTTP handler module that can handle HTTP requests. I don’t know if you really need to expose your internal routing logic to the Caddy config.

Yeah, you’ll quite likely find the UsagePool helpful.

Anyway, does that make sense? Might be a much simpler approach than trying to dynamically craft a Caddy config from within Caddy.
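For the process-spawning half, a minimal stdlib-only sketch of the keep-alive loop could look like the following. The `SUBSTRATE_URL` variable name, the example URL, and the `childEnv` / `supervise` helpers are assumptions for illustration; in an app module the loop would be started when the app starts and cancelled through its context when the app stops.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"os/exec"
	"os/signal"
	"time"
)

// childEnv returns a copy of base with the callback URL appended.
// SUBSTRATE_URL is a hypothetical variable name.
func childEnv(base []string, callbackURL string) []string {
	env := append([]string{}, base...)
	return append(env, "SUBSTRATE_URL="+callbackURL)
}

// supervise runs the command and restarts it whenever it exits,
// waiting backoff between attempts, until ctx is cancelled.
func supervise(ctx context.Context, backoff time.Duration, env []string, name string, args ...string) {
	for ctx.Err() == nil {
		cmd := exec.CommandContext(ctx, name, args...)
		cmd.Env = env
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Run(); err != nil && ctx.Err() == nil {
			log.Printf("child exited: %v", err)
		}
		// Sleep before restarting, but wake up early on cancellation.
		select {
		case <-ctx.Done():
		case <-time.After(backoff):
		}
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: supervise <command> [args...]")
		return
	}
	// Stop supervising on Ctrl-C.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
	defer stop()

	env := childEnv(os.Environ(), "http://localhost:2019/substrate/example")
	supervise(ctx, time.Second, env, os.Args[1], os.Args[2:]...)
}
```

A unix socket would slot in by advertising its path in the env var instead of an HTTP URL.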

It’s true, I don’t need to rebuild the config. The reason I suggested to is because I want to leverage other Caddy modules.

If I have my own middleware I could handle most logic by myself, but I definitely don’t want to do my own reverse proxy, for example. So how would I go around that?

Am I allowed to (or supposed to) call caddy.GetModule("http.handlers.reverse_proxy").New(), provision the result, and then call its ServeHTTP() myself inside my middleware?

This could work well, since my only real conditional is “do we call the reverse proxy or not for this”.

WDYT?
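To illustrate the shape of that single conditional, here is a sketch that uses Go’s standard `httputil.ReverseProxy` as a stand-in for Caddy’s reverse_proxy module (so this is not the Caddy API, just the same control flow). `shouldProxy` and `conditionalProxy` are hypothetical names, and suffix matching is an assumed simplification of whatever the child asks for.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"strings"
)

// shouldProxy is the only decision: does the child claim this path?
func shouldProxy(path string, suffixes []string) bool {
	for _, s := range suffixes {
		if strings.HasSuffix(path, s) {
			return true
		}
	}
	return false
}

// conditionalProxy sends claimed paths to upstream and everything else to next.
func conditionalProxy(upstream *url.URL, suffixes []string, next http.Handler) http.Handler {
	proxy := httputil.NewSingleHostReverseProxy(upstream)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if shouldProxy(r.URL.Path, suffixes) {
			proxy.ServeHTTP(w, r)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	// A throwaway server standing in for the supervised child.
	child := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "from child")
	}))
	defer child.Close()

	upstream, err := url.Parse(child.URL)
	if err != nil {
		panic(err)
	}
	rest := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "from the rest of the chain")
	})
	h := conditionalProxy(upstream, []string{".md"}, rest)

	for _, p := range []string{"/readme.md", "/style.css"} {
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, p, nil))
		fmt.Println(p, "->", rec.Body.String())
	}
}
```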

I managed to get the whole server / client process exchange working, and I’m able to pass info from the child process back into the request. Setting up the dynamic route isn’t working yet:

func (s *SubstrateHandler) UpdateOrder(order Order) {
	s.Order = order

	routes := caddyhttp.RouteList{}

	rewriteMatcherSet := caddy.ModuleMap{
		"file": s.JSON(fileserver.MatchFile{
			TryFiles: []string{
				"{http.request.uri.path}",
				"{http.request.uri.path}/index.md",
				"index.md"},
			TryPolicy: "first_exist_fallback",
		}),
	}

	rewriteHandler := rewrite.Rewrite{
		URI: "{http.matchers.file.relative}",
	}

	rewriteRoute := caddyhttp.Route{
		MatcherSetsRaw: []caddy.ModuleMap{rewriteMatcherSet},
		HandlersRaw:    []json.RawMessage{caddyconfig.JSONModuleObject(rewriteHandler, "handler", "rewrite", nil)},
	}

	routes = append(routes, rewriteRoute)
	s.route = routes
}

func (s SubstrateHandler) ServeHTTP(w http.ResponseWriter, r *http.Request, next caddyhttp.Handler) error {
	return s.route.Compile(next).ServeHTTP(w, r)
}

When I request /index.md it works, but / doesn’t. Ideas?

Anyway, not sure if I’m allowed/supposed to Compile() a RouteList that wasn’t in the config.

I tried a few things so far.

As I expected, I can’t just Compile a RouteList and make it work. :confused: Nor can I reasonably instantiate a rewrite module without a config. I also gave up on trying to exec a small subconfig “in parallel” to the main one. This would actually be cool and kinda reasonable (basically exposing caddy.run() and a way to stop a context that doesn’t mess with the main config).

I will now try to do it as a Directive that replaces itself with the final RouteList with placeholders. Except that I want to be able to expand a placeholder into multiple values (for file matching, for example), and that is not supported at all. So the solution I came up with is to create a bunch of placeholders {p1:} {p2:} {p…N:} and then replace the ones I want in the middleware. Not sure if the Matcher is going to complain about having empty placeholders, and it’s a bit ugly, but it might work.

It would be nice if placeholders could expand into multiple entries, especially for Matchers.

I got it to work with multiple placeholders, like:

func parseSubstrateDirective(h httpcaddyfile.Helper) ([]httpcaddyfile.ConfigValue, error) {
	routes := caddyhttp.RouteList{}

	substrateHandler := SubstrateHandler{}
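	// Surface Caddyfile parse errors instead of silently ignoring them.
	if err := substrateHandler.UnmarshalCaddyfile(h.Dispenser); err != nil {
		return nil, err
	}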
	substrateRoute := caddyhttp.Route{
		HandlersRaw: []json.RawMessage{caddyconfig.JSONModuleObject(substrateHandler, "handler", "substrate", nil)},
	}
	routes = append(routes, substrateRoute)

	files := []string{"{http.request.uri.path}"}
	for i := range 32 {
		files = append(files, fmt.Sprintf("{http.request.uri.path}{substrate.match_files.%d}", i))
	}
	rewriteMatcherSet := caddy.ModuleMap{
		"file": h.JSON(fileserver.MatchFile{
			TryFiles:  files,
			TryPolicy: "first_exist",
		}),
	}
	rewriteHandler := rewrite.Rewrite{
		URI: "{http.matchers.file.relative}",
	}
	rewriteRoute := caddyhttp.Route{
		MatcherSetsRaw: []caddy.ModuleMap{rewriteMatcherSet},
		HandlersRaw: []json.RawMessage{caddyconfig.JSONModuleObject(rewriteHandler,
			"handler", "rewrite", nil)},
	}
	routes = append(routes, rewriteRoute)

	paths := []string{}
	for i := range 32 {
		paths = append(paths, fmt.Sprintf("{substrate.match_path.%d}", i))
	}
	reverseProxyMatcherSet := caddy.ModuleMap{
		"path": h.JSON(paths),
	}
	reverseProxyHandler := reverseproxy.Handler{
		Upstreams: reverseproxy.UpstreamPool{
			&reverseproxy.Upstream{
				Dial: "{substrate.host}",
			},
		},
	}
	reverseProxyRoute := caddyhttp.Route{
		MatcherSetsRaw: []caddy.ModuleMap{reverseProxyMatcherSet},
		HandlersRaw: []json.RawMessage{caddyconfig.JSONModuleObject(reverseProxyHandler,
			"handler", "reverse_proxy", nil)},
	}
	routes = append(routes, reverseProxyRoute)

	return []httpcaddyfile.ConfigValue{
		{
			Class: "route",
			Value: caddyhttp.Subroute{Routes: routes},
		},
	}, nil
}

and at request time:

func (s SubstrateHandler) ServeHTTP(w http.ResponseWriter, r *http.Request, next caddyhttp.Handler) error {
	if s.Order == nil {
		http.Error(w, "Internal Server Error", http.StatusInternalServerError)
		s.log.Error("No order")
		return nil
	}

	repl := r.Context().Value(caddy.ReplacerCtxKey).(*caddy.Replacer)
	repl.Map(func(key string) (any, bool) {

		if key == "substrate.host" {
			return s.Order.Host, true
		}

		var outmap *[]string
		if strings.HasPrefix(key, "substrate.match_files.") {
			outmap = &s.Order.TryFiles
			key = strings.TrimPrefix(key, "substrate.match_files.")
		} else if strings.HasPrefix(key, "substrate.match_path.") {
			outmap = &s.Order.Match
			key = strings.TrimPrefix(key, "substrate.match_path.")
		}

		if outmap == nil {
			return nil, false
		}
		number, err := strconv.Atoi(key)

		if err != nil || number < 0 || number >= len(*outmap) {
			return nil, false
		}
		return (*outmap)[number], true
	})

	return next.ServeHTTP(w, r)
}

For the file matching, having multiple patterns is no overhead, because you’d only go through all of them when it’s a 404.

For the proxy match, it does have to go through all of them. I tried to replace it with a path_regexp, but it doesn’t seem to support placeholders for the patterns.
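The slot lookup inside `repl.Map` can also be pulled out into a plain function, which makes it testable without a running Caddy. This is a sketch under a few assumptions: the `Order` fields mirror the ones used above, `lookup` is a hypothetical name, and `strings.CutPrefix` (Go 1.20+) replaces the hard-coded slice offsets.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Order mirrors the fields the handler reads at request time.
type Order struct {
	Host     string
	TryFiles []string
	Match    []string
}

// lookup resolves one placeholder key against the current order.
// It has the same (value, found) shape as the function passed to repl.Map.
func lookup(o *Order, key string) (any, bool) {
	if key == "substrate.host" {
		return o.Host, true
	}

	var slots []string
	var index string
	if rest, ok := strings.CutPrefix(key, "substrate.match_files."); ok {
		slots, index = o.TryFiles, rest
	} else if rest, ok := strings.CutPrefix(key, "substrate.match_path."); ok {
		slots, index = o.Match, rest
	} else {
		return nil, false // not one of ours
	}

	// Slots beyond what the child supplied are reported as not found.
	n, err := strconv.Atoi(index)
	if err != nil || n < 0 || n >= len(slots) {
		return nil, false
	}
	return slots[n], true
}

func main() {
	o := &Order{
		Host:     "localhost:5555",
		TryFiles: []string{"/index.md"},
		Match:    []string{"*.md"},
	}
	for _, key := range []string{
		"substrate.host",
		"substrate.match_files.0",
		"substrate.match_path.0",
		"substrate.match_path.31",
	} {
		v, ok := lookup(o, key)
		fmt.Println(key, v, ok)
	}
}
```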

Yes!

Modules are designed to be loaded/provisioned (and CleanUp’ed!) any time, even during an HTTP request.

Ohhhh. I think I got it now.

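// Look up the registered reverse_proxy module and create a fresh instance.
mod, err := caddy.GetModule("http.handlers.reverse_proxy")
if err != nil {
	return err
}
handler := mod.New().(*reverseproxy.Handler)
// Provision needs at least one upstream; Dial is filled in per request.
handler.Upstreams = reverseproxy.UpstreamPool{
	&reverseproxy.Upstream{
		Dial: "",
	},
}
if err := handler.Provision(*s.ctx); err != nil {
	return err
}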

This works. It does feel a bit brittle. If I don’t create the empty Upstreams[0].Dial, then Provision() doesn’t do anything and the whole thing becomes a no-op when called inside ServeHTTP.

But like this, I can just set Dial during the request and everything works.

Thanks so much for the help. The final-ish code is here. And it does everything I set out to do in the first place.

I’ll write some docs when I have some time so other folks can use it.

That makes sense though, there’s not much to provision with a totally empty config.

But yeah! Congrats getting it working! I think that’s the way to do it. :+1: Thanks for cranking away at it and sharing it!