Maybe a module Match() method can be called after Cleanup()?

Hello,

I’m the developer of caddy-maxmind-geolocation, a simple plugin to match requests based on the origin IP.

I open the MaxMind DB file in the Provision() method and I close it in the Cleanup() method.

I got an issue saying that, after a Caddy reload, my plugin begins to fail because it tries to access a closed database.

Unfortunately I still haven’t been able to reproduce it: I reloaded Caddy lots of times but it’s still working. However, I double-checked my code and the upstream dependency I use: the file is closed only by the Close() method, which I call only on Cleanup().

So it seems that, for the affected users, the Match() method is called on a module instance after Cleanup() has been called, that, if I got the lifecycle right, should never happen.

So here are my questions:

  1. Is there some situation this can happen, which I didn’t understand? Maybe this was a bug that has been solved recently? I had a look at the open and recently closed issues but I couldn’t find anything.

  2. Is opening the file in Provision() the right way? Reading the documentation again, I see: “when in the provisioning phase, do not expect that the module will actually be used”. So maybe I’m not doing this right, but I don’t see a OkDudeTheModuleIsGoingToBeUsed() method. Should I open the file on the first call to Match() and, on every call to Match(), ensure it’s still open and re-open it if needed?

Thank you very much for your help!

1 Like

I think you’re doing it right. As long as having more than one instance of the module in memory at the same time doesn’t cause problems. Did you test for that?

Maybe it’s a problem with the underlying lib you’re using that doesn’t isolate instances of the DB properly.

2 Likes

I’d second what Francis said – modules will for a brief time be loaded into memory twice during config reloads (assuming the module is used in both the old and new configs).

But yeah, that sounds right overall.

If you can only have 1 instance of the DB open throughout the life of the whole server process, consider using a UsagePool:

https://pkg.go.dev/github.com/caddyserver/caddy/v2#UsagePool

Thank you for your suggestions. Actually the library seems to be doing nothing wrong: in the Provision() I call maxminddb.Open() that returns a *maxminddb.Reader that I store in the instance itself. Different instances should have different *maxminddb.Reader and everything should be fine.

However, I just noticed that the maxminddb.Reader struct has a private buffer proprerty that is a []byte, and I expected it to be a file handler so it seemed strange to me. Turns out that maxminddb.Open() opens a file descriptor to the database and then uses unix.Mmap() to create a memory map of it. I know nothing of memory maps but maybe they’re unsafe across different handlers of the same files? I’ll try digging into it.

Thank you for pointing me in this direction!

1 Like

Ah, wow. Yeah they are trying to limit the concurrency of the DB I think.

That’s fine, but in that case try using a UsagePool to share the handle across your module instances.

As an aside, since you’re implementing a matcher, I’d suggest you implement the caddyhttp.CELLibraryProducer interface like most of the vanilla matchers do now.

I think you can copy the way MatchQuery does it, i.e. use CELValueToMapStrList to pull out any config as a “JSON structure” and then load it into your module.

The benefit of doing this is your matcher will be usable inside of expression matchers, so you could do something like this:

@allow `remote_ip('127.0.0.1') || maxmind_geolocation({'db_path':'/path/to/file','allow_countries':['IT','FR']})`

My talk https://youtu.be/QSerOHpMjgY?t=639 covers why this is useful (boolean expressions aren’t otherwise possible in matchers)

1 Like

This topic was automatically closed after 30 days. New replies are no longer allowed.