
Memory leak after unloading the ODBCInst library after calling SQLGetPrivateProfileString #130

Open
matthew-wozniczka opened this issue Jan 12, 2023 · 10 comments


@matthew-wozniczka
Contributor

matthew-wozniczka commented Jan 12, 2023

If you dlopen the unixODBC ODBCInst library (built w/ --enable-inicaching=true), call SQLGetPrivateProfileString, and then unload it, it will leak the ini cache.
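For concreteness, here's roughly the pattern that triggers it (just a sketch; the library name and the ini section/entry are illustrative, not what we actually use):

```c
/* Minimal sketch of the load/call/unload cycle that leaks the ini cache. */
#include <dlfcn.h>
#include <stdio.h>

typedef int (*GetPrivProfStr)(const char *section, const char *entry,
                              const char *def, char *buf, int buflen,
                              const char *filename);

int main(void)
{
    void *lib = dlopen("libodbcinst.so.2", RTLD_NOW);  /* soname is an assumption */
    if (!lib) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    GetPrivProfStr getstr =
        (GetPrivProfStr) dlsym(lib, "SQLGetPrivateProfileString");
    if (getstr) {
        char value[256];
        /* First call populates the ini cache when --enable-inicaching is on */
        getstr("MyDSN", "Driver", "", value, sizeof(value), "odbc.ini");
    }

    /* dlclose() unmaps the library, but the cached ini data allocated by
     * SQLGetPrivateProfileString is never freed, so every cycle leaks it. */
    dlclose(lib);
    return 0;
}
```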

I noticed that there was a __clear_ini_cache function added a while back which could be called to prevent the leak, but it's not exported.

Could you perhaps add an atexit call somewhere to clean up the cache before the library is unloaded?

@lurcher
Owner

lurcher commented Jan 13, 2023 via email

@lurcher
Owner

lurcher commented Jan 13, 2023 via email

@matthew-wozniczka
Contributor Author

I'm a bit surprised you can call __clear_ini_cache from SQLAllocHandle; I guess the 'odbcinst' code is statically compiled into libodbc? My use case is loading/unloading libodbcinst directly; I assumed that libodbc dynamically linked against libodbcinst.

In any case, I don't think your change would help in my scenario. Wouldn't it make more sense to register the atexit handler from within the 'odbcinst' code so that it would get used in both scenarios (libodbc & libodbcinst)?

Also, I noticed you were calling atexit on every call to SQLAllocHandle; atexit isn't idempotent, so the handler will get registered (and run) once for every call to SQLAllocHandle. That in itself isn't too much of a problem (a bit inefficient, though), but the number of handlers atexit can register may be limited, so you could 'waste' slots needed by other libraries or the main application.

Another suggestion would be to check for atexit failing, and to then fail whatever function triggered it, so that you don't set the process up for a leak later on.
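Something along these lines (just a sketch; I'm assuming __clear_ini_cache takes no arguments and that pthreads is already available) would cover both points:

```c
#include <pthread.h>
#include <stdlib.h>

extern void __clear_ini_cache(void);   /* existing internal cleanup routine */

static int atexit_ok = 0;

static void register_cleanup(void)
{
    /* atexit() can fail (e.g. the handler table is full), so record the result */
    atexit_ok = (atexit(__clear_ini_cache) == 0);
}

/* Call this from whatever first touches the cache; pthread_once guarantees
 * the handler is registered exactly once, not once per SQLAllocHandle call. */
static int ensure_cleanup_registered(void)
{
    static pthread_once_t once = PTHREAD_ONCE_INIT;
    pthread_once(&once, register_cleanup);
    return atexit_ok;   /* caller can fail its own operation if this is 0 */
}
```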

And one more thing: even though I suggested using atexit, IIRC there are some Unix platforms where it doesn't really 'work' for shared libraries. Handlers registered from a library aren't run when the library is unloaded; they wait until the process is actually exiting, by which point the library may already have been unloaded, causing a segfault. So you might want to be more selective than calling it whenever it exists. (There might be some alternative to atexit which would work on those platforms, but I'm not sure of the details.)
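For what it's worth, one candidate on ELF platforms with GCC/Clang is a destructor function, which does run at dlclose() as well as at normal process exit. A rough sketch (again treating __clear_ini_cache as a no-argument function):

```c
/* Sketch of an unload hook for ELF platforms (GCC/Clang): a destructor in
 * the odbcinst code runs both when the library is dlclose()d and at normal
 * process exit, unlike atexit() on some systems. */
extern void __clear_ini_cache(void);

__attribute__((destructor))
static void odbcinst_unload_cleanup(void)
{
    __clear_ini_cache();
}
```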

@lurcher
Owner

lurcher commented Jan 13, 2023 via email

@matthew-wozniczka
Contributor Author

> So where would you suggest I call atexit() from and when?

I think the simplest thing, without other refactoring, would be to call it on the first call to _check_ini_cache.

> It only calls it on the allocation of the ENVIRONMENT Handle, so not used that often.

In a long-lived process, anything more often than 'once' is often enough to cause problems.

> TBH, all you have done is made me think I should unwind the changes and suggest that you either build your own version that does exactly what you want or just build without the ini pooling enabled.

I work on a framework used to build driver-manager-agnostic ODBC drivers, and the reason we load libodbcinst in the first place is so that we follow the in-use DM's behaviour for finding configuration information. So we're not really in a position to build the library ourselves (which one to load is configured by the user of the built driver), which is why I was hoping to fix the issue upstream, so that eventually, when users upgrade, they won't have the leak anymore. (We learned about this leak through a customer support issue.)

We can recommend that customers build with ini-caching disabled, but it's enabled by default, and most customers just use whatever is installed on their machine by the OS rather than building unixODBC themselves.

If by 'build your own version' you mean come up with a pull request for you, I'd be willing to do that.

@lurcher
Owner

lurcher commented Jan 13, 2023 via email

@matthew-wozniczka
Contributor Author

> Ok, but if it's unreliable when used in a shared object, exactly what is gained by making it generic? I guess it could have a configure test that would create and load a library and check if atexit was called, but it all seems very iffy.

I'm thinking it makes sense to use it on platforms where atexit has useful behaviour for shared libraries. I can do a bit of research to figure out which platforms those are; enabling it at least on the common platforms where it works would be a win.

Maybe an exported function could be added to the ODBCInst library that explicitly clears the cache, to be called explicitly by whoever loads it, to handle other platforms? This can kind of already be done by calling SQLWriteDSNToIni/SQLWritePrivateProfileString, since they call _check_ini_cache, but we don't want to actually modify any configuration from within a driver simply reading from it.

> Yes, I would agree with that, but given a finite ini file, the leaked memory for a long-lived process is not going to increase that much either.

The size of the leak isn't limited to the size of the data cached for any given ini file, because the leak is repeated for every load/unload cycle. In our code, we already take care to load it only once to limit the damage [originally we did so for every connection made], but the driver our code lives in may itself be loaded/unloaded an arbitrary number of times in a given process.

> Ok, with as much sympathy as I can muster at 22:49 on a Friday night, that sounds a bit like a "you problem". The solution is available in the standard build by not enabling the pooling. It might be simpler to just open the ini files yourself?

I think there's value in general (not just for me/my company/our customers) in preventing the leak; at the very least it saves people time investigating it when it's detected. For me, disabling the ini caching by default would be a good second-best resolution (compared to fixing the leak), since then it wouldn't affect the vast majority of users (though I worry a bit about how much CPU/IO could be wasted without it).

I'm not going to demand a fix; worst case, we might start loading the library from a subprocess (or re-implement/fork it like you suggested). But I think it makes sense.

> Happy to accept that as long as there is zero negative impact for users that are not your customers.

OK, I'll take a look into this soon and try to come up with something that shouldn't break anyone.

@matthew-wozniczka
Contributor Author

I haven't had time to work on this yet, but what do you think about my suggestion to add an exported function to the ODBCInst library that explicitly clears the cache? Given that atexit won't be portable, I think that's the only way to fully solve the problem?

@lurcher
Owner

lurcher commented Jan 27, 2023 via email

@matthew-wozniczka
Contributor Author

I just built unixODBC off of HEAD on Linux and tested with a change to call __clear_ini_cache, and it seems to resolve that leak, thanks.

I have two more comments/suggestions related to this:

  • Perhaps the name of the exported function should be something like uodbc_clear_ini_cache. Identifiers that begin with two underscores are reserved in C, and in our use case we're not really sure who built the ODBCInst library, so right now we're just blindly calling __clear_ini_cache if dlsym finds it (see the sketch after this list), which could be a problem if some other DM defined it differently. (Not that worried about this, but a coworker brought it up.)
  • The copy of the ini cache that's linked into libodbc is still getting leaked. Maybe it should be cleared when the last environment handle is freed? This doesn't directly affect us (we don't dynamically load it, and I doubt many/any of our customers do), but it would probably be good to fix.
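For reference, the caller-side lookup we'd do looks roughly like this (uodbc_clear_ini_cache is only the name I'm suggesting above; it doesn't exist today):

```c
#include <dlfcn.h>

/* Before dlclose(), look for a cache-clearing entry point and call it if
 * present. "uodbc_clear_ini_cache" is the suggested (hypothetical) name;
 * "__clear_ini_cache" is what current unixODBC builds export. */
static void clear_ini_cache_if_possible(void *libodbcinst)
{
    typedef void (*ClearFn)(void);

    ClearFn clear = (ClearFn) dlsym(libodbcinst, "uodbc_clear_ini_cache");
    if (!clear)
        clear = (ClearFn) dlsym(libodbcinst, "__clear_ini_cache");
    if (clear)
        clear();
}
```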
