Mv clean overlaps
- merged to "master" August 30, 2013
- development started August 9, 2013
Riak has an Erlang-to-leveldb interface layer called eleveldb, which has its own unit tests. One very temperamental test is cachetest.erl: it has failed regularly in continuous integration while working fine on developer platforms "forever".
Effort was made to get the test running again and achieve green status in continuous integration. That work led to the discovery of a true cache memory leak introduced in leveldb for Riak 1.4. The eleveldb test was already failing before the leak was introduced, which shows that bad unit tests can hide real problems.
The leveldb issue relates to code added in Riak 1.4 to keep level-0 and level-1 .sst files from ever being flushed from the file cache. This new code added an extra reference to the file objects of level-0 and level-1. That reference remained in place while the database was closing, so the objects were never deleted: a memory leak.
Additional code was added to the destructor of the VersionSet object (db/version_set.cc). This code walks the overlapped files in the current version and manually evicts them from the cache.
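The leak and the cleanup described above can be sketched with a toy model. This is only an illustration under stated assumptions, not the actual leveldb code: `ToyFileCache`, `ToyVersionSet`, `Entry`, and `LeakedAfterClose` are hypothetical names, and the real cache and VersionSet are considerably more involved. The sketch assumes each cached file carries one reference owned by the cache, plus one extra "pin" reference when the file lives in level-0 or level-1.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical cache entry: a reference count plus a flag for the extra pin.
struct Entry {
    int refs;
    bool pinned;
};

// Toy stand-in for the file cache (not the real leveldb cache API).
class ToyFileCache {
public:
    // The cache always holds one reference; pinned files get one more.
    void Open(int file_number, bool pin) {
        Entry e{1, pin};
        if (pin) ++e.refs;
        entries_[file_number] = e;
    }

    // Manual eviction: release the pin (if any) and the cache's own
    // reference, then forget the entry. The object is now fully released.
    void Evict(int file_number) {
        auto it = entries_.find(file_number);
        if (it == entries_.end()) return;
        if (it->second.pinned) --it->second.refs;
        --it->second.refs;
        entries_.erase(it);
    }

    // Simulated database close: the cache drops its own reference on each
    // remaining entry. Entries that still have references would never be
    // deleted, so they are counted as leaked.
    int Close() {
        int leaked = 0;
        for (auto& kv : entries_) {
            if (--kv.second.refs > 0) ++leaked;
        }
        entries_.clear();
        return leaked;
    }

private:
    std::map<int, Entry> entries_;
};

// Toy stand-in for VersionSet: remembers which files are overlapped
// (level-0 and level-1) and optionally evicts them in its destructor.
class ToyVersionSet {
public:
    ToyVersionSet(ToyFileCache* cache, bool evict_in_destructor)
        : cache_(cache), evict_in_destructor_(evict_in_destructor) {}

    ~ToyVersionSet() {
        if (!evict_in_destructor_) return;
        for (int file_number : overlapped_) cache_->Evict(file_number);
    }

    void AddFile(int file_number, int level) {
        bool overlapped = level < 2;
        cache_->Open(file_number, overlapped);
        if (overlapped) overlapped_.push_back(file_number);
    }

private:
    ToyFileCache* cache_;
    bool evict_in_destructor_;
    std::vector<int> overlapped_;
};

// Opens one file each in levels 0, 1, and 2, destroys the version set,
// closes the cache, and returns the number of leaked cache objects.
int LeakedAfterClose(bool evict_in_destructor) {
    ToyFileCache cache;
    {
        ToyVersionSet versions(&cache, evict_in_destructor);
        versions.AddFile(1, 0);
        versions.AddFile(2, 1);
        versions.AddFile(3, 2);
    }
    return cache.Close();
}
```

In this model, `LeakedAfterClose(false)` reports the two pinned files as leaked, while `LeakedAfterClose(true)` reports none, because the destructor released the extra references before the cache closed.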
There is a suspected race condition where some overlapped files might still be held by iterator sequences or prior versions. Those files would still leak a cache object. Future unit tests and better application of malloc tools within the Riak environment are required to prove, and then fix, these cases.