closing and reopening eleveldb may deadlock #71

uwiger · 2013-09-03T17:44:11Z

When testing a patched mnesia using eleveldb as a backend, I noticed that some test cases could hang forever. I believe the problem is as follows:

1. The test process creates a table (a leveldb database instance) and does some reads and writes.
2. The database is closed and deleted (eleveldb:destroy/2 followed by rm -rf ... just to be sure).
3. The same process reopens the database. In this particular case, the open() consistently hangs on an IO error.

The key is that the 'client' process reads from the database, i.e. using the Ref. If the Ref remains as garbage on the heap when mnesia is restarted (which triggers a lot of work, but not in the calling process), the Ref will not be freed, as the destructor isn't called until the GC clears out the last reference.

Calling erlang:garbage_collect() in the test process before restarting mnesia fixes the problem in this particular case (with luck, adding debug printouts can achieve the same thing by triggering the GC). But it's not safe to assume that the Ref will ever be completely freed by GC, as some processes may perform work and then idle forever without performing the final GC.
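
To make the mechanism concrete, here is a minimal C++ sketch of a lock whose release is tied to the destruction of the last reference. It is a toy model, not eleveldb's actual code: `DbHandle`, `open_db`, `g_locked_paths`, and the path are hypothetical names, and in this model a blocked reopen fails immediately rather than hanging.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for the per-directory lock held while a database is open.
static std::set<std::string> g_locked_paths;

// Hypothetical handle: takes the lock on construction and releases it
// only in the destructor, i.e. when the last reference goes away.
class DbHandle {
public:
    explicit DbHandle(std::string path) : path_(std::move(path)) {
        g_locked_paths.insert(path_);
    }
    ~DbHandle() { g_locked_paths.erase(path_); }

private:
    std::string path_;
};

// Returns nullptr if the path is still locked by a live handle.
std::shared_ptr<DbHandle> open_db(const std::string& path) {
    if (g_locked_paths.count(path) != 0) return nullptr;
    return std::make_shared<DbHandle>(path);
}

// Returns true if the reopen is blocked while a stale reference is alive
// and succeeds once that reference has been dropped.
bool stale_ref_blocks_reopen() {
    auto db = open_db("/tmp/demo");
    auto stale = db;  // copy left behind, like a Ref on an uncollected heap
    db.reset();       // the "close" drops one reference only
    bool blocked = (open_db("/tmp/demo") == nullptr);
    stale.reset();    // plays the role of erlang:garbage_collect()
    bool reopened = (open_db("/tmp/demo") != nullptr);
    return blocked && reopened;
}
```

Dropping `stale` is the analogue of forcing a GC in the test process: it works, but only if every holder of a reference can be made to let go.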

One idea is to let a worker thread call AwaitCloseAndDestructor() [1] right after InitiateCloseRequest() has been called, then have it remove the LevelDB env from the magic binary. I assume this would release the LevelDB lock entry?

[1] https://github.com/basho/eleveldb/blob/master/c_src/refobjects.cc#L137
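
The general shape of that idea, releasing the lock at close time instead of at destruction time, can be sketched in C++ as follows. This is a single-threaded toy model under stated assumptions, not the refobjects.cc implementation: `ClosableDb`, `open_closable`, `g_open_paths`, and the path are hypothetical, and the worker-thread hand-off is left out.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for the per-directory lock held while a database is open.
static std::set<std::string> g_open_paths;

// Hypothetical handle whose close() releases the lock right away,
// regardless of how many references to the handle are still alive.
class ClosableDb {
public:
    explicit ClosableDb(std::string path) : path_(std::move(path)), open_(true) {
        g_open_paths.insert(path_);
    }
    // Idempotent: a second call (including the one from the destructor) does nothing.
    void close() {
        if (open_) {
            g_open_paths.erase(path_);
            open_ = false;
        }
    }
    bool is_open() const { return open_; }
    ~ClosableDb() { close(); }

private:
    std::string path_;
    bool open_;
};

// Returns nullptr if the path is still locked by an open handle.
std::shared_ptr<ClosableDb> open_closable(const std::string& path) {
    if (g_open_paths.count(path) != 0) return nullptr;
    return std::make_shared<ClosableDb>(path);
}

// Returns true if the reopen succeeds even though a stale reference
// to the closed handle is still alive.
bool explicit_close_allows_reopen() {
    auto db = open_closable("/tmp/demo2");
    auto stale = db;  // lingering reference that is never collected in time
    db->close();      // lock is released here, not in the destructor
    db.reset();
    auto reopened = open_closable("/tmp/demo2");
    return reopened != nullptr && !stale->is_open();
}
```

Because the closed handle remembers that it no longer owns the lock, its eventual destruction cannot release the lock held by the reopened instance.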


matthewvon · 2013-10-27T15:16:06Z

What is the full text of the IO error?

uwiger · 2013-10-27T18:49:36Z

I don't think I have it anymore, but as I recall, it was the usual error you get when you try to open an instance that is already in use.

matthewvon · 2013-10-27T19:24:18Z

I assumed that, but was just trying to make sure.

eleveldb_close() is one of the routines our Erlang experts identified as still being "synchronous" and therefore needing to be reworked. I will attempt to address your concerns in the rework.

matthewvon · 2014-05-21T19:26:59Z

@uwiger The eleveldb mv-tuning7 branch, coupled with the leveldb mv-tuning7 branch, includes the long-awaited asynchronous close of the database and/or iterator. Give it a whirl if you have time and send me feedback.

matthewvon · 2014-08-15T19:26:53Z

This is believed fixed on the "develop" branch. Have you retried recently?

uwiger · 2014-08-15T21:36:57Z

From what I could tell when I was testing, it was ok.

evanmcc added this to the 2.1 milestone May 12, 2014

uwiger closed this as completed Aug 15, 2014
