Auto-optimize #93
Conversation
@@ -1419,6 +1419,16 @@ void Table::insert_done()
{
    ++m_size;

    // We will do an automatic optimize after 1000
This change interferes negatively with replication.
As is revealed by the following FIXME from "replication.hpp", the replication feature is implemented under the assumption that public modifying table methods will not call each other:
// FIXME: Be careful about the possibility of one modification
// functions being called by another where both do transaction
// logging.
(note, this is not limited to table methods only!)
The reason is that each public modifying method adds a corresponding instruction to the transaction log, and on the remote side, each instruction causes a call to the same public modifying method that generated it.
Therefore, if one public modifying method calls another one, the other one will end up being called twice on the remote side.
For example, if optimize() is called from the code below, it will add an 'optimize' instruction to the transaction log, followed by an 'insert_done' instruction. On the remote side, where the transaction log is applied, the 'optimize' operation is carried out first, then the 'insert_done'. The problem is that, since the 'insert_done' instruction causes a call to Table::insert_done(), the optimization process will run twice.
Although, by itself, this is not a big issue, we must strive to adhere to rules like the mentioned one (no blame intended, I know you didn't know). If we don't, code complexity will quickly rise to a level where nobody can maintain enough insight to spot an issue like this one.
The trivial solution in this case is to introduce a new function that actually carries out the optimization, but does not add anything to the transaction log. This new function is then called both from Table::optimize() and from Table::insert_done(). The only hard problem is finding a suitable name...
The pathological scenario might not be that unlikely, which makes me a bit uneasy about this way of doing it.
I believe a simple and good behaviour would be to prevent the automatic optimization from occurring more than once per table. The good part is that the worst-case performance impact is then limited in a simple, understandable way.
Unfortunately, I don't think this can be achieved without allocating an extra bit of information in the underlying array structure, for example a flag that records whether the optimization has already been carried out.
Alexander, do you have any other ideas of how to achieve 'once per table' behaviour?
Done reviewing.
Hmm, I just wondered if this would get triggered for sub-tables as well. That would not be intended behavior (yet).
@kspangsege @astigsen any news on the status of this one would be welcome as well. Have we considered other ways to smooth performance without developer input (e.g. automatic index creation)?
My considerations in this area are that I think it's dangerous or at least

// Brian

On Tue, Feb 4, 2014 at 7:01 AM, Tim Anglade <notifications@github.com> wrote:
Test FAILed.
Test FAILed.
@astigsen @bmunkholm Should this be closed or parked or…?
Abandoned
I have noticed that one of the things that confuses new users most is when and why they should call optimize(). So I have made a change that calls optimize() automatically after 1000 inserts. That should in most cases be enough for any patterns to be detectable.
There are some cases where this will cause unintended behavior. If you are working with a table where you constantly delete everything and then fill it up again, you might not want the optimize step every time (but who knows, it might still be useful, depending on what you are putting in). And if you use it as a stack, adding and popping from the bottom, you could get pathological behavior if you stayed around the 1000 mark, but that is probably a very rare edge case.
For the large majority of cases it will greatly simplify things for the user, and if it turns out to cause problems, we can add some code to detect that the table has already been optimized and skip doing it again.