Performance problems with new groups search/filter and large databases #2852
Comments
This should be fixed in the latest development version. For a very large groups tree, the filtering process still takes a moment and is not as fluid as I would like but at least it works now. Could you please check the build from http://builds.jabref.org/master/. Thanks! |
JabRef 4.0.0-dev--snapshot--2017-05-20--master--01e854829 I can confirm that the performance has improved drastically. A group search no longer takes a couple of minutes. However, as you say, the performance is still not very high for large databases. In my case it takes 14 sec after entering a search/filter term for the group to be displayed. If you switch between groups often, I think this is a bit too much. Thank you so much for your work! |
When we are done with optimizing the entry editor, the time has come to address the performance issues in the groups panel. I'll be so enthusiastic and add this to the 4.1 milestone. We'll see how it works out. |
That sounds awesome: Thank you very much! |
@AEgit To profile this, I need an example database of the size you mentioned. Can you point me to one? |
@halirutan I think @lenhard should still have my example database with 6,500 entries. My current one consists of over 12,000 entries, but I think the old example should be fine. Please let me know if you don't have access to that example database; in that case I'll send you a new example database via email (for various reasons I would prefer not to make the database available to the public, so I cannot upload it here). |
@AEgit Send me yours to my mail on GitHub. If I have time, I'll look into it and see if I can point out some hot spots we should look into. |
@halirutan Database has been sent. Thank you for your help! |
@AEgit @halirutan Sorry for being late. We have the prior file as well, but if you've already exchanged the new one, there's no reason to send it again, I guess. |
I guess I found the problem, and @lenhard or @tobiasdiez should be able to provide a fix easily. If I see this right, the flaw lies here:

    private boolean showNode(T t) {
        if (filter.get() == null) {
            return true;
        }
        if (filter.get().test(t)) {
            // Node is directly matched -> so show it
            return true;
        }
        // Are there children (or children of children...) that are matched? If yes we also need to show this node
        return childrenFactory.call(t).stream().anyMatch(this::showNode);
    }

What happens is that every node in the groups view decides for itself whether it is visible or not. To do this, intermediate nodes need to check whether any of their children are visible, because then they need to be visible as well, no matter whether they themselves match. With a nested group structure of, say, depth 3, this means that nodes in level 1 check themselves and all their children. Despite the fact that we have now already calculated the visibility information of all children, each node at level 2 will do the exact same work again.

To give an example: consider 1 root node with 3 children, and each of the children again with 3 children. This gives 13 nodes we need to check. What we do instead, in the worst case where no node matches directly, is this: the root tests itself and all 12 of its descendants (13 tests), each of the 3 nodes at level 2 tests itself and its 3 children (3 × 4 = 12 tests), and each of the 9 leaves tests itself (9 tests). That makes 34 node tests altogether. We need to remember the visibility of a node once we have calculated it for a particular search term. |
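The count of 34 can be reproduced with a small, self-contained sketch. It is not JabRef code: `Node`, `showAll`, and `buildTree` are hypothetical names made up for illustration, and the `showNode` here is a simplified copy of the method quoted above with a counter added and the `null` filter check left out. It assumes the worst case, in which no node matches the filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class NaiveFilterCount {
    // Hypothetical minimal tree node; not JabRef's group node class.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    // Number of node tests performed so far.
    static int tests = 0;

    // Simplified version of the showNode logic quoted above, with a counter.
    static boolean showNode(Node node, Predicate<Node> filter) {
        tests++;
        if (filter.test(node)) {
            return true;
        }
        return node.children.stream().anyMatch(child -> showNode(child, filter));
    }

    // Every node decides its own visibility, so showNode is started once per node.
    static void showAll(Node node, Predicate<Node> filter) {
        showNode(node, filter);
        for (Node child : node.children) {
            showAll(child, filter);
        }
    }

    // 1 root, 3 children, 3 grandchildren per child: 13 nodes.
    static Node buildTree() {
        Node root = new Node("root");
        for (int i = 0; i < 3; i++) {
            Node mid = new Node("g" + i);
            root.children.add(mid);
            for (int j = 0; j < 3; j++) {
                mid.children.add(new Node("g" + i + "." + j));
            }
        }
        return root;
    }

    static int countNaiveTests() {
        tests = 0;
        showAll(buildTree(), node -> false); // worst case: nothing matches
        return tests;
    }

    public static void main(String[] args) {
        System.out.println(countNaiveTests()); // 13 + 3 * 4 + 9 = 34
    }
}
```

With deeper trees the gap widens, because every node is retested once for each of its ancestors.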
I believe one crucial step, no matter if we can increase the performance of the group-search itself, is to unbind the search field from an instant search action. I have tried to do this, by replacing the direct
This works pretty well: the search now only starts 400 ms after the user has stopped typing. Every time the user hits a key in the search field, the timer is restarted. With this, editing the search field doesn't lag even with large databases. @AEgit I have pushed a new branch
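The restartable delay described here can be sketched in plain Java. This is not the code from the branch: `DebouncedSearch`, `onTextChanged`, and `demo` are hypothetical names, and a `ScheduledExecutorService` stands in for whatever timer the UI toolkit offers. In a JavaFX view, something like a `PauseTransition` that is restarted on every keystroke could play the same role.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class DebouncedSearch implements AutoCloseable {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final long delayMillis;
    private final Consumer<String> search;
    private ScheduledFuture<?> pending;

    DebouncedSearch(long delayMillis, Consumer<String> search) {
        this.delayMillis = delayMillis;
        this.search = search;
    }

    // Call on every change of the search field; restarts the timer.
    synchronized void onTextChanged(String text) {
        if (pending != null) {
            pending.cancel(false); // drop the search scheduled for the previous keystroke
        }
        pending = scheduler.schedule(() -> search.accept(text), delayMillis, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    // Simulates three quick keystrokes and returns the terms that were actually searched.
    static List<String> demo() {
        List<String> searched = new CopyOnWriteArrayList<>();
        DebouncedSearch debouncer = new DebouncedSearch(200, searched::add);
        debouncer.onTextChanged("g");
        debouncer.onTextChanged("gr");
        debouncer.onTextChanged("gro");
        try {
            Thread.sleep(600); // wait well past the delay
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            debouncer.close();
        }
        return searched;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Only the last of the three terms should reach the search callback. Note that the callback runs on the scheduler thread, so a real UI would still have to hand the result over to its UI thread.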
JabRef 4.1-dev--snapshot--2017-11-16--fix-GroupSearchPerformance--08d8a2ca0 @halirutan Thank you very much! This is exactly what I wanted to express in my convoluted comment: #2852 (comment) This makes the group search/filter much faster in terms of user experience. So, this is definitely an improvement! Note also that I'm running JabRef on a quite powerful machine (Core i7 4+4HT cores with 2.2 GHz, 16 GB RAM, SSD), so on other machines this might take a bit longer. Thank you for your help! EDIT: I also found another bug with the group filter, which is reported here: |
@AEgit I did not fix the overall complexity issue I pointed out in #2852 (comment). The recursive groups tree is implemented using a very general recursive structure that is used in other places too. Currently, I don't see how I can fix this on my own and be sure I don't break anything else. Sorry. |
@halirutan Yes, thank you very much! I just mentioned those issues, to make sure that other people following this thread would notice that - while much has been done to improve the performance - there still persist some issues (caused by the structural problem you mentioned). |
@halirutan Thanks for looking into this and implementing an improvement. I think you could go for a PR with your delayed filtering. Regarding the overall complexity problem, I had a look at this some time ago, but did not arrive at a proper solution. The underlying problem is that every node tries to decide its own visibility locally. But ultimately it is not a decision that can be made based on local knowledge, because it depends on its children. This means that the current solution has to be rewritten and the filtering needs to be triggered once and only once from the root of the group tree. This requires some major changes to the node structure and I am also unsure how to fix this without breaking something else. |
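As a rough illustration of filtering "once and only once from the root", the following sketch computes the visibility of all nodes in a single post-order pass. It is not a proposal for the actual JabRef node structure: `Node`, `computeVisibility`, and `visibleNames` are hypothetical names, and the result is simply collected in a map instead of being pushed into the view.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class SinglePassFilter {
    // Hypothetical minimal node type for illustration; not JabRef's group tree class.
    static final class Node {
        final String name;
        final List<Node> children;

        Node(String name, Node... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    // Post-order pass: children are decided first, then the node itself.
    // Each node is visited exactly once and tested against the filter at most once.
    static boolean computeVisibility(Node node, Predicate<Node> filter, Map<Node, Boolean> visible) {
        boolean anyChildVisible = false;
        for (Node child : node.children) {
            // Deliberately no short-circuit: every child needs its own entry in the map.
            anyChildVisible |= computeVisibility(child, filter, visible);
        }
        boolean show = anyChildVisible || filter.test(node);
        visible.put(node, show);
        return show;
    }

    // Convenience helper: sorted names of all nodes that would be shown.
    static List<String> visibleNames(Node root, Predicate<Node> filter) {
        Map<Node, Boolean> visible = new HashMap<>();
        computeVisibility(root, filter, visible);
        return visible.entrySet().stream()
                .filter(Map.Entry::getValue)
                .map(entry -> entry.getKey().name)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Node root = new Node("root",
                new Node("a", new Node("a1"), new Node("a2")),
                new Node("b", new Node("b1")));
        System.out.println(visibleNames(root, node -> node.name.equals("a2"))); // [a, a2, root]
    }
}
```

The hard part mentioned above is untouched by this sketch: the view would have to read the precomputed result instead of asking each node to decide locally.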
@lenhard Exactly. There is another solution which, unfortunately, is highly impractical as well: The node stores the result of the last run and knows the last search term. This, however, means that you need to cache the I haven't profiled it carefully, but my guess is that as it stands the implementation has memory problems when searching repeatedly in large groups. At least this is suggested by what @AEgit described with "after a couple of group searches/filter actions, the filtering becomes again slower". |
@AEgit Would you test this PR http://builds.jabref.org/fix-GroupSearchPerformance/ It should be considerably faster and it contains (a) a delay in the search field so that it waits until you finished typing and (b) a cache that prevents the described repeated calculation of visibility. |
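The idea behind such a cache can be sketched as follows. This is not the code in the PR: `CachedVisibility`, `Node`, and the plain substring match are assumptions made for illustration. The cache remembers each node's visibility for the current search term and is thrown away as soon as the term changes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CachedVisibility {
    // Hypothetical minimal tree node; not JabRef's group node class.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    private final Map<Node, Boolean> cache = new HashMap<>();
    private String cachedTerm; // search term the cached results belong to
    int tests = 0;             // number of nodes actually evaluated

    boolean showNode(Node node, String term) {
        if (!term.equals(cachedTerm)) {
            cache.clear(); // results for the old term are useless now
            cachedTerm = term;
        }
        Boolean known = cache.get(node);
        if (known != null) {
            return known;
        }
        tests++;
        boolean show = node.name.contains(term);
        for (Node child : node.children) {
            if (show) {
                break;
            }
            show = showNode(child, term);
        }
        cache.put(node, show);
        return show;
    }

    // Mimics the view: every node asks for its own visibility.
    void showAll(Node node, String term) {
        showNode(node, term);
        for (Node child : node.children) {
            showAll(child, term);
        }
    }

    // 1 root, 3 children, 3 grandchildren per child: 13 nodes.
    static Node buildTree() {
        Node root = new Node("root");
        for (int i = 0; i < 3; i++) {
            Node mid = new Node("g" + i);
            root.children.add(mid);
            for (int j = 0; j < 3; j++) {
                mid.children.add(new Node("g" + i + "." + j));
            }
        }
        return root;
    }

    public static void main(String[] args) {
        CachedVisibility view = new CachedVisibility();
        Node root = buildTree();
        view.showAll(root, "zzz");
        view.showAll(root, "zzz"); // second run is served from the cache
        System.out.println(view.tests); // 13
    }
}
```

The map holds a strong reference to every node it has seen, so a real implementation would need to think about when entries are released.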
@halirutan Thank you very much! The behaviour in: JabRef 4.1-dev--snapshot--2017-12-20--fix-GroupSearchPerformance--12aab7e26 is much much better! A nice Christmas present ;) |
@AEgit I changed a few things in the code. Could you please check whether it still works for you? Thanks. The link is the same http://builds.jabref.org/fix-GroupSearchPerformance/ but it's a newer version (in a few minutes). |
JabRef 4.1-dev--snapshot--2017-12-27--fix-GroupSearchPerformance--4246be769 @tobiasdiez Unfortunately the new version is not working as expected. While it still filters the groups, when a filtered group is selected, the associated entries are not updated accordingly in the main table view. This was not the case with the previous version: JabRef 4.1-dev--snapshot--2017-12-20--fix-GroupSearchPerformance--12aab7e26 |
Thanks for the feedback @AEgit. This bug should be fixed now as well. |
@AEgit Tobi fixed the bug and the branch is now merged. I haven't had the time to profile the current state, but since Tobi fixed the repeated creation of tree elements, it should be acceptable now. Anyway, great work testing JabRef with incredibly large databases so that we get a feeling for the bottlenecks. |
@tobiasdiez and @halirutan Thank you very much! I can confirm that this bug is fixed in: JabRef 4.2-dev--snapshot--2018-01-03--master--193bbbca6 The only remaining issue with the groups filtering is the one reported here: Thanks a lot for your help! |
JabRef 4.0.0-dev--snapshot--2017-05-18--master--018173ebd
Windows 10 10.0 amd64
Java 1.8.0_131
The newly implemented groups filter/search (#2588) exhibits massive performance problems when used on a large database (>10,000 entries, ~1,000 static groups).
This issue has been reported before (#2588 (comment), #2588 (comment)) but I have been advised to open a new ticket so that the issue is not forgotten.