-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed several performance issues and stuttering #339
Conversation
…minate some unnecessary work to improve performance
I found a few more opportunities, at this point the included test is running too fast to have any meaningful data so I bumped up the repeats to 800 and got
(might be a good idea to do a warmup in the tests, bumping up the repeats reduced averages overall) I think it's pretty tight at this point, the next meaningful step could be only searching inside visible area and then lazily adding new markers as the user navigates past the top/bottom marker, but that'd be more work than what I want to put into it. |
Hi @chylex, thanks for the PR, this is really solid work! I was able to reproduce the latency reduction on some texts, although the variance is still pretty high depending on the query and editor contents. I also added a lorem ipsum test in c8c6f9c and noticed an improvement, but did not notice a significant improvement on random text. Most of the optimizations make sense, although a more comprehensive set of performance tests are needed. Ideally, these would include a representative sample of random source code files. I am running These were the results I collected before merging this PR as of 6c19767:
These were the results I observed after merging this PR as of a38c670:
Assigning text by only considering the visible area has been suggested, although this would not be compatible with AceJump's current search functionality due to the tag assignment problem. In order to assign tags, we must ensure no tags could ever result in a collision, which is basically an ambiguous string elsewhere in the editor text. We do not currently test for tag collisions, and I did not check very carefully, but it looks like your PR does not break this assumption.
I have merged this was planning to release in
I could not reproduce this error. Please open an issue if you happen to encounter it again. |
Was trying the plugin out and immediately ran into severe performance issues, so I figured I'd give some optimizations a shot.
Here are my results with the
test tag latency
test:I verified that unit tests succeed (apart from
test pinyin selection
which had a compile error inmaster
, not sure if you know), and that the order of tags has not changed and basic uses of the plugin still work, but since I've only used it for a few minutes so far it's possible something subtle may have changed that I didn't catch.Rationale
In a few places I avoided recomputing things that don't change, those cases should be obvious.
In
Solver.tryToAssignTag
, the first major change was storing used indices in anIntSet
which already brought it down from about 24s to about 17s. HashMap value collections are O(n), plusIntSet
has the benefit of not boxing primitives. I also changednewTags
into anObject2IntOpenHashMap
to avoid boxing values. These collections come from fastutil included in IJ.In
Solver.map
, I completely rewrote the collections and removed parallel streams. Regarding parallelization, I have a 16 thread CPU and the overhead of starting threads and synchronizing on theTreeMultimap
was so large it took all the threads just to match the performance of single-threaded streams. Reducing the thread count made performance worse, so this parallel stream likely runs worse on any CPU with fewer than 16 threads than a non-parallel stream.Still, the major bottleneck was
TreeMultimap
. I found that the values only needed to be in a sorted state by the time they get passed totryToAssignTag
, so I completely replaced the multimap with aHashMap<String, IntList>
whereIntList
not only avoids boxing, but also supports primitive integer comparators for sorting, and has an efficient conversion to array.Switching out the map and removing parallel streams brought it down from 17s to 11s. Replacing the boxed comparator with primitive comparator made the code a bit more verbose, but it also saved roughly 2 additional seconds. I noticed the collections end up rather large and this is probably close to the limit of what can be done without completely rewriting how tags are assigned.
In
AceUtil.wordBoundsPlus
, I turned the sequence into a simple loop, it didn't save much time but I didn't see any reason to use a sequence and allocate collections there.Finally, the old implementation of
Trigger
appears to have been freezing the UI thread and didn't work as described in the documentation comment (it wasn't resetting the timer on repeated calls despite the comment saying so). My reimplementation probably isn't the best either, but with it and with the other optimizations, I was able to drop thevisibleAreaChanged
delay to about half with no stuttering.