Fixed several performance issues and stuttering #339

chylex · 2020-11-17T11:05:40Z

Was trying the plugin out and immediately ran into severe performance issues, so I figured I'd give some optimizations a shot.

Here are my results with the test tag latency test:

Before: 25.1s (avg 185ms)
After:   9.2s (avg  28ms)

I verified that unit tests succeed (apart from test pinyin selection which had a compile error in master, not sure if you know), that the order of tags has not changed, and that basic uses of the plugin still work. Since I've only used it for a few minutes so far, something subtle may have changed that I didn't catch.

Rationale

In a few places I avoided recomputing things that don't change; those cases should be obvious.

In Solver.tryToAssignTag, the first major change was storing used indices in an IntSet, which already brought it down from about 24s to about 17s. Checking whether a HashMap's value collection contains an element is O(n), and IntSet has the added benefit of not boxing primitives. I also changed newTags into an Object2IntOpenHashMap to avoid boxing values. These collections come from fastutil, which is included in IJ.
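
To illustrate the difference between the two lookups, here is a minimal sketch, not the plugin's actual code. It assumes tags map to non-negative editor offsets, and it uses java.util.BitSet from the standard library as a stand-in for fastutil's IntSet so it runs without dependencies. The class and method names are hypothetical.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the solver's bookkeeping; not AceJump's real class.
final class TagAssignments {
    private final Map<String, Integer> tagToIndex = new HashMap<>();
    // Primitive set of indices that already carry a tag
    // (BitSet is an assumption here, used in place of fastutil's IntSet).
    private final BitSet usedIndices = new BitSet();

    // Slow variant: scans every value in the map, O(n) per lookup, and boxes the argument.
    boolean isUsedSlow(int index) {
        return tagToIndex.containsValue(index);
    }

    // Fast variant: constant-time bit lookup with no boxing.
    boolean isUsedFast(int index) {
        return usedIndices.get(index);
    }

    // Assigns the tag only if both the tag and the index are still free.
    // Returns whether the assignment happened.
    boolean tryAssign(String tag, int index) {
        if (usedIndices.get(index) || tagToIndex.containsKey(tag)) {
            return false;
        }
        tagToIndex.put(tag, index);
        usedIndices.set(index);
        return true;
    }
}
```

Both lookups give the same answer; the second one just avoids walking the whole map on every call, which matters when the check sits in a hot loop.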

In Solver.map, I completely rewrote the collections and removed parallel streams. Regarding parallelization, I have a 16 thread CPU and the overhead of starting threads and synchronizing on the TreeMultimap was so large it took all the threads just to match the performance of single-threaded streams. Reducing the thread count made performance worse, so this parallel stream likely runs worse on any CPU with fewer than 16 threads than a non-parallel stream.

Still, the major bottleneck was TreeMultimap. I found that the values only needed to be in a sorted state by the time they get passed to tryToAssignTag, so I completely replaced the multimap with a HashMap<String, IntList> where IntList not only avoids boxing, but also supports primitive integer comparators for sorting, and has an efficient conversion to array.
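
The idea of deferring the sort can be sketched as follows. This is only an illustration under assumptions: IntBucket is a hypothetical stand-in for fastutil's IntList, SiteIndex is a made-up name, and the sketch sorts in natural order, whereas the PR sorts with a custom primitive comparator.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical growable int list; stands in for fastutil's IntList.
final class IntBucket {
    private int[] data = new int[8];
    private int size = 0;

    // Appends a value, doubling the backing array when it is full.
    void add(int value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, size * 2);
        }
        data[size++] = value;
    }

    // Copies the stored values and sorts the copy with a primitive sort (no boxing).
    int[] toSortedArray() {
        int[] copy = Arrays.copyOf(data, size);
        Arrays.sort(copy);
        return copy;
    }
}

// Hypothetical replacement for a sorted multimap: unordered on insert, sorted on read.
final class SiteIndex {
    private final Map<String, IntBucket> sitesByKey = new HashMap<>();

    // Insertion is an amortized O(1) append; no ordering is maintained here.
    void add(String key, int site) {
        sitesByKey.computeIfAbsent(key, k -> new IntBucket()).add(site);
    }

    // Sorting happens only at the point where the consumer needs ordered values.
    int[] sortedSites(String key) {
        IntBucket bucket = sitesByKey.get(key);
        return bucket == null ? new int[0] : bucket.toSortedArray();
    }
}
```

A sorted multimap pays the ordering cost on every insertion, while this layout pays it once per key, at the moment the values are handed to the consumer.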

Switching out the map and removing parallel streams brought it down from 17s to 11s. Replacing the boxed comparator with a primitive comparator made the code a bit more verbose, but it also saved roughly 2 additional seconds. I noticed the collections end up rather large, and this is probably close to the limit of what can be done without completely rewriting how tags are assigned.

In AceUtil.wordBoundsPlus, I turned the sequence into a simple loop. It didn't save much time, but I didn't see any reason to use a sequence and allocate collections there.

Finally, the old implementation of Trigger appears to have been freezing the UI thread and didn't work as described in the documentation comment (it wasn't resetting the timer on repeated calls despite the comment saying so). My reimplementation probably isn't the best either, but with it and with the other optimizations, I was able to drop the visibleAreaChanged delay to about half with no stuttering.
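
For reference, the behavior the documentation comment describes (each new call resets the timer, and the work happens off the calling thread) can be sketched like this. It is not the reimplementation from this PR; Debouncer and its methods are hypothetical names, and the sketch runs the action on a background scheduler thread, so a real plugin would still need to hand UI work back to the UI thread.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical debouncer; not the plugin's Trigger implementation.
final class Debouncer implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "debouncer");
            t.setDaemon(true); // do not keep the JVM alive for pending work
            return t;
        });
    private ScheduledFuture<?> pending;

    // Schedules the action to run after delayMs. A newer call cancels the
    // pending one, which effectively resets the timer. The caller never blocks.
    synchronized void trigger(long delayMs, Runnable action) {
        if (pending != null) {
            pending.cancel(false);
        }
        pending = scheduler.schedule(action, delayMs, TimeUnit.MILLISECONDS);
    }

    // Stops the scheduler and drops any action that has not run yet.
    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

With this shape, a burst of calls results in a single run of the action once the calls stop for delayMs.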

chylex · 2020-11-18T20:38:27Z

I found a few more opportunities. At this point the included test runs too fast to give meaningful data, so I bumped up the repeats to 800 and got:

Before: 34s (avg 22.8ms)
After:  28s (avg 15.6ms)

(might be a good idea to do a warmup in the tests, bumping up the repeats reduced averages overall)
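
A warmup phase could look roughly like the following sketch. LatencyBench and averageMillis are hypothetical names, not part of the existing tests; the point is only that the first few runs, where the JIT is still compiling, are excluded from the average.

```java
// Hypothetical helper; not part of the plugin's test suite.
final class LatencyBench {
    // Runs the task warmupRuns times without timing it, then returns the
    // average wall-clock time in milliseconds over measuredRuns timed runs.
    static double averageMillis(Runnable task, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // untimed: gives the JIT a chance to compile hot paths
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            task.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / measuredRuns;
    }
}
```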

I think it's pretty tight at this point; the next meaningful step could be only searching inside visible area and then lazily adding new markers as the user navigates past the top/bottom marker, but that'd be more work than I want to put into it.

breandan · 2020-11-20T00:15:45Z

Hi @chylex, thanks for the PR, this is really solid work! I was able to reproduce the latency reduction on some texts, although the variance is still pretty high depending on the query and editor contents. I also added a lorem ipsum test in c8c6f9c and saw an improvement there, but did not see a significant improvement on random text. Most of the optimizations make sense, although a more comprehensive set of performance tests is needed. Ideally, these would include a representative sample of random source code files.

I am running ./gradlew test -i on IU-2020.3 on an Intel(R) Xeon(R) CPU E3-1505M v5 @ 2.80GHz.

These were the results I collected before merging this PR as of 6c19767:

AceTest > test lorem ipsum latency STANDARD_OUT
    Average time to tag results: 742ms

AceTest > test random text latency STANDARD_OUT
    Average time to tag results: 1403ms

These were the results I observed after merging this PR as of a38c670:

AceTest > test lorem ipsum latency STANDARD_OUT
    Average time to tag results: 72ms

AceTest > test random text latency STANDARD_OUT
    Average time to tag results: 1322ms

> the next meaningful step could be only searching inside visible area and then lazily adding new markers as the user navigates past the top/bottom marker

Assigning tags by only considering the visible area has been suggested, although this would not be compatible with AceJump's current search functionality due to the tag assignment problem. In order to assign tags, we must ensure no tag could ever result in a collision, which is basically an ambiguous string elsewhere in the editor text. We do not currently test for tag collisions, and I did not check very carefully, but it looks like your PR does not break this assumption.

> I'll continue and probably make another PR tomorrow

I have merged this and was planning to release it in 3.6.4, but if you prefer, I will wait for your next PR.

> unit tests succeed (apart from test pinyin selection which had a compile error in master, not sure if you know),

I could not reproduce this error. Please open an issue if you happen to encounter it again.

chylex · 2020-11-20T09:26:41Z

> > unit tests succeed (apart from test pinyin selection which had a compile error in master, not sure if you know),
>
> I could not reproduce this error. Please open an issue if you happen to encounter it again.

It's strange: I tried loading the project on a Linux system with IJ 2020.2.3 and it compiles there, but my main Windows system with 2020.3 EAP (203.5251.39) complains:

> I have merged this and was planning to release it in 3.6.4, but if you prefer, I will wait for your next PR.

I'll make another PR later today, so you could wait since it'll probably take until next week for JetBrains to review the update. I also added a few new features and improvements, but I'll open an issue with more details and ask you which features you'd like me to PR separately from optimizations.

Change collection data types & usage, fix Trigger stuttering, and eliminate some unnecessary work to improve performance

c3b330d

chylex mentioned this pull request Nov 17, 2020

Latency issues when searching large files #217

Closed

chylex added 2 commits November 18, 2020 21:32

Avoid sorting same integer lists in Solver.map

9c5a6d6

Reduce allocations, boxing, and lowercasing work when computing word fragments

7692a0e

breandan merged commit 3fd3edf into acejump:master Nov 19, 2020

breandan mentioned this pull request Nov 20, 2020

Improve test coverage #139

Open

chylex deleted the optimizations branch November 20, 2020 16:13
