Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed several performance issues and stuttering #339

Merged
merged 3 commits into from
Nov 19, 2020

Conversation

chylex
Copy link
Collaborator

@chylex chylex commented Nov 17, 2020

Was trying the plugin out and immediately ran into severe performance issues, so I figured I'd give some optimizations a shot.

Here are my results with the test tag latency test:

Before: 25.1s (avg 185ms)
After:   9.2s (avg  28ms)

I verified that unit tests succeed (apart from test pinyin selection which had a compile error in master, not sure if you know), and that the order of tags has not changed and basic uses of the plugin still work, but since I've only used it for a few minutes so far it's possible something subtle may have changed that I didn't catch.

Rationale

In a few places I avoided recomputing things that don't change, those cases should be obvious.

In Solver.tryToAssignTag, the first major change was storing used indices in an IntSet which already brought it down from about 24s to about 17s. HashMap value collections are O(n), plus IntSet has the benefit of not boxing primitives. I also changed newTags into an Object2IntOpenHashMap to avoid boxing values. These collections come from fastutil included in IJ.

In Solver.map, I completely rewrote the collections and removed parallel streams. Regarding parallelization, I have a 16 thread CPU and the overhead of starting threads and synchronizing on the TreeMultimap was so large it took all the threads just to match the performance of single-threaded streams. Reducing the thread count made performance worse, so this parallel stream likely runs worse on any CPU with fewer than 16 threads than a non-parallel stream.

Still, the major bottleneck was TreeMultimap. I found that the values only needed to be in a sorted state by the time they get passed to tryToAssignTag, so I completely replaced the multimap with a HashMap<String, IntList> where IntList not only avoids boxing, but also supports primitive integer comparators for sorting, and has an efficient conversion to array.

Switching out the map and removing parallel streams brought it down from 17s to 11s. Replacing the boxed comparator with primitive comparator made the code a bit more verbose, but it also saved roughly 2 additional seconds. I noticed the collections end up rather large and this is probably close to the limit of what can be done without completely rewriting how tags are assigned.

In AceUtil.wordBoundsPlus, I turned the sequence into a simple loop, it didn't save much time but I didn't see any reason to use a sequence and allocate collections there.

Finally, the old implementation of Trigger appears to have been freezing the UI thread and didn't work as described in the documentation comment (it wasn't resetting the timer on repeated calls despite the comment saying so). My reimplementation probably isn't the best either, but with it and with the other optimizations, I was able to drop the visibleAreaChanged delay to about half with no stuttering.

…minate some unnecessary work to improve performance
@chylex
Copy link
Collaborator Author

chylex commented Nov 18, 2020

I found a few more opportunities, at this point the included test is running too fast to have any meaningful data so I bumped up the repeats to 800 and got

Before: 34s (avg 22.8ms)
After:  28s (avg 15.6ms)

(might be a good idea to do a warmup in the tests, bumping up the repeats reduced averages overall)

I think it's pretty tight at this point, the next meaningful step could be only searching inside visible area and then lazily adding new markers as the user navigates past the top/bottom marker, but that'd be more work than what I want to put into it.

@breandan breandan merged commit 3fd3edf into acejump:master Nov 19, 2020
@breandan
Copy link
Collaborator

Hi @chylex, thanks for the PR, this is really solid work! I was able to reproduce the latency reduction on some texts, although the variance is still pretty high depending on the query and editor contents. I also added a lorem ipsum test in c8c6f9c and noticed an improvement, but did not notice a significant improvement on random text. Most of the optimizations make sense, although a more comprehensive set of performance tests are needed. Ideally, these would include a representative sample of random source code files.

I am running ./gradlew test -i on IU-2020.3 running on a Intel(R) Xeon(R) CPU E3-1505M v5 @ 2.80GHz.

These were the results I collected before merging this PR as of 6c19767:

AceTest > test lorem ipsum latency STANDARD_OUT
    Average time to tag results: 742ms

AceTest > test random text latency STANDARD_OUT
    Average time to tag results: 1403ms

These were the results I observed after merging this PR as of a38c670:

AceTest > test lorem ipsum latency STANDARD_OUT
    Average time to tag results: 72ms

AceTest > test random text latency STANDARD_OUT
    Average time to tag results: 1322ms

the next meaningful step could be only searching inside visible area and then lazily adding new markers as the user navigates past the top/bottom marker

Assigning text by only considering the visible area has been suggested, although this would not be compatible with AceJump's current search functionality due to the tag assignment problem. In order to assign tags, we must ensure no tags could ever result in a collision, which is basically an ambiguous string elsewhere in the editor text. We do not currently test for tag collisions, and I did not check very carefully, but it looks like your PR does not break this assumption.

I'll continue and probably make another PR tomorrow

I have merged this was planning to release in 3.6.4, but if you prefer, I will wait for your next PR.

unit tests succeed (apart from test pinyin selection which had a compile error in master, not sure if you know),

I could not reproduce this error. Please open an issue if you happen to encounter it again.

@chylex
Copy link
Collaborator Author

chylex commented Nov 20, 2020

unit tests succeed (apart from test pinyin selection which had a compile error in master, not sure if you know),

I could not reproduce this error. Please open an issue if you happen to encounter it again.

It's strange, I tried loading the project on a linux system with IJ 2020.2.3 and it compiles there, but my main windows system with 2020.3 EAP (203.5251.39) complains:
obrazek

I have merged this was planning to release in 3.6.4, but if you prefer, I will wait for your next PR.

I'll make another PR later today, so you could wait since it'll probably take until next week for JetBrains to review the update. I also added a few new features and improvements, but I'll open an issue with more details and ask you which features you'd like me to PR separately from optimizations.

@chylex chylex deleted the optimizations branch November 20, 2020 16:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants