Some optimizations (cont) #395
base: master
Refactor parents' representation. Speed: 187% Memory: 73%
Compress Speed: 189% Memory: 71%
Reserve vec capacity. Speed: 202% Memory: 71%
Compress seen map. Speed: 207% Memory: 68%
Skip visited vertices. Speed: 212% Memory: 68%
Refactor shortest path algorithm. This commit also removes some unsafe code. Speed: 222% Memory: 41%
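As a rough illustration of the "Reserve vec capacity" idea in the commit list above, here is a minimal sketch. The function name and the data it builds are hypothetical and are not taken from this PR; the point is only that allocating once up front avoids repeated reallocation while pushing.

```rust
/// Hypothetical helper: builds a vector of `count` items.
/// `Vec::with_capacity` allocates room for all items once, so the
/// pushes in the loop never need to grow the buffer.
fn collect_items(count: usize) -> Vec<usize> {
    let mut out = Vec::with_capacity(count);
    for i in 0..count {
        out.push(i * 2);
    }
    out
}
```

This only helps when the final length (or a good upper bound) is known before the loop starts.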
Done. You can merge this PR now. If I have other optimizations I will open another PR. The next big optimization opportunity might be parallelizing the
Wow, this is really cool. I'm not sure about the changes to SeenMap: I deliberately wanted a vec so I could easily experiment with different sizes. The rest looks good at first glance, I think the use of a visited flag on graph nodes is particularly nice. I'm a little busy at the moment, but I will do a proper merge and review as soon as I can :)
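For readers unfamiliar with the "visited flag on graph nodes" idea mentioned above, here is a minimal sketch. The `Vertex` layout, the index-based neighbour list, and the plain depth-first walk are all assumptions for illustration; the PR itself concerns a shortest path search, and its real types are not shown in this thread.

```rust
use std::cell::Cell;

/// Hypothetical graph node; field names are illustrative only.
struct Vertex {
    id: usize,
    neighbours: Vec<usize>,
    /// Set the first time the walk pops this node.
    visited: Cell<bool>,
}

/// Depth-first walk from `start`, returning node ids in visit order.
/// Nodes that were already visited are skipped when popped.
fn visit_order(graph: &[Vertex], start: usize) -> Vec<usize> {
    let mut order = Vec::with_capacity(graph.len());
    let mut stack = vec![start];
    while let Some(idx) = stack.pop() {
        let v = &graph[idx];
        // `Cell::replace` stores the new value and returns the old one.
        if v.visited.replace(true) {
            continue;
        }
        order.push(v.id);
        stack.extend(v.neighbours.iter().copied());
    }
    order
}

/// Small hypothetical graph: 0 -> {1, 2}, 1 -> {2}, 2 -> {0}, 3 isolated.
fn sample_graph() -> Vec<Vertex> {
    let edges: [&[usize]; 4] = [&[1, 2], &[2], &[0], &[]];
    edges
        .iter()
        .enumerate()
        .map(|(id, ns)| Vertex {
            id,
            neighbours: ns.to_vec(),
            visited: Cell::new(false),
        })
        .collect()
}
```

Storing the flag on the node avoids a separate set lookup per pop, at the cost of having to reset the flags before the graph can be searched again.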
The third, and maybe the last, big performance improvement PR is ready. Since you haven't merged this one yet, would you prefer that I add those commits to this PR?
Fix a small problem: I was using ZSH's built-in time with
TIMEFMT="\
%J %U user %S system %P cpu %*E total
avg shared (code): %X KB
avg unshared (data/stack): %D KB
total (sum): %K KB
max memory: %M KB
page faults from disk: %F
other page faults: %R"
According to ZSH's doc,
OK, I've cherry-picked the first three commits and I'll follow up on the rest when I can :) If you have further awesome improvements, perhaps it would be clearer as a separate PR? I don't feel strongly though.
Any update? If you have any questions, feel free to ask me. It's my pleasure to explain my optimizations.
chore: generate and sync latest changes
This time I tried some radical optimizations.
Benchmark Approach
To make the results more accurate, I updated my benchmark approach. Here's the command:
where with-bench is a simple Zsh function that fixes the CPU frequency and disables boost:
Note that this time difft is invoked directly instead of through cargo run, so the speedup percentage will be higher (cargo run has a fixed extra cost).
Benchmark Results
Before my first PR:
Before my second PR:
Now:
Conclusion
Speed: 100% -> 135% -> 176% (according to hyperfine)
Memory: 100% -> 79% -> 74%
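The percentages above are presumably relative to the first baseline. A minimal sketch of that arithmetic, assuming speed is baseline time divided by new time and memory is new peak usage divided by baseline peak usage (both function names are hypothetical, and no numbers from this PR are used):

```rust
/// Relative speed as a percentage of the baseline: a run that takes
/// half the baseline time scores 200%.
fn speed_percent(baseline_secs: f64, new_secs: f64) -> f64 {
    baseline_secs / new_secs * 100.0
}

/// Relative memory as a percentage of the baseline: using three
/// quarters of the baseline memory scores 75%.
fn memory_percent(baseline_kb: f64, new_kb: f64) -> f64 {
    new_kb / baseline_kb * 100.0
}
```

Under this convention a higher speed figure and a lower memory figure are both improvements.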
Caveats
In commit Eliminate some vec clones, memory usage unexpectedly increased. I haven't figured out why.
In commit Change a RefCell in Vertex to UnsafeCell, a lot of unsafe code was introduced, and it clearly spreads beyond the boundary it should stay within. I don't know how to design abstractions for it.
In commit Refactor seen map, I don't understand why your original code was written this way. I just faithfully converted it into a faster version.
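On the UnsafeCell caveat, one possible direction is sketched below. It rests on an assumption that may not hold for this codebase: that the field in question is written once and only read afterwards. In that case the standard library's `std::cell::OnceCell` gives interior mutability without `RefCell`'s borrow counter and without any unsafe code at the call sites. The `Vertex` fields and the neighbour computation here are hypothetical.

```rust
use std::cell::OnceCell;

/// Hypothetical vertex whose neighbour list is computed lazily, once.
struct Vertex {
    id: u32,
    neighbours: OnceCell<Vec<u32>>,
}

impl Vertex {
    fn new(id: u32) -> Self {
        Vertex {
            id,
            neighbours: OnceCell::new(),
        }
    }

    /// Returns the neighbour list, computing it on first access.
    /// The closure stands in for the real (unknown) neighbour logic.
    fn neighbours(&self) -> &[u32] {
        self.neighbours
            .get_or_init(|| vec![self.id + 1, self.id + 2])
    }
}
```

If the field really is mutated more than once, this does not apply, and the usual alternative is to confine the `UnsafeCell` to a small wrapper type whose methods state the invariants they rely on.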