-
Notifications
You must be signed in to change notification settings - Fork 793
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Removing internal interface usage, and tail calls in "dict" #594
Conversation
Allow the JIT more opportunities for inlining
Can't stop tail calls easily, so do some craziness which hopefully gets inlined away by the JIT.
Hi @manofstick, I'm your friendly neighborhood Microsoft Pull Request Bot (You can call me MSBOT). Thanks for your contribution! The agreement was validated by Microsoft and real humans are currently evaluating your PR. TTYL, MSBOT; |
(here is the gist of dictionary types which is a slight expanded from the original mentioned. it's not fantastic; you should turn the generation of tail calls off within the project) |
Did you try to discover why the tail calls cause these performance issues? I had read through most of #513 but didn't recall seeing anything. Given the functional nature of f# it's a bit disheartening to see performance necessitated changes like this. |
@rassilon tail calls are expected to be slower in some cases. The idea is to re-use the caller function's stack frame when invoking the callee function in tail position. For tail recursion, by definition the caller and callee are the same, so the stack and register layout of their arguments is identical, and things should be fast. What we see here should not affect functional-style tail-recursive algorithms (and besides, the F# compiler will usually re-arrange those into imperative loops anyhow). For a general non-recursive plain tail call, though, (i.e. returning the result of another function call as your own result, in a situation where the F# compile decides to add the So the problem is that the JIT is diligently honoring our |
@latkin , Ah... Thanks for explaining it to me. I knew there was something I was missing. |
@manofstick Looks nice! I tried this out and found that perf is not changed significantly except for the case of x64 struct/generic struct keys, where perf is now ~10% better. Just wanted to confirm this is what you expect here. |
Compliments #549 and #513.
The performance test from #513 (comment) shows the improvement of this pull request.
Almost at performance parity with using Dictionary directly which is great. The only real area that is remaining is related to #574 (which is only really a concern for generic objects; but it would be nice to be able to use a generic value based key, so I think that should still be discussed. I will create a more directed example over there)