GetMemberTypeInFSharpForm indirectly produces a large amount of Tuple instantiations, causing UI delays in VS #5938
From a first look, GetTopTauTypeInFSharpForm looks OK.
To note, just from experience: I have tried changing tuples similar to this to value tuples, and the results were minimal. The copying of value tuples can be more expensive, thus not really helping much. The benefit is that it relieves GC pressure, but it's hard to decide: more copying and less GC pressure, or no copying and more GC pressure? It depends on the scenario.
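The tradeoff described above can be illustrated with a minimal sketch (names like `refPair`/`structPair` are invented for illustration, not compiler code): a reference tuple is a heap allocation per call, while a struct tuple allocates nothing but is copied in full on every assignment or return.

```fsharp
// Illustrative sketch only, not compiler code.
// A reference tuple allocates one heap object per call site:
let refPair (x: int) (y: int) : int * int = x, y

// A struct tuple adds no GC pressure, but the whole payload
// is copied at each assignment or return:
let structPair (x: int) (y: int) : struct (int * int) = struct (x, y)

// Which variant wins depends on how often the value is copied
// versus how long it survives (and so how much GC work it causes).
let sumRef () =
    let mutable acc = 0
    for i in 0 .. 999 do
        let (a, b) = refPair i (i + 1)            // 1000 heap allocations
        acc <- acc + a + b
    acc

let sumStruct () =
    let mutable acc = 0
    for i in 0 .. 999 do
        let struct (a, b) = structPair i (i + 1)  // no heap allocations
        acc <- acc + a + b
    acc
```

Both loops compute the same result; only the allocation behaviour differs, which is why the measured benefit of such a change depends entirely on the scenario.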
I would evaluate why we are allocating this much data in such a short period. This is 6GB (only 20% of the trace!) of data over a couple of minutes. We must be missing caches, not caching enough or need a better algorithm. For a comparison, I'm looking at the trace of a load of a solution with 780 C# projects and 30 seconds after load; it allocates 2 GB in total over that entire period and I'm filing bugs to remove 30MB here and 50MB there.
Yes, I assume the problem is at a higher level. As I said above, the function itself looks OK; at least I don't see obvious performance problems in it.
@davkean There are a lot of allocs due to heavy use of linked lists, tuples, and immutable AVL tree (map) construction throughout (including DUs), but in general an overuse of temporary (bulky) data structures. It is consistent throughout, so it would be a lot of work to change the fundamental design, introducing more algorithms and mutable state/structures. Your analysis will hopefully uncover some memory bugs that can be easily fixed or improved, as we clearly need to reduce the heavy memory bloat. One thing that comes to mind on these tuple lists is that they could be converted to enumerators, avoiding loads of temp lists, so only a few persisted structures need to be kept in memory.
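The enumerator idea above can be sketched as follows (`viaLists`/`viaSeq` are invented names, not compiler code): a chain of `List.map` calls materialises a fresh intermediate list of tuples at each stage, whereas a `Seq` pipeline streams one element at a time through enumerators, so no intermediate list survives.

```fsharp
// Hypothetical example of the pattern, not actual compiler code.
// Each List.map below allocates a complete intermediate list:
let viaLists (xs: int list) =
    xs
    |> List.map (fun x -> x, x * x)   // temp list of tuples
    |> List.map snd                   // another temp list
    |> List.sum

// The Seq version streams: only small enumerator objects are
// allocated, and no intermediate collection is retained.
let viaSeq (xs: int list) =
    xs
    |> Seq.map (fun x -> x, x * x)
    |> Seq.map snd
    |> Seq.sum
```

Both produce the same result; the difference is purely in how many temporary structures exist during the computation.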
We'll need to spend more time looking at this particular trace to see if there's an easy win here or not. For example, the function in the issue title just directly uses the results of [this function](https://github.com/Microsoft/visualfsharp/blob/master/src/fsharp/TastOps.fs#L1566), so maybe there's more investigative work involved. One of the goals of doing this analysis work is identifying if there is an easy win (e.g., #5941) vs. something that is fundamental to the design.
I can look into doing a PR for this if you like? If you do not want to do drastic changes, then I can change usages of …
An example of my proposal is as follows:

```fsharp
let IsCompiledAsStaticProperty g (v: Val) =
    match v.ValReprInfo with
    | Some valReprInfoValue ->
        match GetTopValTypeInFSharpForm g valReprInfoValue v.Type v.Range with
        | [], [], _, _ when not v.IsMember -> true
        | _ -> false
    | _ -> false
```

goes to my new refactored version:

```fsharp
let IsCompiledAsStaticProperty g (v: Val) =
    if v.IsMember then
        false
    else
        match v.ValReprInfo with
        | Some (ValReprInfo (ntps, argInfos, _)) ->
            match ntps, argInfos with
            | [], [] -> isForallTy g v.Type |> not
            | _ -> false
        | _ -> false
```

Although it looks the same, it computes and allocates significantly less: I traced how the lists map, and this achieves the same result; all the mappings inside GetTopValTypeInFSharpForm are not needed in this instance. I can refactor and reduce down each of the usages of these functions so all the extra work for data that is not needed is not computed.
@gerardtoconnor We'd certainly welcome help! I think the main thing is establishing a good perf test. A good first start could be to use this: https://github.com/Microsoft/visualfsharp/blob/master/tests/scripts/compiler-perf.fsx
@cartermp Are the compiler perf tests not already operational with this script? I was going to do the PR and include "[CompilerPerf]" in the title so that I (we) could then run a comparison with master to verify it had produced the improvement. My example above turned out to be the simplest case: given there were no list reductions, only mappings, we could match empty on the source lists without mapping. The rest are more tricky, and it seems like we may need some cache/algorithm strategy.
Sorry, I should have clarified - we'll likely want to verify memory usage in a scenario that stresses these code paths. I don't think the existing scripts do this, but that area would be where to add one.
Before we look at making stuff faster, we should look to see if we have a miss in an existing cache, or are missing caches in general. I did some quick math to show how serious this is and why this situation isn't because VS is low on memory: this situation is causing the low-memory situation. Based on the data above, 326 MB/s (326k/ms) was surviving collection. This is either because it's faster than the GC can collect (I have no idea on the limits here) or it's still being rooted (looking at the dump will probably confirm this). At that rate, it would take approximately 10 seconds before VS runs out of memory. The GC is absolutely forced to collect because of the sheer amount of allocations that this code path and other code paths in that trace are producing. This is a crazy amount of data, and it smells like we are calculating stuff over and over again.
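The quick math above can be reproduced directly. As a rough sketch, the 10-second figure follows if one assumes about 3 GB of remaining address-space headroom in a 32-bit VS process (the headroom figure is my assumption, not from the trace):

```fsharp
// Back-of-the-envelope check of the numbers in the comment above.
let survivingMBPerSec = 326.0        // from the trace: 326 MB/s surviving collection
let headroomMB = 3.0 * 1024.0        // assumed remaining 32-bit address space, ~3 GB
let secondsToExhaustion = headroomMB / survivingMBPerSec
// roughly 9.4 seconds, matching the "approx 10 seconds" estimate
```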
Looking at the GC heap size, it's significantly smaller than what we would expect: 700 MB versus the up to 3 GB we see in other traces. This is definitely exacerbating the issue, but it is not the only cause; the allocation pattern is still huge. I suspect in this case, and in another trace I saw from 15.7, we have some sort of native leak on top of this. Maybe it's tied to a managed object lifetime, so we can see allocation matching it. The other thing is that this trace is from 15.7; given some of the changes since then, I'm not sure we can trust its data. I'd like to get traces against a newer version that match the above.
@davkean Are you referring to CPU caching or computation caching? Either way, the compiler is using deeply nested recursive functions, some of which are not tail-call optimised, as well as mapping from list to list to list and blowing up allocs. I'm guessing the interim lists are rooted in temp stack frames (this might explain the blow-up, then cleardown). There is no caching strategy I can see on these paths.
@cartermp OK, I will wait till these are in place before doing anything further … but from looking over the code, there is no caching, and there is abusive over-mapping of lists and allocs (linked lists allocate roughly 2x more than an array). The compiler functions really should be redesigned to use streaming functions, going from …
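The list-versus-array overhead mentioned above can be sketched with rough per-element accounting. The byte sizes below are my assumptions for 64-bit .NET, and the exact multiple depends on payload size and how headers are counted, so treat this as an illustration rather than a measurement:

```fsharp
// Rough, assumed 64-bit .NET sizes; illustrative only.
let headerBytes  = 16   // object header + method-table pointer per heap object
let payloadBytes = 8    // one word-sized payload slot
let tailBytes    = 8    // the 'next' pointer in a cons cell

// Every element of an F# list is its own heap object (a cons cell):
let listBytesPerElement = headerBytes + payloadBytes + tailBytes   // 32 bytes

// An array pays the header (plus a length field) once, then just payloads:
let arrayBytesPerElement n =
    float (headerBytes + 8 + payloadBytes * n) / float n           // ~8 bytes for large n

// Per element, the list costs a small multiple of the array,
// before even counting the GC work of tracing each separate cell.
let ratioFor1000 = float listBytesPerElement / arrayBytesPerElement 1000
```

Beyond raw bytes, each cons cell is a separate object the GC must trace, which is part of why list-heavy pipelines show up so strongly in allocation traces.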
On doing my testing, the recursive type-stripping function was re-querying the same types over and over again. The constraint resolver resolution prevents caching of Vars/Funs, but caching everything else works and passes the tests. I used a ConditionalWeakTable for the TType cache mapping. I want to sequence/stream the workflow throughput away from lists and tuples if I can, so I am still investigating that before I do a PR, but in my build of the compiler it's retrieving from the cache 300k times (30k for a small app), which shows how wasteful the repeated recursive calls were being. Calling from the cache speeds things up, but as I said, I would like to try to remove the heavy list usage in this workflow, as this is ballooning the memory. We may need to change ArgInfos to arrays instead of lists to compact the data usage, provided there are no mutability issues. Will test later.
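The ConditionalWeakTable caching pattern described above can be sketched like this. `StripedResult`, `stripeType`, and `computeDepth` are invented stand-ins, not the actual compiler types (the real change keys on TType); the point is that the table holds entries only as long as their key objects are alive, so the cache itself roots nothing:

```fsharp
open System.Runtime.CompilerServices

// Sketch of the caching pattern described in the comment above;
// StripedResult / stripeType / computeDepth are hypothetical names.
type StripedResult = { Depth: int }

let cache = ConditionalWeakTable<obj, StripedResult>()

// Return the cached result when the same type object is seen again,
// instead of re-running the expensive recursive stripping walk.
let stripeType (computeDepth: obj -> int) (ty: obj) : StripedResult =
    match cache.TryGetValue ty with
    | true, cached -> cached                      // cache hit: no recomputation
    | false, _ ->
        let result = { Depth = computeDepth ty }  // expensive recursive walk
        cache.Add(ty, result)
        result
```

Because the table references its keys weakly, cached entries disappear when the corresponding type objects are collected, which is what makes it safe to cache per-type results without extending their lifetime.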
Kudos for even trying this!
@gerardtoconnor, the original trace was mine, and while the project is closed source, if you need help testing I can try certain fixes and redo the trace to see if it helps. Also, I can share the code privately (it's a relatively straightforward project structure, no weird msbuild steps; it should load and compile out of the box); ping me if you think this would be helpful in any way. I'm still having daily issues with VS getting sluggish, though subjectively there's been some improvement over the last couple of versions of the F# stack. Noticeable because I need to restart VS a bit less than once an hour now ;).
@abelbraaksma I have the caching code already done; I can prioritise this if needed. Are you able to pull down a branch if I post it, to see if it's improved? Also, you mentioned you posted the trace; I couldn't see it. Can you share a link?
Most traces that @davkean posted were originally posted by me on the Visual Studio forum linked at the top of this post, including this one, so I think you already have them and that you're actually working with them ;). I think the original traces are only visible to Microsoft employees; I could send them privately, though, if that's helpful.
And yes, I assume I should be able to pull a branch; it's been a while since I did it, but it can't be that hard, and I assume the process is still mostly the same.
The details above are enough for now; I'm also registering big CPU times on these functions just from editing in the compiler. Will try to have a PR up tomorrow.
I have added basic caching on this path in two places in PR #6202, and tests pass. I had to roll back most of the changes, as there were failing tests I just didn't have time to investigate and fix. Given this issue relates to memory bloat during VS usage, how do we want to test?
I'm closing out this specific performance-tracking issue, as a lot of work was done on it back in 2019.
Trace: https://developercommunity.visualstudio.com/content/problem/245786/slow-f-editing-experience-up-to-a-minute-until-typ.html
GC Rollup By Generation (screenshot omitted)
This method indirectly causes large amounts of allocations; in the above trace we're talking 6 GB, made up nearly entirely of Tuples or lists of Tuples (screenshots omitted).
I don't know enough about this method to know if this needs to be broken up further into more actionable tasks.