
Source files are ending up on the large object heap #4881

Closed
davkean opened this issue May 11, 2018 · 7 comments

@davkean
Member

davkean commented May 11, 2018

Looking through perf traces of IntelliSense, I can see that large source files are ending up on the large object heap because they are read as a single chunk:

[screenshot: allocation stack from the perf trace]

This is going to put pressure on Gen2 collections; in one trace, opening a completion window in a large file caused a 77ms Gen2 collection.

We should read these files in chunks to avoid this.
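
As a rough sketch of the chunked-reading idea (not the compiler's current code; the `readInChunks` helper and 32K chunk size are assumptions for illustration), reading into fixed-size char arrays keeps each individual allocation below the ~85,000-byte LOH threshold (roughly 42,500 elements for a `char[]`):

```fsharp
open System.IO

// Read a TextReader into a list of fixed-size char chunks so that no single
// array is large enough to land on the large object heap.
let readInChunks (reader: TextReader) =
    let chunkSize = 32 * 1024 // well below the ~42,500-char LOH limit for char[]
    let chunks = ResizeArray<char[]>()
    let buffer = Array.zeroCreate<char> chunkSize
    let mutable read = reader.Read(buffer, 0, chunkSize)
    while read > 0 do
        chunks.Add(buffer.[0 .. read - 1]) // copy out only the chars actually read
        read <- reader.Read(buffer, 0, chunkSize)
    chunks
```

Consumers would then index into the chunk list instead of a single large string or array.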

Trace: [internalshare]public\wismith\PerfViewData5.etl.zip.

@smoothdeveloper
Contributor

I wonder if we could avoid the copies occurring at these two places:

https://github.com/Microsoft/visualfsharp/blob/75e435e74b81bb47c4a0ced11be0d7560d2a66ea/src/fsharp/UnicodeLexing.fs#L68

https://github.com/Microsoft/visualfsharp/blob/44fa027b308681a1b78a089e44fa1ab35ff77b41/src/utils/prim-lexing.fs#L161

In the first location, we create a char array by calling https://msdn.microsoft.com/en-us/library/ezftk57x(v=vs.110).aspx, and then we call Array.copy in the second location.

Is there a leaner way to get the char array from the stream?

In the second location, since we don't keep the char array around at the call site, we could save the extra call to Array.copy (we need to check other places that call LexBuffer<_>.FromChars; we might need a separate factory method that skips the copy).
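
A hedged sketch of that factory-method idea (the `LexBufferState` record and `fromOwnedChars` name below are hypothetical, not the compiler's actual `LexBuffer<_>` type): a second constructor could take ownership of the caller's array when the call site no longer needs it, skipping the defensive copy.

```fsharp
// Minimal stand-in for the lexer buffer state, for illustration only.
type LexBufferState =
    { Buffer: char[]              // the characters being lexed
      mutable BufferScanPos: int }

/// Existing behaviour: defensive copy, so the caller may keep using its array.
let fromChars (chars: char[]) =
    { Buffer = Array.copy chars; BufferScanPos = 0 }

/// Proposed variant: reuse the caller's array when the call site promises not
/// to touch it again, avoiding a second large allocation.
let fromOwnedChars (chars: char[]) =
    { Buffer = chars; BufferScanPos = 0 }
```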

@davkean
Member Author

davkean commented May 11, 2018

Yes, those too; there's a bunch of them. The following are all ending up on the LOH:

[screenshot: list of allocation stacks ending up on the LOH]

@smoothdeveloper
Contributor

I made a test PR #4882 that removes the call to Array.copy in that code path.

I still have issues building the whole repository with the VS tooling, so I currently can't test the impact on IntelliSense.

@dsyme
Contributor

dsyme commented May 13, 2018

This might be harder to fix than you think.... Or at least what you'll need to fix isn't what's described :)

Among other things, we use the entire contents of source files as a key for some operations in the language-service caches. That can be changed, I'm sure (e.g. use a hash of the contents), but until we do that we need to store the whole file; breaking it into chunks won't solve the fundamental problem.
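
A rough illustration of the content-hash idea (the `SourceKey` record and `parseCache` below are assumed names, not the actual language-service cache code): keying caches on a small hash record means the cache no longer has to retain the full file contents.

```fsharp
open System
open System.Collections.Concurrent
open System.Security.Cryptography
open System.Text

/// A compact cache key: file name plus a hash of the contents,
/// instead of the contents themselves.
type SourceKey = { FileName: string; ContentHash: string }

let makeKey (fileName: string) (contents: string) =
    use sha = SHA256.Create()
    let hash = sha.ComputeHash(Encoding.UTF8.GetBytes contents)
    { FileName = fileName; ContentHash = Convert.ToBase64String hash }

// The cache retains only the small key record; the full source string is no
// longer pinned in memory once the key has been computed.
let parseCache = ConcurrentDictionary<SourceKey, obj>()
```

Computing the hash still touches the full text once; the win is that the cache itself no longer keeps large strings or arrays alive across collections.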

@Pilchie
Member

Pilchie commented May 14, 2018

@dsyme - the Roslyn workspace uses the editor's types to back all the files in the workspace (sometimes privately for closed files, and sometimes publicly for open files), but we should be able to use editor version types or even instances to manage these caches. In fact, if we just threaded the Workspace/Solution/Document/Project types further through, we might be able to eliminate these caches.

@cartermp
Contributor

#4948 by @auduchinok also reduces some copying. With that and @smoothdeveloper's changes in place, it would be good to measure this again and see whether they've had a positive impact.

@cartermp
Contributor

cartermp commented Dec 18, 2018

This is likely greatly lessened by #6001.

We closed #5944 because we'll need to incorporate a hash code (or some integer ID) for an ISourceText in the language-service cache keys, which should help reduce the size of the data structures we keep in memory. But for the time being, things should be a lot better now due to #6001.
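
If such a hash were computed incrementally, the full text would never need to be materialised as one large array either. A sketch under that assumption (the `hashSourceFileIncrementally` helper below is hypothetical and not part of #6001):

```fsharp
open System
open System.IO
open System.Security.Cryptography
open System.Text

// Hypothetical helper: hash a source file chunk by chunk so that neither the
// text nor its UTF-8 bytes are ever allocated as a single large array.
let hashSourceFileIncrementally (path: string) =
    use reader = new StreamReader(path, Encoding.UTF8)
    use sha = SHA256.Create()
    let chunk = Array.zeroCreate<char> (32 * 1024)
    let mutable read = reader.Read(chunk, 0, chunk.Length)
    while read > 0 do
        // NOTE: a real implementation would need to handle surrogate pairs
        // that straddle a chunk boundary before encoding.
        let bytes = Encoding.UTF8.GetBytes(chunk, 0, read)
        sha.TransformBlock(bytes, 0, bytes.Length, null, 0) |> ignore
        read <- reader.Read(chunk, 0, chunk.Length)
    sha.TransformFinalBlock(Array.empty, 0, 0) |> ignore
    Convert.ToBase64String sha.Hash
```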
