Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize Filename.checkPathForIllegalChars #1662

Merged

Conversation

vasily-kirichenko
Copy link
Contributor

No description provided.

@KevinRansom
Copy link
Member

Do we really want to cache tested paths for the run of the program just to speed up 'Filename.checkPathForIllegalChars'. Especially since I assume most apps are going to go to the disk after the check which will undoubtedly consume vast amounts more clock time than this little function.

Perhaps I am wrong though .... what do you think?

Kevin

@forki
Copy link
Contributor

forki commented Oct 26, 2016

@KevinRansom did you check the linked issue in fsharp/fsharp-compiler-docs#659 AFAIK this is mainly a FCS issue but we probably want to keep code synced whenever possible.

@KevinRansom
Copy link
Member

@forki,
It would be interesting to know why it is called so much, and also why the IO component didn't swamp this. Perhaps the path caching could be done higher up the ionide stack. Alternatively I suppose we could consider caching the most recent n paths for some small n. Before we do that though, I would like to understand why this is called so much.

@KevinRansom
Copy link
Member

KevinRansom commented Oct 26, 2016

@forki,

I agree keeping the core code aligned between the compilerservice and the compiler is essential.

E.g
chopExtension calls it twice for all strings that aren't "." ... doh!!!!

if ionide uses chopextension fixing this would halve the number of calls

Also as for the comment ' // for OCaml compatibility' I don't know why that is interesting.

let hasExtension (s:string) = 
    checkPathForIllegalChars s
    let sLen = s.Length
    (sLen >= 1 && s.[sLen - 1] = '.' && s <> ".." && s <> ".") 
    || Path.HasExtension(s)

let chopExtension (s:string) =
    checkPathForIllegalChars s
    if s = "." then "" else // for OCaml compatibility
    if not (hasExtension s) then 
        raise (System.ArgumentException("chopExtension")) // message has to be precisely this, for OCaml compatibility, and no argument name can be set
    Path.Combine (Path.GetDirectoryName s,Path.GetFileNameWithoutExtension(s))

@dsyme
Copy link
Contributor

dsyme commented Oct 27, 2016

@vasily-kirichenko The PR is good, I'll reopen :) thanks

@dsyme
Copy link
Contributor

dsyme commented Oct 27, 2016

@KevinRansom Agreed we should work out why this is being called so often - people keep reporting this from time to time as being top of profiling traces. Also just cleaning up the legacy helpers and using standard .NET ones is good too. This fix itself is good enough for now, once it's green lets merge it.

@dsyme dsyme merged commit 9753f72 into dotnet:master Oct 27, 2016
@KevinRansom
Copy link
Member

@dsyme It's a grow only cache and therefore bad, especially for long running apps. Large F# solutions inside VS would particularly suffer from this creeping memory bloat.
Also if we created a compiler server, certainlyu a possibility if we make the compiler solidly free threaded. This would eventually contain a path for every single file seen by the compiler, and consume vast tracts of memory. The bulk of that memory would be duplicated, since it would be the path to the root of each solution, each directory containing multiple files etc..

Perhaps we should:
. Remove this check from these very low level Apis,
. add a validate api and validate the path once then do work manipulating the paths.

I shall revert this change ... I realize there is a problem ... but we need a solution that will not grow our memory in a big way

Kevin

@forki
Copy link
Contributor

forki commented Oct 27, 2016

How many files does the compiler see during the life time? How often have we cache hits?

KevinRansom added a commit that referenced this pull request Oct 27, 2016
@vasily-kirichenko
Copy link
Contributor Author

I think this cache may improve performance of VFT (IDE), not the compiler.

@forki
Copy link
Contributor

forki commented Oct 27, 2016

Yeah I meant that compiler as a service/ VS integration.

Am 27.10.2016 20:46 schrieb "Vasily Kirichenko" notifications@github.com:

I think this cache may improve performance of VFT (IDE), not the compiler.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1662 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AADgNC60DJBoJWYRtmGB7Up3hXHpgXLCks5q4PGggaJpZM4Kg9kk
.

@KevinRansom
Copy link
Member

KevinRansom commented Oct 27, 2016

@forki @vasily-kirichenko I don't doubt that this change will speed up validation of paths already seen. I expect that we validate already seen paths too many times. It is hard to explain how this functionality in the normal use of a compiler or even an IDE can get to the top of any performance graph. Since it seems to indicate that we are about to do IO. I'm just suggesting that before we work on improving this function we look to see if we are not it using too aggressively. An easy example that took less than a minute to find was in cropExtension, where we caused it to be invoked twice per string.

https://github.com/Microsoft/visualfsharp/pull/1666/files

I expect the proper solution is to look carefully at where we use this API and see if it is really necessary, or where we can reduce the number of times it is invoked. After all not invoking the function will save us a hashing and a table look up.

Does that make sense ?

Kevin

@KevinRansom
Copy link
Member

Okay:

I metered these functions, compiling fsharp.core.dll this is the usage count:

checkPathForIllegalCharsCalls : {contents = 947;}

hasExtensionsCalls            : {contents = 0;}
chopExtensionsCalls           : {contents = 63;}
directoryNameCalls            : {contents = 1;}
fileNameOfPathCalls           : {contents = 138;}
fileNameWithoutExtensionCalls : {contents = 745;}

before my earlier check-in for chopExtension it would have been and extra 63 so 1010.

This is not really a large number of invocations so any reductions in the compiler are unlikely to noticeably reduce compile time. Someone needs to look at why it pops so high when you use the compiler server from an IDE.

I will continue to see what I can do to drive the number of validations down here in the compiler. But I think the issue has to be elsewhere.

@KevinRansom
Copy link
Member

@forki @vasily-kirichenko @dsyme
Checkout this PR: #1677

It reduces the number of invocations for an fsharp.core.dll compile by about 75%. and is tiny :-)

Let me know how these changes perform in your fcs scenarios, if you still have issues or need some help identifying possible improvements let me know and I will take a further look.

Kevin

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants