reduce allocations #1207
    if x.IsResolved && y.IsResolved && not compilingFslib then
        x.ResolvedTarget === y.ResolvedTarget
    else
Style only: use elif to avoid nesting.
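For illustration, a minimal sketch of what the elif suggestion amounts to. The Ref record and both function names are hypothetical stand-ins, not the compiler's actual types; only the shape of the conditional mirrors the diff above.

```fsharp
// Hypothetical stand-in for the reference type compared in the diff.
type Ref = { IsResolved: bool; Target: obj; Name: string }

// Nested form: the fallback lives in an `else` that wraps another `if`.
let refEqNested compilingFslib (x: Ref) (y: Ref) =
    if obj.ReferenceEquals(x, y) then true
    else
        if x.IsResolved && y.IsResolved && not compilingFslib then
            obj.ReferenceEquals(x.Target, y.Target)
        else
            x.Name = y.Name

// Flattened form: `elif` keeps every branch at the same indentation level.
let refEqFlat compilingFslib (x: Ref) (y: Ref) =
    if obj.ReferenceEquals(x, y) then true
    elif x.IsResolved && y.IsResolved && not compilingFslib then
        obj.ReferenceEquals(x.Target, y.Target)
    else
        x.Name = y.Name
```

Both functions take the same branches in the same order, so the change is purely cosmetic.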
Done.
92133d3 to f95ad73
That is pretty cool.
I'm not sure it helps even in long running scenarios like FCS/VFPT. It seems allocations do not cause any performance problems.
@vasily-kirichenko, @dsyme we are going to implement struct tuples in the next release; Don has already submitted a prototype PR. Do you think that struct tuples would give an equivalent improvement? I am particularly concerned about manually unwinding idiomatic code such as the pattern matching and its replacement with complex if then else expressions. What do you think, would struct tuples solve some of these allocation issues? Kevin
(unrelated to perf:) If you look at the nested pattern match in https://github.com/Microsoft/visualfsharp/pull/1207/files#diff-5a2b2c121409423e80d58b7ffaccd472L4401 - I think we can argue about whether that really was more readable. I'm not saying it was not better, but it also wasn't exactly beautiful ;-)
Agree with @forki on the The rest of the changes do look idiomatic (non-tupled arguments). Style-wise, for involved conditionals, I tend to be very pedantic and put each expression on its own line with the operator at the beginning (a win especially when the expression is long). If the compiler is not optimizing away inline tuples in match expressions, it would be worth having that done; in many cases, tuples are just used as local sugar (I have no idea what the compiler is doing in the optimization phase, so sorry if it's an obvious question).
I think this PR also improves readability - the conditionals using
@@ -201,7 +201,7 @@ namespace Internal.Utilities.Text.Lexing
     let numUnicodeCategories = 30
     let numLowUnicodeChars = 128
     let numSpecificUnicodeChars = (trans.[0].Length - 1 - numLowUnicodeChars - numUnicodeCategories)/2
-    let lookupUnicodeCharacters (state,inp) =
+    let lookupUnicodeCharacters state inp =
This change is harmless but AFAICS doesn't alter the representation or calls of the function? e.g. for
    type C() =
        let f (x,y) = x + y
        member a.M(b,c) = f (b,c)
we get
    .method assembly hidebysig instance int32
            f(int32 x,
              int32 y) cil managed
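A small sketch that puts the two spellings side by side. The type and member names are made up for illustration; the assumption, matching the IL shown above for the tupled case, is that a let-bound function in a class compiles to a method with two int32 parameters whether its arguments are written tupled or curried, so neither call site needs to allocate a tuple.

```fsharp
type C() =
    let f (x, y) = x + y   // tupled parameters, as before the change
    let g x y = x + y      // curried parameters, as after the change
    member this.M(b, c) = f (b, c)
    member this.N(b, c) = g b c
```

Inspecting the compiled assembly with a disassembler is the way to confirm this for a given compiler version; the source text alone does not show the representation.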
@vasily-kirichenko Could you remove the changes in prim-lexing.fs please? I'm pretty sure they don't remove any allocations. (If they do then let's discuss further, there must be something I'm missing). Thanks!!
Well, I would agree that the original author went out of his way to make the code unreadable (comments in the middle of expressions, idiosyncratic indenting, nested pattern matching). My real question, though, is: would struct tuples eliminate the need to go through the code eliminating tuples and replacing them with separate arguments?
I'm not sure about this PR. We need to see actual perf benefit. We can make more and more things structs, and the risk is that they are just getting copied around a whole lot (at possibly extra cost). It's very difficult to work out the amount of struct copying being done by looking at the code unless we are sure of the storage location of the struct (e.g. in an array). Heap allocations have the advantage that passing the value around is relatively cheap (one word). In this case, the TokenTup struct is now very, very, very big. I can't even count the number of words.
TokenTup is now so large that it's possible that this actually slows down the lexer. So we need to see concrete performance benefits - not just reduced allocations - to know if this is a good change. Running repeated lexings of a file should help determine where the threshold for useful struct size is.
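A rough sketch of the kind of measurement being asked for, timing a class-based and a struct-based token side by side while also counting gen-0 collections. Everything here is hypothetical: TokenRef and TokenVal are four-field stand-ins rather than the real TokenTup, and a synthetic loop replaces repeated lexing of an actual file, so the numbers it prints say nothing about the compiler itself.

```fsharp
open System
open System.Diagnostics

// Hypothetical stand-ins for a token: one heap-allocated record, one four-word struct.
type TokenRef = { A: int64; B: int64; C: int64; D: int64 }

[<Struct>]
type TokenVal(a: int64, b: int64, c: int64, d: int64) =
    member this.A = a
    member this.B = b
    member this.C = c
    member this.D = d

// Consumers take the token as an argument, so the struct version is passed by value.
let consumeRef (t: TokenRef) = t.A + t.D
let consumeVal (t: TokenVal) = t.A + t.D

let sumRef n =
    let mutable acc = 0L
    for i in 1 .. n do
        acc <- acc + consumeRef { A = int64 i; B = 1L; C = 2L; D = 3L }
    acc

let sumVal n =
    let mutable acc = 0L
    for i in 1 .. n do
        acc <- acc + consumeVal (TokenVal(int64 i, 1L, 2L, 3L))
    acc

// Reports wall-clock time and gen-0 collections for one run of `work`.
let measure (name: string) (work: unit -> int64) =
    GC.Collect()
    let gen0Before = GC.CollectionCount(0)
    let sw = Stopwatch.StartNew()
    let result = work ()
    sw.Stop()
    let gen0 = GC.CollectionCount(0) - gen0Before
    printfn "%s: %d ms, %d gen0 collections" name sw.ElapsedMilliseconds gen0
    result

let iterations = 1000000
measure "class token" (fun () -> sumRef iterations) |> ignore
measure "struct token" (fun () -> sumVal iterations) |> ignore
```

Reporting time and collection counts together is the point: fewer allocations only count as a win if the elapsed time does not get worse as the struct grows.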
To be clear, the changes in this function are good, and they are the ones removing the 20MB of tuple allocation. We should definitely accept this part of the change. It's the other changes in the Lexfilter I'm not sure of.
@KevinRansom Struct tuples would be of only marginal use here - they would allow a more local change by using
Here's the test failure on Jenkins, I'm not sure what's causing it.
@dsyme it's not clear to me that a struct tuple would cause any additional copying at that pattern match; after all, the values are not used beyond the actual pattern match. I do agree that the original code would not be improved by adding struct ( ), as it was quite hideous code anyway. I'm not arguing against making the change, but I'm not certain that manually unlinking tuples and rewriting pattern matches throughout the compiler is the way to go. If there are things about tuples and patterns that are inefficient we should fix them ... somehow.
@vasily-kirichenko Could you resubmit (or reopen this & update) the part of this change that's an incontrovertible improvement, i.e. this bit? https://github.com/Microsoft/visualfsharp/pull/1207/files#diff-5a2b2c121409423e80d58b7ffaccd472L4401 . Please :) Thanks!
I'll reopen this for tracking, as we definitely want to take the part mentioned above.
@KevinRansom F# does a pretty good job of eliminating tuples - but I agree, in this sort of code below we should do it automatically
If
    match x.IsLocalRef,y.IsLocalRef with
    | false, false when

    if x.IsResolved && y.IsResolved && not compilingFslib then
The same thing happens a couple of lines down in primValRefEq, and that method shows up in the hot path when I compile Paket.Core.
Ah ok, then yes, that should also be fixed.
we have this fix included as well.
f95ad73 to 678d463
I removed everything but tuple elimination.
678d463 to a23a52c
Reduce allocations further
Just looked at the reported error, and it is interesting. The code in question is:
    match dispatchSlots |> List.filter (fun (RequiredSlot(dispatchSlot,_)) -> OverrideImplementsDispatchSlot g amap m dispatchSlot overrideBy) with
    | [] ->
    if dispatchSlots |> List.exists (fun (RequiredSlot(dispatchSlot,_)) -> OverrideImplementsDispatchSlot g amap m dispatchSlot overrideBy) then
I think this is wrong! There should be a |> not
Can we bind it to a variable? The if condition is super long.
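To make the two review points concrete, here is a minimal sketch with a hypothetical predicate standing in for OverrideImplementsDispatchSlot. Matching a filtered list against [] asks "did nothing match?", so the List.exists replacement has to be negated to keep the same meaning, and binding the result to a name keeps the condition short.

```fsharp
// Hypothetical predicate standing in for OverrideImplementsDispatchSlot.
let implements (slot: string) (overrideName: string) = (slot = overrideName)

// Original shape: List.filter builds an intermediate list only to test it for emptiness.
let noSlotMatchesViaFilter (slots: string list) (overrideName: string) =
    match slots |> List.filter (fun slot -> implements slot overrideName) with
    | [] -> true
    | _ -> false

// Rewritten shape: no intermediate list. List.exists alone answers the opposite
// question, so the result is bound to a name and negated.
let noSlotMatchesViaExists (slots: string list) (overrideName: string) =
    let anySlotMatches = slots |> List.exists (fun slot -> implements slot overrideName)
    not anySlotMatches
```

Dropping the negation inverts the branch, which is consistent with the error-reporting bug discussed here.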
@vasily-kirichenko can you please merge vasily-kirichenko#2 - I think that will solve it
Fix bug in error reporting
yay that helped.
I also refactored
-    let res = ref true
-    let fail exn = (res := false ; if showMissingMethodsAndRaiseErrors then errorR exn)
+    let mutable res = true
+    let fail exn = (res <- false; if showMissingMethodsAndRaiseErrors then errorR exn)

     // Index the availPriorOverrides and overrides by name
     let availPriorOverridesKeyed = availPriorOverrides |> NameMultiMap.initBy (fun ov -> ov.LogicalName)
     let overridesKeyed = overrides |> NameMultiMap.initBy (fun ov -> ov.LogicalName)
@vasily-kirichenko Is this part below cleanup or performance improvements? If the former, let's put it in a separate PR? If the latter, then please look for a way to minimize the diff, e.g. by locally using 2-space indentation so old/new lines match exactly, or some other technique. Thanks!
@vasily-kirichenko This is looking good - just a couple of new comments above.
@vasily-kirichenko @forki There's still a test failure - could you reduce the diff in the changes to
all removed, all fixed
@vasily-kirichenko Thanks. Please put 828dfab in a separate PR?
I tried to rebase with squash and tried to push it as a new branch, but GitHub does not seem to pick up the changes. I've given up on all this. Frankly, this whole allocations story causes only trouble with literally no performance improvements.
@dsyme would it be possible to get a set of guidelines on benchmarking code changes in the compiler, and also on benchmarking pieces of the compiler in isolation (much as one would in a unit-test setup)? I spent significant time yesterday working on #343 and trying to benchmark runs of fsi.exe, but the variation in runtime and the overhead (benchmarking the whole round trip rather than the code change itself) made my attempt a waste of effort. I'm sure all the great F# hackers who have contributed to the compiler have a wealth of knowledge they could share informally in a wiki page or markdown file in this repository to help others following in their steps. We also have people on the Slack channel sharing a few hints about how to use third-party tools (like dotTrace) or what to test against, but this is only ad hoc, "not as informed as ideally" support. Also, it would be amazing to have a benchmarking environment for fsharp; see what the chaps at Xamarin have done: http://open.xamarin.com/benchmarker/front-end/ Also at Mozilla: https://arewefastyet.com/
Vasily, GitHub has serious trouble right now and all kinds of things are
OK, I've managed to open a new PR from the branch; it looks like all the good changes are there.
I like the title of the new PR btw, seems pretty accurate :)
Compiling FSharp.Configuration project:
Before
After
~8.5% fewer allocations. Compilation time has not changed.