Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reduce allocations #1207

Closed

Conversation

vasily-kirichenko
Copy link
Contributor

Compiling FSharp.Configuration project:

Before
baseline

After
detupled lookupunicodecharacters

~8.5% less allocations. Compilation time has not changed.


if x.IsResolved && y.IsResolved && not compilingFslib then
x.ResolvedTarget === y.ResolvedTarget
else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style only: elif to avoid nesting

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@forki
Copy link
Contributor

forki commented May 19, 2016

That is pretty cool.

@vasily-kirichenko
Copy link
Contributor Author

I'm not sure it helps even in long running scenarios like FCS/VFPT. It seems allocations do not cause any performance problems.

@KevinRansom
Copy link
Member

@vasily-kirichenko , @dsyme we are going to implement struct tuples in the next release, Don has already submitted a prototype PR.

Do you think that struct tuples would have an equivalent improvement. I am particularly concerned about manually unwinding idiomatic code such as the pattern matching and it's replacement with complex if then else expressions.

What do you think, would struct tuples solve some of these allocation issues?

Kevin

@forki
Copy link
Contributor

forki commented May 19, 2016

(unrelated to perf:)

If you look at the nested pattern match in https://github.com/Microsoft/visualfsharp/pull/1207/files#diff-5a2b2c121409423e80d58b7ffaccd472L4401 - I think we can argue if that really was better readable. I'm not saying it was not better, but it also wasn't exactly beautiful ;-)

@smoothdeveloper
Copy link
Contributor

Agree with @forki on the if/elif now kind of looking more readable, mostly due to the | _ with nested match at same level which I'm not fond of.

The rest of changes do look idiomatic (non tupled arguments).

Style-wise, for involved conditionals, I tend to be very pedantic by putting each expression on it's own line with the operator at the beginning (a win especially when the expression is long).

If the compiler is not turning inline tuple in match expression it would be worth to have that optimized away, in many cases, tuples are just used as local sugar (I have no idea what the compiler is doing in the optimization phase so sorry if it's obvious question).

@tpetricek
Copy link
Contributor

I think this PR also improves readability - the conditionals using if make more sense to me than the original pattern matching on pairs of Booleans.

@@ -201,7 +201,7 @@ namespace Internal.Utilities.Text.Lexing
let numUnicodeCategories = 30
let numLowUnicodeChars = 128
let numSpecificUnicodeChars = (trans.[0].Length - 1 - numLowUnicodeChars - numUnicodeCategories)/2
let lookupUnicodeCharacters (state,inp) =
let lookupUnicodeCharacters state inp =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is harmless but AFAICS doesn't alter the representation or calls of the function? e.g. for

type C() = 
   let f (x,y) = x + y
   member a.M(b,c) = f (b,c)

we get

.method assembly hidebysig instance int32 
        f(int32 x,
          int32 y) cil managed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vasily-kirichenko Could you remove the changes in prim-lexing.fs please? I'm pretty sure they don't remove any allocations. (If they do then let's discuss further, there must be something I'm missing). Thanks!!

@KevinRansom
Copy link
Member

Well I would agree that the original author went out of his way to make the code unreadable. (comments in the middle of expressions, idiosyncratic indenting, nested pattern matching.)

My real question though is would struct tuples eliminate the need to go through the code eliminating tuples and replacing them with separate arguments?

@dsyme
Copy link
Contributor

dsyme commented May 19, 2016

I'm not sure about this PR. We need to see actual perf benefit.

We can make more and more things structs, and the risk is that they are just getting copied around a whole lot (at possibly extra cost). It's very difficult to work out the amount of struct copying being done by looking at the code unless we are sure of the storage location of the struct (e.g. in an array).

Heap allocations have the advantage that passing the value around is relatively cheap (one word).

In this case, the TokenTup struct is now very, very, very big. I can't even count the number of words.

  • PositionTuple is already quite big and is a struct
  • LexbufState is already quite big and is a struct
  • So TokenTup is the sum of these and more

TokenTup is now so large that it's possible that this actually slows down the lexer. So we need to see concrete performance benefits - not just reduced allocations - to know if this is a good change. Running repeated lexings of a file should help determine where the threshold for useful struct size is.

@dsyme
Copy link
Contributor

dsyme commented May 19, 2016

To be clear, the changes in this function are good and the ones removing the 20MB of tuple allocation.

We should definitely accept this part of the change. It's the other changes in the Lexfilter I'm not sure of.

@dsyme
Copy link
Contributor

dsyme commented May 19, 2016

@KevinRansom Struct tuples would be of only marginal use here - they would allow a more local change by using match (struct (x.IsResolved, y.IsResolved)) with ... but that would only make the code worse. And while it would have removed the allocation it would equally cause more copying.

@dsyme
Copy link
Contributor

dsyme commented May 19, 2016

Here's the test failure on Jenkins, I'm not sure what's causing it.

1) Failed : FSharp-Tests-Typecheck+Sigs.sigs
Error running command 'D:\j\workspace\release_ci_pa---866fd2c3\tests\fsharpqa\testenv\bin\diff.exe' with args 'neg10.err neg10.bsl normalize' in directory 'D:\j\workspace\release_ci_pa---866fd2c3\tests\fsharp\typecheck\sigs'. ERRORLEVEL 1 ERRORLEVEL 1
at NUnitConf.checkTestResult(Result`2 result) in D:\j\workspace\release_ci_pa---866fd2c3\tests\fsharp\nunitConf.fs:line 14

@KevinRansom
Copy link
Member

@dsyme it's not clear to me that a struct tuple would cause any additional copying at that pattern match after all the values are not used beyond the actual pattern match. I do agree that the original code would not be improved by adding struct ( ) as it was quite hideous code anyway.

I'm not arguing to not make change but I'm not certain that manually unlinking tuples and rewriting pattern matches throughout the compiler is the way to go. If there are things about tuples and patterns that are inefficient we should fix them ... somehow.

@dsyme
Copy link
Contributor

dsyme commented May 20, 2016

@vasily-kirichenko Could you resubmit (or reopen this & update) the part of this change that's an incontrovertible improvement, i.e. this bit? https://github.com/Microsoft/visualfsharp/pull/1207/files#diff-5a2b2c121409423e80d58b7ffaccd472L4401 . Please :) Thanks!

@dsyme
Copy link
Contributor

dsyme commented May 20, 2016

I'll reopen this for tracking, as we definitely want to take the part mentioned above.

@dsyme
Copy link
Contributor

dsyme commented May 20, 2016

@KevinRansom F# does a pretty good job of eliminating tuples - but I agree, in this sort of code below we should do it automatically

match expr1, expr2 with 
| true, true -> ...
| _ -> 

If expr1 and expr2 are known not to have side effects then we get good code for this. However in this particular case expr1 and expr2 both include a property access, which is not known to be side-effect-free, so not all optimizations kick in.

match x.IsLocalRef,y.IsLocalRef with
| false, false when

if x.IsResolved && y.IsResolved && not compilingFslib then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the same stuff is happening couple of lines down in primValRefEq - and that method shows up in hot path when I compile Paket.Core

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, then yes, that should also be fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have this fix included as well.

@vasily-kirichenko
Copy link
Contributor Author

I removed everything but tuples elimination.

@forki
Copy link
Contributor

forki commented May 21, 2016

Just looked at the reported error

image

that is interesting.

The code in question is:

module EnumOfString_FSharp_1_0_bug_1743 = begin
    type IA<'a> =
        interface 
            abstract X : unit -> 'a
        end

    type IB<'a,'b> =
        interface 
            inherit IA<'a>
            inherit IA<'b>
        end

    let x = { new IB<_,_> with X() = failwith "" } // this is the reported line
end


match dispatchSlots |> List.filter (fun (RequiredSlot(dispatchSlot,_)) -> OverrideImplementsDispatchSlot g amap m dispatchSlot overrideBy) with
| [] ->
if dispatchSlots |> List.exists (fun (RequiredSlot(dispatchSlot,_)) -> OverrideImplementsDispatchSlot g amap m dispatchSlot overrideBy) then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is wrong! there should be a |> not

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we bind to a variable, the if condition is super long?

@forki
Copy link
Contributor

forki commented May 21, 2016

@vasily-kirichenko can you please merge vasily-kirichenko#2 - I think that will solve it

Fix bug in error reporting
@forki
Copy link
Contributor

forki commented May 21, 2016

yay that helped.

@vasily-kirichenko
Copy link
Contributor Author

I also refactored CheckDispatchSlotsAreImplemented, it was a mess.

let res = ref true
let fail exn = (res := false ; if showMissingMethodsAndRaiseErrors then errorR exn)
let mutable res = true
let fail exn = (res <- false; if showMissingMethodsAndRaiseErrors then errorR exn)

// Index the availPriorOverrides and overrides by name
let availPriorOverridesKeyed = availPriorOverrides |> NameMultiMap.initBy (fun ov -> ov.LogicalName)
let overridesKeyed = overrides |> NameMultiMap.initBy (fun ov -> ov.LogicalName)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vasily-kirichenko Is this part below cleanup or performance improvements? If the former let's put in in a separate PR? If the latter then please look for a way to minimize the diff, e.g. by locally using 2-space indentation so old/new lines match exactly, or some other technique. Thanks!

@dsyme
Copy link
Contributor

dsyme commented May 21, 2016

@vasily-kirichenko This is looking good - just a couple of new comments above.

@dsyme
Copy link
Contributor

dsyme commented May 21, 2016

@vasily-kirichenko @forki There's still a test failure - could you reduce the diff in the changes to CheckDispatchSlotsAreImplemented please so we can see what's changed? Or perhaps put those changes in another PR (since they seem to be causing failures)

@vasily-kirichenko
Copy link
Contributor Author

all removed, all fixed

@dsyme
Copy link
Contributor

dsyme commented May 23, 2016

@vasily-kirichenko Thanks. Please put 828dfab in a separate PR?

@dsyme
Copy link
Contributor

dsyme commented May 23, 2016

@vasily-kirichenko Could you remove the changes in prim-lexing.fs please? I'm pretty sure they don't remove any allocations. (If they do then let's discuss further, there must be something I'm missing). Thanks!!

@vasily-kirichenko
Copy link
Contributor Author

I tried to rebase with squash, tried to push it as a new branch, but github does not seem to peek up the changes. I've given up on all this. Frankly, all this allocations story causes only trouble with literally no performance improvements.

@smoothdeveloper
Copy link
Contributor

@dsyme would it be possible to get a set of guidelines WRT performing benchmarking of code changes in the compiler, and also benchmarking pieces of the compiler in isolation (kind of like one would do in unit tests setup).

I spent significant time yesterday trying to work on #343 and trying to benchmark runs of fsi.exe but the variation in runtime and overhead (benchmarking the whole roundtrip rather than benchmarking code change itself) made my attempt a waste of effort.

I'm sure all the great F# hackers who contributed to the compiler have a wealth of knowledge they could maybe share in a informal way in a wiki page or markdown file in this repository to help others following in their steps.

We also have people on slack channel sharing few hints about how to use 3rd part tools (like dottrace) or what to test against, but this is only adhoc and "not as informed as ideally" kind of support.

Also, it would be amazing to have benchmarking environment for fsharp, see what the chaps at xamarin have done:

http://open.xamarin.com/benchmarker/front-end/

Also @ Mozilla: https://arewefastyet.com/

@forki
Copy link
Contributor

forki commented May 23, 2016

Vasily the github has serious trouble right now and all kinds of things are
failing.
On May 23, 2016 7:17 PM, "Gauthier Segay" notifications@github.com wrote:

@dsyme https://github.com/dsyme would it be possible to get a set of
guidelines WRT performing benchmarking of code changes in the compiler, and
also benchmarking pieces of the compiler in isolation (kind of like one
would do in unit tests setup).

I spent significant time yesterday trying to work on #343
#343 and trying to
benchmark runs of fsi.exe but the variation in runtime and overhead
(benchmarking the whole roundtrip rather than benchmarking code change
itself) made my attempt a waste of effort.

I'm sure all the great F# hackers who contributed to the compiler have a
wealth of knowledge they could maybe share in a informal way in a wiki page
or markdown file in this repository to help others following in their steps.

We also have people on slack channel sharing few hints about how to use
3rd part tools (like dottrace) or what to test against, but this is only
adhoc and "not as informed as ideally" kind of support.

Also, it would be amazing to have benchmarking environment for fsharp, see
what the chaps at xamarin have done:

http://open.xamarin.com/benchmarker/front-end/

Also @ Mozilla: https://arewefastyet.com/


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#1207 (comment)

@vasily-kirichenko
Copy link
Contributor Author

OK, I've managed to open a new PR from against branch, it look like all the good changes are there.

#1215

@smoothdeveloper
Copy link
Contributor

I like the title of the new PR btw, seems pretty accurate :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants