-
Notifications
You must be signed in to change notification settings - Fork 790
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] [CompilerPerf] Seq - the next generation. #2745
Conversation
6afd2a3
to
fb00c6a
Compare
I'm starting to create a "real world" collection of tests, I'm not sure if these are what you were thinking of previous comment for adding to the build? So far I'm mining @theburningmonk Euler Problem set efforts. I'm basically coping them verbatum, except where they exclusively don't use Seq, in which case I am modifying them (such as problem 1, which originally was on List, but I changed to Seq) where they contain a combination of Seq as well as List or Array items I have left them in the original state. The collated tests are temporarily here, until I find a better home for them. I'll do a follow up post with current timings, which I will update. Note: the site of tests are not particularly optimised. Performance was obviously not a consideration when this code was written. I might provide slightly modified versions that do run faster at some stage, especially in cases where we are seeing some negative performance impacts. |
TheBurningMonk - Euler - 5 This is worse under both 32-bit and 64-bit. The reason is that the inner loop is the function isEvenlyDividedByAll which calls Seq.forall on numbers, which is an array, and the forall condition is broken regularly very early in the array. In the new Seq this requires a wrapper class for the Array, and creation of a Fold object. It can run via executeThin which makes it a little less heavy, but still requires some initialization and cleanup. Some alternative coding are available here. 5a - This modifcation calls Seq.ofArray up front so the type checks don't need to be performed at each step. 32-bit
64-bit
So after either change the new version is significantly faster again. |
Just so I'm not wasting my time; is this the sort of information that you are after? Obviously still need to do the full matrix of all function, but as an adjunct... |
@manofstick It's fascinating - and yes, very helpful! |
d987910
to
0527641
Compare
Latest timings (link to my data collation function so I can create this more easily next time) ... deleted ... more results below ... |
3773ea8
to
882ee69
Compare
OK; had a bit of a speedbump with nested concat, but think back on track. Anyway, gives an opportunity to boast about performance. In a particularly nasty case for the old Seq I present:
Then the time difference between the old and the new is (after warm up) 60ms vs 16000s! So over 250x faster... |
882ee69
to
5726129
Compare
An initial test indicates that this doesn't hurt the overall perf of the compiler
v. some recent times on master:
That may even indicate an improvement, e.g. the I'll do more test runs but they above figures are vaguely indicative and would show up a big regression (note they are measuring wall-time for process completion) |
@dsyme well that's promising! I have made a couple of architectural changes. I have pushed the "bare bones" types up in to prim_types, which has then allowed me to derive seq computational expressions and will the range operators from ISeq. There is still room for improvement in regards to the computational expressions as really the ISeq root shouldn't need to call any virtual functions to iterate the state machine, but I would need to do some slight modifications to the state machine generation logic, which I'm unwilling to do at this current juncture. I was keen to add ISeq to list though, to save the need for the wrapper, but I couldn't even get the proto-compiler to compile even with just empty functions (I ponder if this is related to generic function on generic interface in prim types? Anyway I don't really know, nor did I attempt any sort to find out. I thought someone more knowledgeable in that area (ummm, you!) might come to a quicker diagnosis.) Anyway, should be enjoying my Easter in the mountains, damn modern telecommunications. |
Yes you should. Paste a picture of the mountains in here please. Then turn your phone and computer off. :) |
5726129
to
5f4bc24
Compare
Back on deck; I've obviously broken something because the build isn't finishing... Will investigate when I get an opportunity. |
5f4bc24
to
2baa6a2
Compare
Looking at those again, I think nearly all are explicit uses of GetEnumerator. |
Hmmm. Look, its not going to kill me if you don't slurp this PR in. I think it was a neat re-imagining of how Seq could be implemented, but everything is a matter of trade offs. Personally those trade offs have meant that pretty much all Seq code in our (my works) code base reflexively gets written as seq comprehensions, rather than using Seq composition, but I might be a little more performance conscious than most (but obviously not too much, or I would ditch the whole stuff and write my code in Rust or something!) One of the most egregious cases - Seq.init(Infinite)? - has been lessened a bit by cleaning up of lazy in .net core (which reminds me that I should submit a PR to add "inline" to LazyExtensions.Create to avoid extra creation of superfluous object.) (Don't get me wrong, even after both theses are done this is still a wildly dubious Seq.init implementation that really should never have existed and basically still renders it useless if you care about performance at all...) This really is a case of the whole or none - mixing and matching between "normal" pull seqs and these push seqs comes at a performance penalty. A "smaller" improvement could just make filter/map special cases - like Linq does. Or just accept status quo. Hmmm. |
Oh, and with the optimization changes that have occurred within the JIT it's probably worth rerunning the test suite, as they may have had effects for good or for ill... |
@manofstick Possible minimal cuts:
|
@manofstick If it's easy for you to re-run the performance tests that would be great. |
FWIW, I'd like at least some of these to be merged, as building this by hand is rather tricky to get right and the performance gains are significant in quite a few scenarios. I'd suggest to start with a minimal set and see what feedback that brings. |
(0) easiest, but defeatist and a little luddite. So if you're happy with (1) then so am I. The final question would be do you think this has any relevance beyond Seq? i.e. should it exist in System.Linq (i.e. would it ever natively be implemented on ResizeArray etc.)? I'm think probably not, because it is a little bit more than just the interface (i.e. you have to support the Activity object, which isn't a role model of encapsulation...) I'm just asking because once it's embedded in FSharp.Core then this possibility is off the table. |
@manofstick FYI: "We present the first approach that represents the full general- [1] https://strymonas.github.io/ |
@zpodlovics this implementation is closer to Nessos's Stream library but attempting to retain full compatibility with the f# Seq implementation. It uses no runtime compilation. |
@manofstick True, and sorry for the noise. I just wanted to share some additional ideas that may worth to explore. |
I think this is my biggest issue with what you've done. |
Hahaha, it did make me feel dirty! Although this could be better if it was defined in C#, because they i could of used the Hmm. And really the whole State thing could be thrown away - that was just done to make up for the fact that fsharp object expressions when they captured mutable values they wrapped them in FSharpRef even when they are fully scoped within the object expression - so this was done to minimize garbage (the same issue occurs with general lambdas, seq expression, etc.) A modification to the compiler could eradicate the need for this - making the code simpler, and possibly providing some wider benefits as well. Hmm... Thoughts on any of this? |
The conversation here dried up, so just trying to revive it... Not that I'm finding much time (hmm... nothing seems to change!) but I am keen for the external nuget FastSeq package as previously discussed. So I think you were in agreement there? So maybe I'll close this PR, and create a new one with just the changes in In the FastSeq I would expose both the Seq namespace overlay as well as the inline ISeq namespace (i.e. so you could modify your code to make use of the inlining versions) Any thoughts on the root namespace of FastSeq? namespace |
I'm still not sure. What I'd love is if there was some simple truly abstract interface that could admit multiple future implementations. The types in prim-types.fsi are still very concrete. This sort of thing feels acceptable:
If we could have something as small as that and there was basically no public implementation code that would be great (it would have an internalized implementation on list and ranges) . But I can't see how to make two implementations (one internal and one external) cooperate together. What happens if the whole thing goes in a "FastSeq" package? How problematic is that? |
OK, when I get a chance I'll have a little play to see if I can bash it into a more acceptable shape. |
Hmmm. This was a good piece of work, but once again if it had been there in the early days I think it would have been good, but now... Hmmm. A bit too invasive. Maybe I'll try again at some stage to do another effort that is less grandiose... |
@manofstick, sorry to hear this project is being scrapped :(. It's a little hard to follow all the twists and turns in this set of issues. Can you point me to the best way to start using this work if it's not going to be included in FSharp.Core? Is there a Nuget package that has everything needed? |
I think this is important work, and for sure I have enjoyed seeing it evolve. seq is so fundamental a part of our programming model that any attempt to modernize it was going to be a tough row to hoe. Do you think it is better to close it, and start again from ground up or just leave it around for us to mull over a while. This is my way of saying, it is not a PR I regarded as orphaned, merely 'bloody hard'. Kevin |
For anyone who was interested in this, this has been revised in a somewhat morphed form into an (almost there!) complete re-implementaion of It's got a partially complete f# wrapper, and an exceedingly rudimentary example. Hopefully over the next few weeks/months these will become more fleshed out. I haven't current fashioned a nuget package suite, but will do so when I get time. The constant bane of my life :-) Anyway, any calls for assistance have usually fallen on deaf ears, but hey, worth another call out if anyone wants help, in whatever fashion! |
@manofstick, I'd love to help, where can I start, any particular area? It would be great to move this forward. Maybe this should be reopened? |
I don't think this should be reopened (at this stage at least). It's too big a change to be sucked into the main repository without the code being battled tested for a while... (and people at MS-HQ have never really embraced my big PRs, I have always wanted them to be joint efforts, because personally I find them quite important, but alas, other than some code reviews, I have yet to gain any traction... Really just to the point where I have given up) Anyway, come over to Cistern.Linq and help out over there if you wish. There the core is written in C#, but I have F# wrapper with the goal of being highly API compatible with Seq (I currently has stubs to Seq filling in missing parts). Here is an of usage . Anyway, I am still pondering how highly API compatible, as there are some different decisions, such as, for example, I don't throw exceptions on So if you want to help, from least knowledge to most, you could:
Any assistance would be appreciated. Thanks, |
...and just for jiggles I modified one of my csharp benchmarks into f#... (
And so you can see why I think Seq should be replaced - in this not particularly contrived example we run about 4 times faster... |
@manofstick Wow, impressive! I'm currently out of the country, but should be behind my desk next week and I'll have a look at the tasks you mentioned. I'll probably download, compile and run it first to get a feeling for it. Awesome work! |
A rewrite of Seq. This will be a branch which is maintained from the head of #2587, which is the underlying tech.
Things to consider about to exist [WIP] status (unedited from comments below):
The main one is should ISeq be exposed beyond it's bare minimum rrequirements. I think it has value, especially as I have made Seq a bit more of the swiss-army knife, using the ISeq, List or Array modules are most appropriate.
[dsyme] Resolution: No. I think the focus for now should be on having ISeq be as private as possible while still getting performance gains
Should an FSharpFunc be part of the public ISeq interface? It's relatively easy to convert that signature of Fold to take an interface for the Folder creation.
[dsyme] Resolution: No. I think the focus for now should be on having ISeq be as private as possible while still getting performance gains
I have for the most part really tried to minimize object construction. This could be further reduced to collapse combined ITransformFactory objects into the same objects that implement ISeq. It comes at the expense of an extra generic parameter to those objects though, so I think it would just make more runtime code bloat. I tried it and it also slowed code down, so I didn't go ahead.
[dsyme] ok, marking as resolved, we won't do this
On the flipside of runtime code bloat I could move the main iteration code out of the fold methods so that that code doesn't rely on the Result generic parameter (and adding another layer in between Activity and Folder which doesn't take that parameter). I didn't take time to measure runtime size, but it did slow things down a tad, so I didn't go ahead.
[dsyme] ok, marking as resolved, we won't do this
I'm not really happy with the Thin and Enumerable monikers I have used. So maybe a alternative consistent naming should be applied.
[dsyme] ok, I'll make comments on naming as I do the code review
Decide if helper classes should be public. I'm thinking the IdentityFactory and ComposeFactory (at the moment, even internally, I have them duplicated - they are trivial implementations, but maybe should be exposed externally)
[dsyme] What are the pros here?
[manofstick] Well as ISeq will be private, this is moot.
Possibly add a
register
function which would add a hook into ofSeq. This would allow alternative collections types to then expose the ISeq functionality. i.e. at the moment there is ISeq.ofResizeArrayUnchecked to utilize the ISeq functionality for ResizeArray, but ISeq wouldn't get used otherwise. Register might take something likeseq<'a> -> option<ISeq<'a>>
and just iterate through.[dsyme] Resolution: No. We don't want this kind of type-registry state in FSharp.Core
Possibly add a
force
function. This is kind of the equivalent oftoList
,toArray
etc. but more flexible and efficient. i.e. if it already is an appropriateThin
ISeq then that type is just returned directly, if it a Delayed type then it is executed and that is returned, or otherwise it would populate a ResizeArray and the ISeq.ofResizeArrayUnchecked would be run.[dsyme] Resolution: I think we can do without this for now
Decide if the functionality in EnumerableBase (bad name!) should be exposed. This could allow Length to be optimized for alternative collections
[dsyme] Resolution: I think we can do without this for now
Decide if ISeq should carry all the API of List/Array/Seq. The previous release just got all those APIs in sync, but maybe we should only have the functions that actually make sense at that level... I'm thinking
findBack
etc...[dsyme] Resolution: No. I think the focus for now should be on having ISeq be as private as possible while still getting performance gains
ISeq was approached as the inner implementation of Seq, as as such makes extensive use of inline. Maybe wind back in some cases if considered excessive.
[dsyme] Yes, this is a concern, need to look at this and FSHarp.Core size