
Collaboration #1

Open · manofstick opened this issue Oct 30, 2020 · 2 comments

@manofstick (Owner)

Hi anyone!

Looking for anyone else who wants to:

  • help complete the whole surface area (commenting out functions as I do them here)
  • write more benchmarks
  • expand testing to ensure both up and down pipes are covered
  • think of any further optimizations
  • offer better naming
  • organise/cleanup project
  • whatever else

@reegeek looking at you! :-) I think my underlying architecture here offers greater benefits, but obviously I'm not as far down the path as you are.

@reegeek (Contributor) commented Oct 31, 2020

Hi,
StructLinq started as a playground for exploring many aspects of C# and of running an OSS project.
And now, I think I can achieve all of LINQ's functionality with better performance and almost zero allocation.
I am doing this in my spare time.

For the moment I can add a StructLinq version to your benchmarks and have a look at your underlying architecture.

Regards.

@manofstick (Owner, Author) commented Oct 31, 2020

That'd be great!

Well, you can use this interaction as an additional playground for OSS: interacting with people, which I think is the hardest bit! You can create the greatest thing in the world, but if you can't get people along for the ride then it doesn't matter one little bit...

And I too would consider this a playground; it's done in my spare time (although it's in the "obsession" phase at the moment, where it's eating into family time too, but hey, after months of lockdown I need some sort of escape :-P)

Anyway, I'm not sure if anyone will actually use this in real projects, so it's definitely not the place to be hanging out if you want fame and fortune, but maybe just for fun and/or for bending your brain!

So architecturally it's a hybrid design, supporting both a push and a pull model.

The pull model is the more flexible one and can be used to create everything (and originally I used it for the aggregation functions too). It's supported by the CreateObject(Ascent|Descent)? functions across INode & INodes. The Ascent/Descent object creation is used so we can delay knowledge of the enumerator type, hence not having to carry it around with us at the top level; that also means we can, at the cost of a single allocation but with no effect on pipeline speed, seamlessly switch to being passed around as an IEnumerable.
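In rough terms, the pull side can be sketched like this (a minimal, hypothetical illustration, not the actual INode/CreateObjectAscent machinery; the real code composes nodes generically rather than hardcoding a Select stage):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// A value-type "pull" enumerator: MoveNext/Current are non-virtual calls,
// so the JIT can inline the whole pipeline when it is consumed as the
// concrete struct. Boxing it behind IEnumerator<U> costs one allocation.
public struct SelectEnumerator<T, U> : IEnumerator<U>
{
    private readonly T[] _source;
    private readonly Func<T, U> _selector;
    private int _index;
    private U _current;

    public SelectEnumerator(T[] source, Func<T, U> selector)
    {
        _source = source;
        _selector = selector;
        _index = -1;
        _current = default;
    }

    public bool MoveNext()
    {
        if (++_index >= _source.Length)
            return false;
        _current = _selector(_source[_index]);
        return true;
    }

    public U Current => _current;
    object IEnumerator.Current => _current;
    public void Reset() => _index = -1;
    public void Dispose() { }
}
```

Consumed directly as the struct, the pipeline stays devirtualized; handed out behind the interface, only the box itself is allocated up front.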

The push model is supported by CreateObjectViaFastEnumerator, which just stacks all the stages together directly, but can't be delayed. It turned out to be "fast enough" that I deprecated the specific optimizations for various patterns (i.e. the existing System.Linq and my prior Cistern.Linq "recognised" certain patterns like Where.Select.ToList and so "hardcoded" them).
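The push side looks something like the following (again a hypothetical sketch with invented names; CreateObjectViaFastEnumerator composes the real stages as value types, not classes):

```csharp
using System;

// Each stage pushes elements into the next; the whole chain is driven
// by a single tight loop over the source, with no per-element MoveNext
// calls. Because the loop runs immediately, the result can't be delayed.
public interface IPushSink<T> { void Push(T item); }

public sealed class MaxSink : IPushSink<int>
{
    public int Result = int.MinValue;
    public void Push(int item) { if (item > Result) Result = item; }
}

public sealed class WhereSink<T> : IPushSink<T>
{
    private readonly Func<T, bool> _predicate;
    private readonly IPushSink<T> _next;
    public WhereSink(Func<T, bool> predicate, IPushSink<T> next)
    {
        _predicate = predicate;
        _next = next;
    }
    public void Push(T item) { if (_predicate(item)) _next.Push(item); }
}

public static class PushDemo
{
    // Select.Where.Max expressed as one driving loop over the source.
    public static int SelectWhereMax(int[] data, Func<int, int> selector, Func<int, bool> predicate)
    {
        var max = new MaxSink();
        var where = new WhereSink<int>(predicate, max);
        foreach (var x in data)
            where.Push(selector(x));
        return max.Result;
    }
}
```

For example, `PushDemo.SelectWhereMax(new[] { 1, 2, 3, 4 }, x => x * x, x => x % 2 == 0)` returns 16: the squares 1 and 9 are filtered out, leaving 4 and 16.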

Finally there is CheckForOptimization, which is my way of replicating "interface-interrogation" without using interfaces; i.e. this is what was used to find the Where.Select.ToList pattern (which I have now removed, although I still use it for other things). To see how it can be used to find a pattern, you can go back to the changeset where I removed it.
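The idea can be caricatured as follows (names and shapes invented purely for illustration; the real CheckForOptimization works over value-type nodes with a richer request type):

```csharp
// Hypothetical sketch: each node answers a query about its own shape via
// a virtual method, instead of the consumer casting to marker interfaces.
public enum PipelineShape { Unknown, WhereSelect }

public abstract class Node
{
    // Nodes that don't recognise any special pattern answer Unknown.
    public virtual PipelineShape CheckForOptimization() => PipelineShape.Unknown;
}

public sealed class WhereSelectNode : Node
{
    public override PipelineShape CheckForOptimization() => PipelineShape.WhereSelect;
}

public static class Consumer
{
    // A ToList-style consumer can pick a fused fast path when the
    // pipeline admits to being Where.Select, without any type tests.
    public static string Describe(Node node) =>
        node.CheckForOptimization() == PipelineShape.WhereSelect
            ? "fused Where.Select path"
            : "general path";
}
```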

Anyway, I should write this down in a better forum than this issue, but time, time, time... I'll get there :-)

So it would be great if you could add your stuff into Benchmarks; I added some into the ValueLambdas area after I stumbled across your project! I thought LinqAF was the only previous value-based LINQ, but yeah, I obviously haven't spread the net wider, so your assistance would be fabulous.

Oh, I don't have this in the repository yet, but another vector to add to the benchmarks is the organisation of the source data in relation to the logic (which makes benchmarking super hard!). I was playing with Select.Where.Max (as I was beating handcoded, which was puzzling me, but it appears that in the case I was working on it was actually true!), so here is a summary of my findings:

(From https://github.com/manofstick/Cistern.ValueLinq/tree/main/Benchmark/ValueLambdas/SelectWhereMax, although I haven't yet added these different generators to the code and run the results in a systematic fashion.)

Generator: `for (var i = 0; i < size; ++i) yield return i;`

Example: `[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,...]`

| Method | Length | ContainerType | Mean | Error | StdDev | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|
| CisternValueLinq | 1000000 | Array | 2.766 ms | 0.0234 ms | 0.0219 ms | 1.50 | - | - | - | - |
| Handcoded | 1000000 | Array | 1.839 ms | 0.0063 ms | 0.0049 ms | 1.00 | - | - | - | - |

Generator: `for (var i = 0; i < size; ++i) yield return -i;`

Example: `[0,-1,-2,-3,-4,-5,-6,-7,-8,-9,-10,-11,-12,-13,-14,-15,...]`

| Method | Length | ContainerType | Mean | Error | StdDev | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|
| CisternValueLinq | 1000000 | Array | 3.085 ms | 0.0162 ms | 0.0127 ms | 0.47 | - | - | - | - |
| Handcoded | 1000000 | Array | 6.540 ms | 0.0910 ms | 0.0851 ms | 1.00 | - | - | - | - |

Generator: `for (var i = 0; i < size; ++i) yield return i * ((i & 1) == 1 ? 1 : -1);`

Example: `[0,1,-2,3,-4,5,-6,7,-8,9,-10,11,-12,13,-14,15,-16,17,...]`

| Method | Length | ContainerType | Mean | Error | StdDev | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|
| CisternValueLinq | 1000000 | Array | 2.745 ms | 0.0083 ms | 0.0077 ms | 1.49 | - | - | - | - |
| Handcoded | 1000000 | Array | 1.838 ms | 0.0072 ms | 0.0064 ms | 1.00 | - | - | - | - |

Generator: `for (var i = 0; i < size; ++i) yield return (size-i) * ((i & 1) == 1 ? 1 : -1);`

Example: `[-100,99,-98,97,-96,95,-94,93,-92,91,-90,89,-88,87,-86,...]`

| Method | Length | ContainerType | Mean | Error | StdDev | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|
| CisternValueLinq | 1000000 | Array | 2.922 ms | 0.0091 ms | 0.0076 ms | 0.49 | - | - | - | - |
| Handcoded | 1000000 | Array | 5.948 ms | 0.0349 ms | 0.0327 ms | 1.00 | - | - | - | - |

Generator: `for (var i = 0; i < size; ++i) yield return (i < size/2) ? i : -i;`

Example: `[0,1,2,3,4,5,...,-50,-51,-52,-53,-54,-55,...]`

| Method | Length | ContainerType | Mean | Error | StdDev | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|
| CisternValueLinq | 1000000 | Array | 2.839 ms | 0.0221 ms | 0.0184 ms | 0.71 | - | - | - | - |
| Handcoded | 1000000 | Array | 3.990 ms | 0.0258 ms | 0.0241 ms | 1.00 | - | - | - | - |

Generator: `for (var i = 0; i < size; ++i) yield return (i < size/2) ? -i : i;`

Example: `[0,-1,-2,-3,-4,-5,...,50,51,52,53,54,55,...]`

| Method | Length | ContainerType | Mean | Error | StdDev | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|
| CisternValueLinq | 1000000 | Array | 2.832 ms | 0.0128 ms | 0.0113 ms | 0.70 | - | - | - | - |
| Handcoded | 1000000 | Array | 4.069 ms | 0.0356 ms | 0.0333 ms | 1.00 | - | - | - | - |

So yeah, CisternValueLinq had consistent speed regardless of the data, but the handcoded version's speed changed based on the ordering of the data. So we're getting into the realms of processor branch-prediction effects, which is interesting, but it also means the tests are highly sensitive to the input data.
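To make the branch-prediction point concrete, here are two ways of writing a handcoded Max reduction (a sketch; whether the second form actually compiles down to a branchless conditional move depends on the JIT and is not guaranteed):

```csharp
using System;

public static class MaxDemo
{
    // The taken/not-taken pattern of the `if` follows the data, so uniform
    // or sorted inputs predict well while other orderings can mispredict.
    public static int MaxBranchy(int[] xs)
    {
        int max = int.MinValue;
        foreach (var x in xs)
            if (x > max) max = x;
        return max;
    }

    // Math.Max may be lowered to a conditional move, giving roughly the
    // same running time regardless of input ordering.
    public static int MaxBranchless(int[] xs)
    {
        int max = int.MinValue;
        foreach (var x in xs)
            max = Math.Max(max, x);
        return max;
    }
}
```

Both loops compute the same result; only the shape of the machine code (and hence the sensitivity to data ordering) differs.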

(And I'm guessing my Intel i7 Nehalem from 2008 probably isn't the most representative of current CPU branch-prediction technology. So I guess I'm saying "your results may vary!")
