-
Notifications
You must be signed in to change notification settings - Fork 416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve GroupAdjacent #1032
Comments
Sorry, but I am having trouble to understand the problem and what you're asking for. If this is the input:
What do you expect the output to be? |
Basically, I added this to my project. It's a slightly less efficient way of doing Proposed Solution 1: IEnumerable<IGrouping<G, E>> GroupAdjacent<T, K, G, E>(
this IEnumerable<T> source,
Func<T, K> key,
Func<T, G> group,
Func<T, E> element)
{
K cachedKey = default;
G cachedGroup = default;
return source.GroupAdjacent(
x =>
{
var nextKey = key(x);
if (nextKey != cachedKey || cachedGroup is null)
(cachedKey, cachedGroup) = (nextKey, group(x));
return cachedGroup;
},
element
} The key point is that I don't want to create a |
Thanks for sharing the code. It helps to understand the mechanics a bit. So if the grouped elements were exposed as a list instead of a sequence, like your proposed solution 2, then it would solve your problem?
|
Yes, if I like Proposed Solution 1 a little more because it's simpler on consumer side. With Solution 2 consumer has to take care of the IGrouping construction. Independently of Solution 1 or 2: if buffering is not an implementation detail that will change, MoreLinq might as well expose I think it's fair to commit to buffering, because streaming is a valuable alternative but it comes with strict constraints on the way you consume it. |
Thanks for the clarifications. |
ah, I had forgotten this :) What I want can be implemented on top of IEnumerable<IGrouping<G, E>> GroupByAdjacent<T, K, G, E>(
this IEnumerable<T> source,
Func<T, K> key,
Func<T, G> group,
Func<T, E> element)
{
return source.AggregateAdjacent(
key,
(k, x) => new Grouping(group(x)),
(k, g, x) =>
{
g.Add(element(x));
return g;
},
(k, g) => g
);
} |
@jods4 Yes, that's it! |
What about the following with the existing api: seq.GroupAdjacent(
x => (x.aKey, x.a1, x.a2, x.a3),
x => new B(x.b1, x.b2, x.b3),
(k, g) => (new A(k.aKey, k.a1, k.a2, k.a3), g)); Unless |
@viceroypenguin |
Problem
GroupAdjacent
doesn't let you cheaply compare a small key (e.g. single field) to define groups, then use different selectors for the group key and elements.For example, consider an
IEnumerable
with the following fields:You want to efficiently group by
aKey
, then usex => A(x.AKey, x.a1, x.a2, x.a3)
as group key andx => B(x.b1, x.b2, x.b3)
as group elements.One option that works today is
GroupAdjacent(x => A(...), x => B(...))
and have a comparer or overrideEquals
onA
.The drawback is that one
A
instance is generated for every element in the sequence, when you only need a single one per group.Proposed solution 1
A new overload?
GroupAdjacent(F<X, A> keySelector, F<X, B> elementSelector, F<X, Y> equalitySelector)
Proposed solution 2
GroupAdjacent
doesn't stream its results, it actually buffers each group in a list. Internally these areIList
but only exposed asIEnumerable
.Exposing the
IList
would allow the following solution:Today, this is doable with a cast from
IEnumerable
toIList
but it can break with any MoreLINQ release as it relies on an implementation detail.Alternatively, using
elements.First()
instead ofelements[0]
will always work but can be quite dangerous if the implementation ofGroupAdjacent
becomes non-buffering, aselements
is enumerated twice.What about non-buffering?
When working with long groups, it'd be interesting to not buffer the groups.
Maybe a new API
StreamGroupAdjacent
that has an added constraint that group elements must be consumed in order and only once (but not necessarily fully).Predicate vs IEqualityComparer
It'd also be nice to have overloads accepting
Func<X, X, bool>
instead ofIEqualityComparer
.GroupAdjacent
is relies on adjacency, so uses only equality and never hashes.The text was updated successfully, but these errors were encountered: