Skip to content

Specialize Order Distinct Iterator in LINQ #120125

@henriquewr

Description

@henriquewr

The current implementation of Distinct uses a HashSet to verify if the item exists, on collections that we know that is "pure ordered" this approach is a little bit memory inefficient

"Pure ordering" means that always equal elements will be side by side, like [1,1,2,2,3,3,3,3,4], and instead of the "costly" HashSet implementation, we can simply verify if the last element returned is different than the current

Like that:

public static IEnumerable<T> PureOrderedDistinct<T>(IOrderedEnumerable<T> pureOrdered)
{
    using var enumerator = pureOrdered.GetEnumerator();

    if (!enumerator.MoveNext())
    {
        yield break;
    }

    var lastReturnedItem = enumerator.Current;
    yield return lastReturnedItem;

    while (enumerator.MoveNext())
    {
        var currentItem = enumerator.Current;
        if (!EqualityComparer<T>.Default.Equals(lastReturnedItem, currentItem))
        {
            lastReturnedItem = currentItem;
            yield return lastReturnedItem;
        }
    }
}

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions