Make `uniqued()` lazy by default #71

timvermeulen · 2021-01-27T18:16:10Z

Adds the `Uniqued` sequence, which lazily produces the unique elements of a sequence. `uniqued()` and `lazy.uniqued(on:)` now return a `Uniqued` instance; the eager `uniqued(on:)` still produces an array.
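
To illustrate the idea, here is a minimal sketch of a lazily uniquing sequence built only on the standard library. The names `LazyUniqued` and `lazilyUniqued()` are hypothetical and chosen for this example; this is not the `Uniqued` type added by the PR, whose implementation may differ.

```swift
// Hypothetical sketch; `LazyUniqued` is an illustrative name, not the PR's `Uniqued`.
struct LazyUniqued<Base: Sequence>: Sequence where Base.Element: Hashable {
    let base: Base

    struct Iterator: IteratorProtocol {
        var baseIterator: Base.Iterator
        var seen: Set<Base.Element> = []

        mutating func next() -> Base.Element? {
            // Pull from the base until an element that has not been seen turns up.
            while let element = baseIterator.next() {
                if seen.insert(element).inserted {
                    return element
                }
            }
            return nil
        }
    }

    func makeIterator() -> Iterator {
        Iterator(baseIterator: base.makeIterator())
    }
}

extension Sequence where Element: Hashable {
    // Hypothetical method name, picked to avoid clashing with the real `uniqued()`.
    func lazilyUniqued() -> LazyUniqued<Self> {
        LazyUniqued(base: self)
    }
}

// Partial computation: only as much of the base is consumed as `prefix(2)` needs.
let firstTwo = Array([3, 1, 3, 2, 1, 4].lazilyUniqued().prefix(2))

// Going from lazy to eager is a matter of constructing an Array.
let all = Array([3, 1, 3, 2, 1, 4].lazilyUniqued())
```

Creating the wrapper does no work up front; the set of seen elements grows only as the iterator advances.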

Checklist

- I've added at least one test that validates that my change is working, if appropriate
- I've followed the code style of the rest of the project
- I've read the Contribution Guidelines
- I've updated the documentation if necessary

kylemacomber

This all looks good to me.

My only question is how much folks are attached to uniqued() returning an Array. I think that we probably want to take uniqued() in this lazy direction (as opposed to, say, adding lazy.uniqued()), because it's more consistent with the rest of Algorithms and stdlib, but whenever I see this in the wild it's returning an Array.

Sources/Algorithms/Unique.swift

Tests/SwiftAlgorithmsTests/UniqueTests.swift

xwu · 2021-01-31T16:14:42Z

Guides/Unique.md

 }
 ```

 ### Complexity

-The `uniqued` methods are O(_n_) in both time and space complexity.
+The eager `uniqued(on:)` method is O(_n_) in both time and space complexity.
+The lazy versions are O(_1_).


I think it's a very salient consideration mentioned by @kylemacomber that most uses seem to expect an Array, in which case a .lazy.uniqued design would make more sense.

Additionally, this discrepancy here, where uniqued(on:) isn't lazy by default but uniqued would be, seems like it invites confusion. I certainly would not expect that supplying a custom predicate would change the complexity or behavior so fundamentally, and I can see that issue arising when people make changes to their code and encounter this difference. Therefore, if it makes sense to make uniqued lazy, then I think the same change should be applied to uniqued(on:).

timvermeulen · 2021-02-03

Do people expect uniqued() to return an array, or do implementations usually simply return an array because it's easier to implement that way? I genuinely don't know the answer to this.

I share your concern about the discrepancy between uniqued() and uniqued(on:), but I do think that viewing them in isolation, this change is consistent with other sequence operations in the standard library. Operations that can be lazy without having to compromise (except perhaps on the return type) typically are, even when called on a sequence not conforming to LazySequenceProtocol. Collection's joined() and reversed() fall into this category, and I argue that uniqued() does as well. uniqued(on:) does not, because making it lazy would require the closure to be escaping and non-throwing.
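
The point about the closure can be sketched as follows. Both `eagerlyUniqued(_:on:)` and `LazyUniquedOn` are hypothetical names for this illustration, not the library's API. The eager helper only uses the closure during the call, so the closure can be non-escaping and may throw; a lazy wrapper has to store the closure for later.

```swift
// Hypothetical eager helper: the closure never outlives the call,
// so it can be non-escaping and throwing (`rethrows`).
func eagerlyUniqued<S: Sequence, Key: Hashable>(
    _ elements: S,
    on key: (S.Element) throws -> Key
) rethrows -> [S.Element] {
    var seen: Set<Key> = []
    var result: [S.Element] = []
    for element in elements {
        if try seen.insert(key(element)).inserted {
            result.append(element)
        }
    }
    return result
}

// Hypothetical lazy wrapper: the closure is stored, so it must be escaping,
// and it cannot throw because `IteratorProtocol.next()` is not a throwing method.
struct LazyUniquedOn<Base: Sequence, Key: Hashable> {
    let base: Base
    let key: (Base.Element) -> Key
}

struct ParseError: Error {}

func parsed(_ text: String) throws -> Int {
    guard let value = Int(text) else { throw ParseError() }
    return value
}

// "1" and "01" map to the same key, so only the first of them is kept.
let uniqueByValue = (try? eagerlyUniqued(["1", "01", "2", "1"], on: parsed)) ?? []

// A throwing key function aborts the eager call; `try?` turns that into nil.
let failed = try? eagerlyUniqued(["1", "x"], on: parsed)
```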

xwu · 2021-02-03

Do any other algorithms in this library or the standard library differ in laziness depending on the presence of a custom predicate?

timvermeulen · 2021-02-04

I believe the pair of operations that comes closest is joined() being lazy and flatMap { $0 } being eager while effectively doing the same thing. Of course they don't have similar names like uniqued() and uniqued(on:) do.

This is a fairly unique situation because other potential pairs lack one of the variants. chunked(by: ==) and compactMap { $0 } have no corresponding closure-less version, and operations like zip, reversed, and combinations have no versions that do take a closure.
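
The standard-library behavior referred to in this thread can be checked with a short example. The explicit type annotations are there only to make the return types visible.

```swift
let nested = [[1, 2], [3], [4, 5]]

// `joined()` returns a lazy wrapper around `nested`; nothing is flattened yet.
let lazyFlat: FlattenSequence<[[Int]]> = nested.joined()

// `flatMap { $0 }` takes a closure and builds the array immediately.
let eagerFlat: [Int] = nested.flatMap { $0 }

// `reversed()` on a bidirectional collection is likewise a lazy view,
// even though `.lazy` was never written.
let reversedView: ReversedCollection<[Int]> = eagerFlat.reversed()

// Both spellings produce the same elements once the lazy one is materialized.
let sameElements = Array(lazyFlat) == eagerFlat
```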

kylemacomber · 2021-03-30

Yeah, I think joined()/flatMap { $0 } and compacted()/compactMap { $0 } (see #112) serve as good precedent for similar lazy/eager pairs.

To try to articulate the philosophy:

- Anything that can efficiently be lazy should be lazy, because (i) it can avoid extra work (e.g. unnecessary computation or allocation) and (ii) it's easy to go from lazy to eager by constructing an Array, but it's impossible to go the other way.
- Require an explicit .lazy for algorithms that take a closure, to emphasize that the closure (i) will not run immediately and (ii) may run more than once per element in the collection, which can introduce a surprising vector for error.
- An algorithm shouldn't be lazy if being lazy would pessimize its runtime complexity. For example, a lazy reversed adapter over a plain Collection would be absurd because looping over all of its indices would be O(n²).
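
The second point can be observed with the standard library's `lazy.map`, used here purely as an illustration. The closure does not run when the lazy sequence is created, and it runs again on every pass over the result.

```swift
var calls = 0

// The closure is stored, not executed, when the lazy sequence is created.
let doubled = [1, 2, 3].lazy.map { (value: Int) -> Int in
    calls += 1
    return value * 2
}
let callsAfterCreation = calls

// Each pass over the lazy sequence runs the closure once per element.
var total = 0
for value in doubled { total += value }
for value in doubled { total += value }
let callsAfterTwoPasses = calls
```

Here `callsAfterCreation` is 0 and `callsAfterTwoPasses` is 6 for three elements, which is the kind of repeated work an explicit `.lazy` is meant to flag at the call site.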

kylemacomber · 2021-04-06

Now that we have OrderedSet via the Swift Collections packages, I think it's even clearer to me that uniqued should be lazy... the ability to do partial computation is really the only thing (other than method call syntax) distinguishing this algorithm from just creating an OrderedSet

LucianoPAlmeida · 2021-04-06

> Now that we have OrderedSet via the Swift Collections packages, I think it's even clearer to me that uniqued should be lazy... the ability to do partial computation is really the only thing (other than method call syntax) distinguishing this algorithm from just creating an OrderedSet

+1

kylemacomber · 2021-03-24T23:45:33Z

@swift-ci please test

natecook1000 · 2021-04-07T19:51:04Z

Make uniqued() lazy by default

16b9373

kylemacomber approved these changes Jan 28, 2021

LucianoPAlmeida reviewed Jan 28, 2021

LucianoPAlmeida reviewed Jan 28, 2021

Tim Vermeulen added 2 commits January 30, 2021 00:12

Fix docs

44c9e28

Add repeated element tests

bc133a8

xwu reviewed Jan 31, 2021

natecook1000 added the `source breaking` label (this change affects existing source code) Feb 9, 2021

natecook1000 merged commit 718220d into apple:main Apr 7, 2021
