-
Notifications
You must be signed in to change notification settings - Fork 440
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make uniqued()
lazy by default
#71
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all looks good to me.
My only question is how much folks are attached to uniqued()
returning an Array. I think that we probably want to take uniqued()
in this lazy direction (as opposed to, say, adding lazy.uniqued()
), because it's more consistent with the rest of Algorithms and stdlib, but whenever I see this in the wild it's returning an Array.
} | ||
``` | ||
|
||
### Complexity | ||
|
||
The `uniqued` methods are O(_n_) in both time and space complexity. | ||
The eager `uniqued(on:)` method is O(_n_) in both time and space complexity. | ||
The lazy versions are O(_1_). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a very salient consideration mentioned by @kylemacomber that most uses seem to expect an Array
, in which case a .lazy.uniqued
design would make more sense.
Additionally, this discrepancy here, where uniqued(on:)
isn't lazy by default but uniqued
would be, seems like it invites confusion. I certainly would not expect that supplying a custom predicate would change the complexity or behavior so fundamentally, and I can see that issue arising when people make changes to their code and encounter this difference. Therefore, if it makes sense to make uniqued
lazy, then I think the same change should be applied to uniqued(on:)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do people expect uniqued()
to return an array, or do implementations usually simply return an array because it's easier to implement that way? I genuinely don't know the answer to this.
I share your concern about the discrepancy between uniqued()
and uniqued(on:)
, but I do think that viewing them in isolation, this change is consistent with other sequence operations in the standard library. Operations that can be lazy without having to compromise (except perhaps on the return type) typically are, even when called on a sequence not conforming to LazySequenceProtocol
. Collection
's joined()
and reversed()
fall into this category, and I argue that uniqued()
does as well. uniqued(on:)
does not, because making it lazy would require the closure to be escaping and non-throwing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do any other algorithms in this library or the standard library differ in laziness depending on the presence of a custom predicate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the pair of operations that comes closest is joined()
being lazy and flatMap { $0 }
being eager while effectively doing the same thing. Of course they don't have similar names like uniqued()
and uniqued(on:)
do.
This is a fairly unique situation because other potential pairs lack one of the variants. chunked(by: ==)
and compactMap { $0 }
have no corresponding closure-less version, and operations like zip
, reversed
, and combinations
have no versions that do take a closure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ya I think the joined()
/flatMap { $0 }
and compacted()
/compactMap { $0 }
(see #112) serve as good precedent of similar of lazy/eager pairs.
To try to articulate the philosophy:
- Anything that can efficiently be lazy should be lazy, because (i) it can avoid extra work (e.g. unnecessary computation or allocation) and (ii) it's easy to go from lazy to eager by constructing an Array, but it's impossible to go the other way.
- Require an explicit
.lazy
for algorithms that take a closure to emphasize that (i) the closure will not run immediately and (ii) may run more than once per element in the collection, which can introduce a surprising vector for error. - An algorithm shouldn't be lazy if its being lazy would pessimize its runtime complexity. For example, a lazy
reversed
adapter over a plainCollection
would be absurd because looping over all of its indices would be O(n²).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we have OrderedSet
via the Swift Collections packages, I think it's even clearer to me that uniqued
should be lazy... the ability to do partial computation is really the only thing (other than method call syntax) distinguishing this algorithm from just creating an OrderedSet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we have
OrderedSet
via the Swift Collections packages, I think it's even clearer to me thatuniqued
should be lazy... the ability to do partial computation is really the only thing (other than method call syntax) distinguishing this algorithm from just creating anOrderedSet
+1
@swift-ci please test |
Adds the
Uniqued
sequence that lazily produces the unique elements of a sequence.uniqued()
andlazy.uniqued(on:)
now return aUniqued
instance, the eageruniqued(on:)
still produces an array.Checklist