Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

\iter\isEmpty() can indirectly cause unexpected consumption from iterators in rare cases #97

Open
athrawes opened this issue Oct 11, 2023 · 4 comments

Comments

@athrawes
Copy link
Contributor

I've come across a rare case where the behavior of \iter\isEmpty() can indirectly cause unexpected consumption from iterators in normal use cases.

Specifically, in the case where isEmpty is provided an object of \IteratorAggregate, the object's getIterator() method is called. Any further uses of this object may call getIterator() a second time, so if this method is not a pure function then the first item(s?) we might expect from the iterator could be lost to the void.

I've put together what I believe is a minimal reproduction case here:
https://gist.github.com/athrawes/e451484fdb4d0646f9aa03c9e47b566a

Basically, if the item passed to isEmpty looks like this:

class MyClass implements \IteratorAggregate {
    public function getIterator(): \Traversable {
        // some logic which consumes some resource
        // or otherwise mutates global mutable state
        while ($item = array_pop($someGlobalState)) { yield $item; }
    }
}

then the following does not behave as expected:

$iterable = new MyClass();
\iter\isEmpty($iterable); // <- initially appears to behave as expected, items have not technically been consumed yet
foreach ($iterable as $item) { ... } // <- missing item(s?) from the front as we got a 'new' iterable with said items missing

Then interacting with that item in the future with any foreach constructs or any methods in this library will be missing some item(s?) from the front of the iterable.

I'm not sure if this is really a bug in this library per se, but it is rather unexpected - especially as the docblock on isEmpty claims to not consume items from iterators. Would this simply be a matter of warning users about this edge case, or is there some way to handle this here?

@nikic
Copy link
Owner

nikic commented Oct 17, 2023

I don't think it's possible to do anything against this, apart from rejecting IteratorAggregate in isEmpty entirely, which doesn't seem desirable.

I'd consider a side-effecting IteratorAgggregate implementation to be a bug -- where did you encounter this?

@athrawes
Copy link
Contributor Author

The side-effecting \IteratorAggregate implementations turn out to be defined in my own codebases. It appears that once upon a time I wrote some convenience classes which look almost exactly like the following to abstract over popping items off a global (external) queue:

class MyClass implements \IteratorAggregate {
    public function getIterator(): \Traversable {
        // some logic which mutates global mutable state
        while ($item = array_pop($someGlobalState)) { yield $item; }
    }
}

So, to be clear, the buggy \IteratorAggregate classes are issues of my own design. That said, the docs for \IteratorAggregate don't seem to indicate that side-effecting implementations of getIterator() should be considered a bug or avoided for any reason.

It just so happens that there is a bit of a footgun here that I've encountered; I've shot myself in the foot with a valid (if not necessarily best practice) implementation of \IteratorAggregate being used with isEmpty(), where the combination causes side effects which were not clearly signposted anywhere.

I think I agree with you here that there's not much code-wise we can do here while still handling \IteratorAggregate; I'm wondering if it'd be best to simply add a note to the function documentation to call out this behavior?

@nikic
Copy link
Owner

nikic commented Oct 23, 2023

The side-effecting \IteratorAggregate implementations turn out to be defined in my own codebases. It appears that once upon a time I wrote some convenience classes which look almost exactly like the following to abstract over popping items off a global (external) queue:

class MyClass implements \IteratorAggregate {
    public function getIterator(): \Traversable {
        // some logic which mutates global mutable state
        while ($item = array_pop($someGlobalState)) { yield $item; }
    }
}

So, to be clear, the buggy \IteratorAggregate classes are issues of my own design. That said, the docs for \IteratorAggregate don't seem to indicate that side-effecting implementations of getIterator() should be considered a bug or avoided for any reason.

Interesting case! The actual IteratorAggregate implementation here is side-effect free -- getIterator() will just return the Generator. The problem is that the call to valid() has a side-effect. For generators, the only way to determine whether there are any more elements is to actually complete the next one, and that has a side-effect in your case.

When working on an Iterator, this is fine, because even though valid() has a side-effect, it will not be executed a second time when calling current(). The computed value is stored. But for IteratorAggregate, we will execute the side effect twice.

I can't really blame you for this one -- I think your code looks perfectly reasonable, and there's really no other way to implement it short of a manual Iterator implementation, which could implement a side-effect-free valid() method.

I think I agree with you here that there's not much code-wise we can do here while still handling \IteratorAggregate; I'm wondering if it'd be best to simply add a note to the function documentation to call out this behavior?

So yeah, I think a documentation note is probably the best we can do here.

@minhtkh
Copy link

minhtkh commented Oct 25, 2023

I close my PR then if, it fixes the problem but it's not in really good way

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants