Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-31861: Add operator.aiter and operator.anext #8895

Closed
wants to merge 4 commits into from
Closed

Conversation

jab
Copy link
Contributor

@jab jab commented Aug 24, 2018

...along with passing tests. (I added the tests to test_asyncgen.py rather than to test_operator.py because they fit much better there.)

/cc @1st1 @njsmith @JelleZijlstra et al.

https://bugs.python.org/issue31861

@jab jab force-pushed the aiter branch 2 times, most recently from b467ec3 to 6aa0fa5 Compare August 24, 2018 05:09
@1st1 1st1 self-assigned this Aug 24, 2018
Lib/operator.py Outdated Show resolved Hide resolved
Lib/operator.py Outdated Show resolved Hide resolved
Lib/operator.py Outdated Show resolved Hide resolved
@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@jab
Copy link
Contributor Author

jab commented Sep 7, 2018

@asvetlov Thank you for the helpful review. I have addressed all your comments in 908227b. GitHub reports that all CI checks are passing. In particular, I see "test_asyncgen passed" (which contains the tests for these changes) in e.g. this job. The latest coverage report for these changes shows 97% coverage.

Locally I get:

$ ./python.exe Lib/test/test_asyncgen.py
...........................Task was destroyed but it is pending!
task: <Task pending name='Task-45' coro=<<async_generator_athrow without __name__>()>>
......................
----------------------------------------------------------------------
Ran 49 tests in 1.185s

OK

I'm not sure the destroyed pending task there is my fault or whether this is even a problem.

Any further reworking needed before this can be merged?

@jab
Copy link
Contributor Author

jab commented Sep 7, 2018

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@asvetlov: please review the changes made to this pull request.

Copy link
Member

@1st1 1st1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't merge this yet. I'm not convinced that aiter and anext shouldn't be builtins.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@jab
Copy link
Contributor Author

jab commented Sep 7, 2018

@1st1 Would it be useful if I submitted another change here that moved these two Python functions from operator into builtins, assuming that's possible? Or do you want more time to think about this? If so, can you relabel this back to awaiting review? (@bedevere-bot automatically changed the label here to "awaiting changes" after you requested changes.)

@jab
Copy link
Contributor Author

jab commented Sep 7, 2018

Regardless, I'd be curious if you can spot anything in the tests I added that could be responsible for the lingering pending task and if there's anything I should do about that. The question about whether the async def __aiter__ should have been def __aiter__ in the (existing) tests is also still outstanding, in case you can help with the answer.

@1st1
Copy link
Member

1st1 commented Sep 7, 2018

@jab I know that @wolkodav was (is?) working on implementing builtin versions of these functions (in C), but I'm not sure what's his latest progress. Perhaps you guys can collaborate and work together on this.

In my opinion it would be great to have aiter() and anext() as builtin functions and I'm 80% confident that that's what we are going to do in 3.8. The problem is that adding new builtins is a serious decision, and @gvanrossum is no longer available to grant us a permission to do that right now. So we'll have to wait until we have a new governance model in CPython to have a final decision on this.

@jab
Copy link
Contributor Author

jab commented Sep 7, 2018

@1st1 Thanks for the additional info!

Perhaps you guys can collaborate and work together on this.

(I'm more comfortable with Python than C but) would be happy to try to help @wolkodav in any way I can.

So we'll have to wait until we have a new governance model in CPython to have a final decision on this.

So even if there were a PR adding this to builtins that was ready to merge now, we would still have to wait for a new governance model before it could be merged.

Rather than blocking all progress on this, would it make sense to merge this PR so we can make incremental progress in the meantime, and then once the governance question has been settled, the decision to move these into builtins could be made independently? Then if at that point it's decided to keep these in operator, it would be done already, and in either case there would be more time for people to try out having these available in the standard library somewhere which could help with making the builtins decision. (Are there any other core developers you think would want to weigh in here who we should ping?) Thanks again.

@1st1
Copy link
Member

1st1 commented Sep 7, 2018

Rather than blocking all progress on this, would it make sense to merge this PR so we can make incremental progress in the meantime, and then once the governance question has been settled, the decision to move these into builtins could be made independently?

I don't see how merging either PR right now (with a non-zero chance of it being reverted later) can be beneficial. I don't think this feature would see a lot of work and refinement after we merge it. I would also like to avoid any confusing churn in APIs/docs if possible.

(Are there any other core developers you think would want to weigh in here who we should ping?)

I think that we'll simply email python-dev and ask when the time is right. Please be patient!

If you want to work on something else meanwhile I'm sure that there are some open issues in asyncio that could be easier to merge/fix.

@jab
Copy link
Contributor Author

jab commented Sep 7, 2018

Ok, just trying to maximize results while I had the bandwidth now. I don't subscribe to python-dev so please link to the thread here once it's started if possible.

I actually already have another PR outstanding that's been awaiting review for a few weeks, in case it's one you feel comfortable reviewing: #8792

@ikrivosheev
Copy link

@jab I know that @wolkodav was (is?) working on implementing builtin versions of these functions (in C), but I'm not sure what's his latest progress. Perhaps you guys can collaborate and work together on this.

In my opinion it would be great to have aiter() and anext() as builtin functions and I'm 80% confident that that's what we are going to do in 3.8. The problem is that adding new builtins is a serious decision, and @gvanrossum is no longer available to grant us a permission to do that right now. So we'll have to wait until we have a new governance model in CPython to have a final decision on this.

Yes, I work on C implementation in builtins for aiter and anext. I had problems with my laptop. I apologize for the delay.

@jab
Copy link
Contributor Author

jab commented Oct 6, 2018

Hey @wolkodav, sorry to hear you had laptop trouble and great to hear you're working on this! If/when you're ready to share your work, please link to it from here.

@ikrivosheev
Copy link

@jab, i have some problem at the moment with default parameter. I need some time to understand how work coroutine at C level... At the moment you can view work in my fork: https://github.com/WolkoDav/cpython

@jab
Copy link
Contributor Author

jab commented Nov 8, 2018

Thanks for sharing the link @wolkodav, and sorry I haven't had time or enough recent C experience to help. I defer to @1st1, @asvetlov, or any other maintainers who can jump in here.

@ikrivosheev
Copy link

@jab, Yuri has already watched my code and left comments there. I think that in a week I will be able to add the parameter "default".

@nanjekyejoannah
Copy link
Contributor

@ikrivosheev are you still working on this? I want to work on another PR for a C implementation for this. Can i pick up. Am fine if you still want to work on this.

@jab
Copy link
Contributor Author

jab commented Jun 2, 2019

I think that we'll simply email python-dev and ask when the time is right. Please be patient!

@1st1, would now be an ok time to ask about this?

@jab
Copy link
Contributor Author

jab commented Jun 3, 2020

Thanks for reviewing and approving this, @eric-wieser, great to see some activity here again!

Is the question of whether these should be in operator vs. builtins is still unresolved, though? If so, @1st1, do you still want (me?) to start a thread on python-dev about this when the time is right? Happy to do that now if that'd be helpful.

It'd be great for aiter and anext to be available somewhere in the standard library, whether in operator or builtins, before too much longer. I'd love to submit an alternative PR that provided the builtins version, in case that ends up being the preferred choice, but would like to request a mentor for that if possible, since I still don't know my way around the C parts of CPython very well yet.

Would be sweet to get a fix for BPO-31861 landed by the time this PR turns 2 years old at end of August, and just want to help in any way I can :)

@eric-wieser
Copy link
Contributor

eric-wieser commented Aug 19, 2020

I want to work on another PR for a C implementation for this.

A C implementation might make more sense, especially if it could live next to the existing callable_iterator type:

typedef struct {
PyObject_HEAD
PyObject *it_callable; /* Set to NULL when iterator is exhausted */
PyObject *it_sentinel; /* Set to NULL when iterator is exhausted */
} calliterobject;
PyObject *
PyCallIter_New(PyObject *callable, PyObject *sentinel)
{
calliterobject *it;
it = PyObject_GC_New(calliterobject, &PyCallIter_Type);
if (it == NULL)
return NULL;
Py_INCREF(callable);
it->it_callable = callable;
Py_INCREF(sentinel);
it->it_sentinel = sentinel;
_PyObject_GC_TRACK(it);
return (PyObject *)it;
}
static void
calliter_dealloc(calliterobject *it)
{
_PyObject_GC_UNTRACK(it);
Py_XDECREF(it->it_callable);
Py_XDECREF(it->it_sentinel);
PyObject_GC_Del(it);
}
static int
calliter_traverse(calliterobject *it, visitproc visit, void *arg)
{
Py_VISIT(it->it_callable);
Py_VISIT(it->it_sentinel);
return 0;
}
static PyObject *
calliter_iternext(calliterobject *it)
{
PyObject *result;
if (it->it_callable == NULL) {
return NULL;
}
result = _PyObject_CallNoArg(it->it_callable);
if (result != NULL) {
int ok;
ok = PyObject_RichCompareBool(it->it_sentinel, result, Py_EQ);
if (ok == 0) {
return result; /* Common case, fast path */
}
Py_DECREF(result);
if (ok > 0) {
Py_CLEAR(it->it_callable);
Py_CLEAR(it->it_sentinel);
}
}
else if (PyErr_ExceptionMatches(PyExc_StopIteration)) {
PyErr_Clear();
Py_CLEAR(it->it_callable);
Py_CLEAR(it->it_sentinel);
}
return NULL;
}
static PyObject *
calliter_reduce(calliterobject *it, PyObject *Py_UNUSED(ignored))
{
if (it->it_callable != NULL && it->it_sentinel != NULL)
return Py_BuildValue("N(OO)", _PyEval_GetBuiltinId(&PyId_iter),
it->it_callable, it->it_sentinel);
else
return Py_BuildValue("N(())", _PyEval_GetBuiltinId(&PyId_iter));
}
static PyMethodDef calliter_methods[] = {
{"__reduce__", (PyCFunction)calliter_reduce, METH_NOARGS, reduce_doc},
{NULL, NULL} /* sentinel */
};
PyTypeObject PyCallIter_Type = {
PyVarObject_HEAD_INIT(&PyType_Type, 0)
"callable_iterator", /* tp_name */
sizeof(calliterobject), /* tp_basicsize */
0, /* tp_itemsize */
/* methods */
(destructor)calliter_dealloc, /* tp_dealloc */
0, /* tp_vectorcall_offset */
0, /* tp_getattr */
0, /* tp_setattr */
0, /* tp_as_async */
0, /* tp_repr */
0, /* tp_as_number */
0, /* tp_as_sequence */
0, /* tp_as_mapping */
0, /* tp_hash */
0, /* tp_call */
0, /* tp_str */
PyObject_GenericGetAttr, /* tp_getattro */
0, /* tp_setattro */
0, /* tp_as_buffer */
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
0, /* tp_doc */
(traverseproc)calliter_traverse, /* tp_traverse */
0, /* tp_clear */
0, /* tp_richcompare */
0, /* tp_weaklistoffset */
PyObject_SelfIter, /* tp_iter */
(iternextfunc)calliter_iternext, /* tp_iternext */
calliter_methods, /* tp_methods */
};

cpython/Python/bltinmodule.c

Lines 1509 to 1535 in c3dd7e4

/* AC: cannot convert yet, as needs PEP 457 group support in inspect */
static PyObject *
builtin_iter(PyObject *self, PyObject *const *args, Py_ssize_t nargs)
{
PyObject *v;
if (!_PyArg_CheckPositional("iter", nargs, 1, 2))
return NULL;
v = args[0];
if (nargs == 1)
return PyObject_GetIter(v);
if (!PyCallable_Check(v)) {
PyErr_SetString(PyExc_TypeError,
"iter(v, w): v must be callable");
return NULL;
}
PyObject *sentinel = args[1];
return PyCallIter_New(v, sentinel);
}
PyDoc_STRVAR(iter_doc,
"iter(iterable) -> iterator\n\
iter(callable, sentinel) -> iterator\n\
\n\
Get an iterator from an object. In the first form, the argument must\n\
supply its own iterator, or be a sequence.\n\
In the second form, the callable is called until it returns the sentinel.");

From what I can tell, that style of implementation of aiter(callable, sentinel) would be pickleable, whereas the one in this PR is not.

Copy link
Contributor

@nanjekyejoannah nanjekyejoannah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree, this is good feedback for @ikrivosheev but anyone can pick up these suggestions in a different PR unless @ikrivosheev commits again.

@jab
Copy link
Contributor Author

jab commented Aug 19, 2020

Thanks for looking at this, @eric-wieser and @nanjekyejoannah, good to cross paths with you again here and hope you’ve been well!

I’d love to work on a C implementation in another PR (which could maybe even reuse the tests I wrote in this PR), but would need a little help getting started on the C side (I know C pretty well from a past life, but have only written 100 lines in the last 15 years) and don’t know the CPython code well yet. Would anyone with more experience be willing to mentor me for this?

Lib/operator.py Outdated
Comment on lines 428 to 430
if isinstance(obj, AsyncIterator):
return obj
return (i async for i in obj)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if isinstance(obj, AsyncIterator):
return obj
return (i async for i in obj)
ait = obj.__aiter__()
if not hasattr(ait, '__anext__'):
raise TypeError(f"aiter() returned non-AsyncIterator of type {type(ait)!r}")
return ait

I think it should be this way for a few reasons:

  1. It eliminates the two separate non-raising cases and return statements/types
  2. It eliminates the overhead of the extra async generator expression and is more easily implemented in C.
  3. For any AsyncIterator a, we "should" have a.__aiter__() is a, so the behavior shouldn't change much.
  4. For the cases where that isn't true, in analogy with the builtin iter(), one might expect aiter(x) to always call x.__aiter__():
from collections.abc import Iterator, AsyncIterator

class A:
    def __next__(self):
        return 1
    def __iter__(self):
        return iter([])

# "a" is already an iterator, but a.__iter__ is called anyway.
a = A()
assert isinstance(a, Iterator)
assert iter(a) is not a
assert type(iter(a)) is not type(a)
assert type(iter(a)) is type(iter([]))

############################

async def another_aiter():
    yield 1

class B:
    def __anext__(self):
        return 1
    def __aiter__(self):
        return another_aiter()

b = B()
assert isinstance(b, AsyncIterator)
# These three assertions fail, but pass with the change.
assert aiter(b) is not b
assert type(aiter(b)) is not type(b)
assert type(aiter(b)) is type(aiter(another_aiter()))

See the implementation of iter() here, which uses PyObject_GetIter() from here.
I wonder if a PyObject_GetAsyncIter needs to be added to the C-API so that aiter() and the GET_AITER opcode can use the same code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we call type(obj).__aiter__(obj) (and the analogue for __anext__)? That's how dunder methods are generally called by the interpreter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing! Updated this patch based on the good suggestions.

(Note, I used isinstance(ait, AsyncIterator) rather than the hasattr(ait, '__anext__') that was suggested, figuring that's the preferred spelling. Also, after rebasing on latest master, I had to move the from collections.abc ... imports from module scope to function scope as a workaround to avoid a circular dependency between operator and collections, which wasn't an issue when I first submitted this PR. Sounds like this PR will now serve as a reference implementation for a reimplementation in C (submitted in a separate PR) anyway, where I guess such a workaround won't be necessary?)

@jab jab force-pushed the aiter branch 2 times, most recently from 69b302e to 8b2b120 Compare November 30, 2020 17:02
...on top of latest master.

Also drop now-removed `loop` kwarg from asyncio.sleep call.
Ref: https://bugs.python.org/issue42392
@jab
Copy link
Contributor Author

jab commented Dec 18, 2020

Please see #23847 for a C implementation of aiter and anext added to builtins.

@gvanrossum
Copy link
Member

Closing in favor of #23847 (adding them to builtins).

@gvanrossum gvanrossum closed this Mar 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.