-
-
Notifications
You must be signed in to change notification settings - Fork 18.6k
BUG: Ensure df.itertuples() uses plain tuples correctly #30600
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/tests/frame/test_api.py
Outdated
|
||
# Dataframes with >=255 columns will fallback to regular tuples | ||
with pytest.raises(AttributeError): | ||
result_255_columns.foo_1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because namedtuple
generates a new class when it is used, the only way of determining if something is a namedtuple or a plain one is to try and use it as such.
Hopefully you think this is sufficiently clear with the comment.
pandas/core/frame.py
Outdated
@@ -1018,8 +1018,8 @@ def itertuples(self, index=True, name="Pandas"): | |||
# use integer indexing because of possible duplicate column names | |||
arrays.extend(self.iloc[:, k] for k in range(len(self.columns))) | |||
|
|||
# Python 3 supports at most 255 arguments to constructor | |||
if name is not None and len(self.columns) + index < 256: | |||
# Python versions before 3.7 support at most 255 arguments to constructor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible to skip the arg len check for >= 3.7? does it work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this was simple to do.
pandas/tests/frame/test_api.py
Outdated
@@ -288,6 +288,22 @@ def test_sequence_like_with_categorical(self): | |||
for c, col in df.items(): | |||
str(s) | |||
|
|||
def test_itertuples_fallback_to_regular_tuples(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you move this next to the other itertuples tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Currently DataFrame.itertuples() has an off by one error when it inspects whether or not it should return namedtuples or plain tuples in it's response. This PR addresses that bug by correcting the condition that is used when making the check. Closes: pandas-dev#28282
730b22f
to
187196a
Compare
f6ddc89
to
6de8beb
Compare
1. Ensure we return named tuples in more cases (when using python >= 3.7) 2. Move test around to be with the itertuples test 3. Update docstring with the new behaviour.
6de8beb
to
64b381d
Compare
assert isinstance(tup3, tuple) | ||
if PY37: | ||
assert hasattr(tup3, "_fields") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test has changed as we will return named tuples always for python >= 3.7 now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
thanks @simongibbons very nice! |
Currently DataFrame.itertuples() has an off by one error
when it inspects whether or not it should return namedtuples
or plain tuples in it's response.
This PR addresses that bug by correcting the condition
that is used when making the check.
Closes: #28282
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff