pd.Series.loc.__getitem__ promotes to float64 instead of raising KeyError #25927
Comments
Same in: 0.23.4, 0.24.2 |
Use .iloc, as that is what is designed for selecting by position, as the docs indicate: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#selection-by-position. __getitem__ is falling back here, as described in http://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html#miscellaneous-indexing-faq. This is expected behavior and is not likely to change. |
My series has integer labels, not billions of rows. It fails for both .loc and __getitem__. This is a bug. Please reopen. |
I have updated the title and sample to make it clear. .loc is the correct indexer (integer labels and an integer index). Please reopen. |
this is on master
prob a bug somewhere, would need investigation by the community |
This doesn't really look like a bug. Once a |
No, this is an issue, I think; see how the value is in the index, but there is a disconnect somewhere after the get_indexer call (way before the erroneous KeyError). |
Was there an erroneous KeyError? The key |
There was no KeyError.
For other examples of missing input a KeyError is thrown.
…On Tue, Apr 2, 2019 at 6:41 PM Chris Bertinato ***@***.***> wrote:
Was there an erroneous KeyError? The key 10047311000001102 isn't in a2b,
so the warning given for a2b.loc[key] seems appropriate.
|
I see the But I'm still trying to sleuth out whether there's another issue here. @jreback do you mean that because one of the labels in |
@cbertinato My apologies - I missed that a warning was being printed, and I had to do some digging to find the case where KeyError was thrown:
So it seems the behavior is: a KeyError is thrown only if ALL of the query labels are missing from the series's index. Otherwise, the result is promoted to float. Is this expected? Not sure. But it sure is strange behavior:
Automatic value type promotion in select-like operations is also inconsistent with a Series being a 'datacolumn' that is 'kinda' like an SQL column + index. SQL doesn't do it (precisely because it's not safe and leads to data corruption). Is there any way to force a KeyError on missing labels for .loc? (Warnings easily slip through CI testing.) |
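One way to get the strict behavior asked for here is to check the labels against the index before selecting, and raise KeyError if any are missing. This is only a sketch under assumptions: the helper name strict_loc and the sample values are hypothetical and are not part of pandas or of the original report.

```python
import numpy as np
import pandas as pd


def strict_loc(series, labels):
    """Select labels from series, raising KeyError if ANY label is missing.

    Hypothetical helper for illustration; not a pandas API.
    """
    labels = pd.Index(labels)
    # Boolean mask of query labels that do not appear in the series index.
    missing = labels[~labels.isin(series.index)]
    if len(missing) > 0:
        raise KeyError("labels not in index: {}".format(list(missing)))
    # Every label is present, so no NaN is introduced and the dtype is kept.
    return series.loc[labels]


# Hypothetical stand-in for the a2b series from the report (int64 -> int64).
a2b = pd.Series([10, 20, 30], index=[1, 2, 3], dtype="int64")

found = strict_loc(a2b, np.array([1, 3], dtype="int64"))
print(found.dtype)

try:
    strict_loc(a2b, [1, 99])
except KeyError as err:
    print("KeyError:", err)
```

Because the check runs before the selection, a partially missing query fails loudly instead of producing a warning that CI can miss.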
For the cases where the Not sure when exactly this will turn from a warning into throwing a KeyError. |
Related? (Seems similar but different: #22252) |
It is similar. In both cases, the promotion to float is expected, but the message in 1 of that issue does seem misplaced. |
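A minimal sketch of why a missing label leads to float64: an int64 array cannot hold NaN, so the result is upcast. It uses reindex rather than .loc so that it does not depend on the version-specific .loc behavior discussed in this thread. The values are hypothetical; 2**53 + 1 is chosen because it is not exactly representable as a float64.

```python
import pandas as pd

big = 2**53 + 1  # hypothetical value; not exactly representable as float64
a2b = pd.Series([big, 20], index=[1, 2], dtype="int64")

hit = a2b.reindex([1, 2])    # every label present: dtype stays int64
miss = a2b.reindex([1, 99])  # label 99 is absent: NaN forces float64

print(hit.dtype, miss.dtype)
# The large integer was rounded when the values were upcast to float64.
print(int(miss.loc[1]) == big)
```

For integer identifiers above 2**53, the upcast can silently change the value, which is the data corruption concern raised earlier in the thread.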
A |
can u see if we have a test for this; ok to add this one in a similar place |
Code Sample, a copy-pastable example if possible
For:
If a2b is a series that maps {int64:int64} and vals is an int64 array, the result should be a series that maps {int64:int64}, or a KeyError should be thrown.
Pasteable repro:
What happens:
Problem description
I don't like this behavior because:
Expected Output
Asserts should not fail.
Output of pd.show_versions()
In [4]: pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.15.candidate.1
python-bits: 64
OS: Linux
OS-release: 4.15.0-46-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.22.0
pytest: None
pip: 18.1
setuptools: 40.6.2
Cython: 0.29.1
numpy: 1.16.1
scipy: 1.2.0
pyarrow: None
xarray: None
IPython: 5.0.0
sphinx: None
patsy: 0.5.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: 2.6.8
feather: None
matplotlib: 2.1.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 0.9999999
sqlalchemy: 1.2.17
pymysql: None
psycopg2: 2.7.7 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None