Factorize fails with read-only array #12813

Closed
rabernat opened this issue Apr 6, 2016 · 2 comments · Fixed by #21811
Labels
Bug Compat pandas objects compatability with Numpy or Python functions

rabernat commented Apr 6, 2016

same as in #15286

pd.factorize raises a ValueError from Cython when called on a read-only array. This seems related to #10043 and #10070. I discovered it through xarray (pydata/xarray#818).

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd

a = np.arange(2)
a.flags.writeable = False  # mark the array as read-only
pd.factorize(a)

Raises the following

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-54-dddd925d767a> in <module>()
      1 a = np.arange(10)
      2 a.flags.writeable = False
----> 3 pd.factorize(a)

/Users/rpa/anaconda/lib/python2.7/site-packages/pandas/core/algorithms.pyc in factorize(values, sort, order, na_sentinel, size_hint)
    194     table = hash_klass(size_hint or len(vals))
    195     uniques = vec_klass()
--> 196     labels = table.get_labels(vals, uniques, 0, na_sentinel, True)
    197 
    198     labels = com._ensure_platform_int(labels)

pandas/hashtable.pyx in pandas.hashtable.Int64HashTable.get_labels (pandas/hashtable.c:7893)()

/Users/rpa/anaconda/lib/python2.7/site-packages/pandas/hashtable.so in View.MemoryView.memoryview_cwrapper (pandas/hashtable.c:29882)()

/Users/rpa/anaconda/lib/python2.7/site-packages/pandas/hashtable.so in View.MemoryView.memoryview.__cinit__ (pandas/hashtable.c:26251)()

ValueError: buffer source array is read-only

Expected Output

Should be the same as with a non-read-only array

>>> pd.factorize(np.arange(2))
(array([0, 1]), array([0, 1]))
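
Until the bug is fixed, one possible workaround is to copy the array before factorizing whenever it is read-only. The sketch below is an assumption on my part, not something proposed in this thread: the helper name `factorize_copy_if_readonly` is hypothetical, and it relies only on `ndarray.copy()` returning a writeable array.

```python
import numpy as np
import pandas as pd


def factorize_copy_if_readonly(values):
    """Hypothetical helper: factorize, copying first if the input is read-only."""
    arr = np.asarray(values)
    if not arr.flags.writeable:
        # ndarray.copy() returns a new, writeable array; the input is untouched.
        arr = arr.copy()
    return pd.factorize(arr)


a = np.arange(2)
a.flags.writeable = False  # same read-only setup as in the report

codes, uniques = factorize_copy_if_readonly(a)
print(codes, uniques)
```

The cost is an extra copy of the data for read-only inputs, so this is only a stopgap for affected pandas versions.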

output of pd.show_versions()

commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.0
nose: 1.3.7
pip: 8.1.0
setuptools: 20.2.2
Cython: 0.22.1
numpy: 1.10.4
scipy: 0.16.0
statsmodels: 0.6.1
xarray: 0.7.2-4-g33efdcd
IPython: 4.0.0
sphinx: 1.2.3
patsy: 0.3.0
dateutil: 2.5.0
pytz: 2016.1
blosc: None
bottleneck: 1.0.0
tables: 3.2.1.1
numexpr: 2.5.1
matplotlib: 1.4.3
openpyxl: 1.8.5
xlrd: 0.9.3
xlwt: 1.0.0
xlsxwriter: 0.7.3
lxml: 3.4.4
bs4: 4.3.2
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.5
pymysql: None
psycopg2: 2.6 (dt dec pq3 ext)
jinja2: 2.8
boto: 2.38.0


jreback commented Apr 6, 2016

Yeah, this is a Cython bug. You can try the same solution as in #10070.

@jreback jreback added Bug Compat pandas objects compatability with Numpy or Python functions Difficulty Intermediate labels Apr 6, 2016
@jreback jreback added this to the 0.18.1 milestone Apr 6, 2016
@jreback jreback modified the milestones: 0.18.1, 0.18.2 Apr 26, 2016
@jorisvandenbossche jorisvandenbossche modified the milestones: 0.20.0, 0.19.0 Aug 21, 2016
@jreback jreback modified the milestones: 0.20.0, Next Major Release Mar 23, 2017
@jreback jreback modified the milestones: Next Minor Release, Next Major Release Apr 1, 2017
@jreback jreback modified the milestones: Interesting Issues, Next Major Release Nov 26, 2017
@jreback jreback modified the milestones: Contributions Welcome, 0.24.0 Jul 7, 2018

jreback commented Jul 7, 2018

this can be easily fixed after #21688
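
A check along the following lines could confirm whether a given pandas install still has the problem. It is only a sketch: it compares `pd.factorize` on a read-only array against the same call on a writeable copy, and the variable names are illustrative rather than taken from any pandas test.

```python
import numpy as np
import pandas as pd

writeable = np.arange(10)
read_only = writeable.copy()
read_only.flags.writeable = False  # reproduce the condition from the report

expected_codes, expected_uniques = pd.factorize(writeable)

try:
    codes, uniques = pd.factorize(read_only)
except ValueError as err:
    # Affected versions fail with "buffer source array is read-only".
    fixed = False
    print("still failing:", err)
else:
    # On a fixed version the result should match the writeable case.
    np.testing.assert_array_equal(codes, expected_codes)
    np.testing.assert_array_equal(uniques, expected_uniques)
    fixed = True
    print("read-only input handled")
```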

xhochy added a commit to xhochy/pandas that referenced this issue Jul 8, 2018
@xhochy xhochy mentioned this issue Jul 8, 2018
jreback pushed a commit that referenced this issue Jul 8, 2018
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this issue Oct 1, 2018
dcherian added a commit to dcherian/xarray that referenced this issue Mar 10, 2023
dcherian added a commit to dcherian/xarray that referenced this issue Mar 18, 2023
dcherian added a commit to dcherian/xarray that referenced this issue Mar 30, 2023
dcherian added a commit to pydata/xarray that referenced this issue May 4, 2023
* Introduce Grouper objects.

* Remove a copy after stacking for a groupby.

Upstream bug pandas-dev/pandas#12813 is fixed

* Fix typing

* [WIP] typing

* Cleanup

* [WIP]

* group as Variable?

* Revert "group as Variable?"

This reverts commit 2a36e21a031b9e061b932682758551956f3f06d2.

* Small cleanup

* De-duplicate alignment check

* Fix resampling

* Bugfix

* Partial reverts commit 22ad7fa.

* fix tests

* small cleanup

* more cleanup

* Apply suggestions from code review

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add ResolvedGrouper class

* GroupBy only handles ResolvedGrouper objects.

Much cleaner!

* review feedback

* minimize diff

* dataclass

* moar dataclass

Co-authored-by: Illviljan <14371165+Illviljan@users.noreply.github.com>

* Add typing

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Ignore type checking error.

* Update groupby.py

* Move factorize to _factorize

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update groupby.py

* Update xarray/core/groupby.py

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Calculate group_indices only when necessary

* Revert "Calculate group_indices only when necessary"

This reverts commit 917c77efb05bacffcf901e61eabb9defc9a429d7.

* Fix regression from deep copy

---------

Co-authored-by: Illviljan <14371165+Illviljan@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>