Index with Categorical dtype #9668

Closed
mrocklin opened this issue Mar 17, 2015 · 1 comment
Labels
Categorical Categorical Data Type Enhancement Indexing Related to indexing on series/frames, not to indexes themselves

Comments

@mrocklin
Contributor

Currently, categorical data is converted to a plain object-dtype Index when it is used as an index.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'a': pd.Categorical(['a', 'b', 'a'])},
   ...:                   index=pd.Categorical(['a', 'b', 'a']))

In [3]: df.dtypes
Out[3]: 
a    category
dtype: object

In [4]: df.index.dtype 
Out[4]: dtype('O')

In [5]: df.index 
Out[5]: Index([u'a', u'b', u'a'], dtype='object')

Is there a way around this? I'm trying to keep my DataFrames free of object dtypes to reduce serialization costs. I would like to index off of a text column using categoricals.

This is with pandas 0.15.2.
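
For checking whether a frame is actually free of object dtypes, index included, something like the following might help. This is only a sketch: `object_dtype_parts` is a hypothetical helper name, not a pandas function, and the index is built with an explicit `dtype=object` so the example behaves the same regardless of how a given pandas version infers string data.

```python
import pandas as pd


def object_dtype_parts(df):
    """Return labels of the parts of `df` that are stored with object dtype.

    Hypothetical helper for illustration; "<index>" stands for the row index.
    """
    parts = []
    if pd.api.types.is_object_dtype(df.index.dtype):
        parts.append("<index>")
    for col, dtype in df.dtypes.items():
        if pd.api.types.is_object_dtype(dtype):
            parts.append(str(col))
    return parts


# Categorical column, but an index explicitly stored as object dtype.
df = pd.DataFrame(
    {"a": pd.Categorical(["a", "b", "a"])},
    index=pd.Index(["a", "b", "a"], dtype=object),
)
print(object_dtype_parts(df))
```

Here only the index is reported, since column `a` keeps its category dtype.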

@jreback
Contributor

jreback commented Mar 17, 2015

requested in #7629 already

best way is simply to leave it as a column and not set it as the index; it will then be serialized as a Categorical type
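
A minimal sketch of that workaround, assuming a recent pandas and using made-up column names (`key`, `value`): the text data stays in a categorical column, the default integer index is left alone, and rows are selected with a boolean mask on the column instead of an index lookup.

```python
import pandas as pd

# Keep the text data as a categorical column; do not call set_index on it.
df = pd.DataFrame({
    "key": pd.Categorical(["a", "b", "a"]),  # illustrative column name
    "value": [1, 2, 3],                      # illustrative column name
})

# The column keeps its category dtype and the default index is not object dtype.
key_dtype = str(df["key"].dtype)
index_is_object = pd.api.types.is_object_dtype(df.index.dtype)

# Select by the categorical column with a boolean mask.
rows_a = df[df["key"] == "a"]

print(key_dtype, index_is_object)
print(rows_a)
```

The trade-off is that lookups go through a mask over the column rather than through the index. As far as I know, later pandas releases (0.16.1 onward) added `pd.CategoricalIndex`, which addresses the original request directly.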

@jreback jreback closed this as completed Mar 17, 2015
@jreback jreback added Enhancement Indexing Related to indexing on series/frames, not to indexes themselves Categorical Categorical Data Type labels Mar 17, 2015