
Index with Categorical dtype #9668

Closed
@mrocklin


Currently, categorical data is converted to its full object-dtype form when it is used as an index.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'a': pd.Categorical(['a', 'b', 'a'])}, 
...                       index=pd.Categorical(['a', 'b', 'a'])) 
In [3]: df.dtypes 
Out[3]: 
a    category
dtype: object

In [4]: df.index.dtype 
Out[4]: dtype('O')

In [5]: df.index 
Out[5]: Index([u'a', u'b', u'a'], dtype='object')

Is there a way around this? I'm trying to keep my DataFrames free of object dtypes to reduce serialization costs. I would like to index off of a text column using categoricals.

This is with pandas 0.15.2.
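
For illustration, here is a minimal sketch of the behavior being asked for. It assumes a later pandas release that provides `pd.CategoricalIndex` (0.16.1 or newer), which is not available in the 0.15.2 used above, so treat it as a sketch of the desired result rather than a workaround for that version.

```python
import pandas as pd

# Assumption: pandas >= 0.16.1, where pd.CategoricalIndex exists.
# Build the same frame as above, but with an explicitly categorical index.
df = pd.DataFrame(
    {'a': pd.Categorical(['a', 'b', 'a'])},
    index=pd.CategoricalIndex(['a', 'b', 'a']),
)

index_dtype = df.index.dtype      # category dtype rather than object
codes = df.index.codes            # integer codes backing the index
categories = df.index.categories  # unique labels, stored once
```

With an index like this, the labels are stored once in `categories` and each row holds only an integer code, which is the property that matters for keeping object dtypes out of the frame.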


Labels: Categorical (Categorical Data Type), Enhancement, Indexing (Related to indexing on series/frames, not to indexes themselves)
