CategoricalData is converted to Objects when merging dataframes #15182

Closed
@watercrossing

Description

Code Sample

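import numpy as np
import pandas as pd

# Left frame: "cat" is a categorical column that is not used as a merge key.
testData = pd.DataFrame(data={"id": range(10), "cat": pd.Categorical(["a"] * 5 + ["b"] * 5)})
print(testData.cat.dtype)
# category
# Right frame: shares only the "id" key with the left frame.
testData2 = pd.DataFrame(data={"id": range(10), "other": np.random.randint(1, 24, 10)})
merged = pd.merge(testData, testData2, how="left", on="id")
print(merged.cat.dtype)
# object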

Problem description

When merging two DataFrames, a categorical column that is not used as a merge key (and is otherwise untouched) comes back with object dtype instead of category.

Expected Output

print(merged.cat.dtype)
# category
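
Until the dtype is preserved by the merge itself, one possible workaround is to rebuild the categorical column after merging. The following is a minimal sketch, not part of the original report: `restore_categories` is a hypothetical helper name, and it assumes the listed columns are categorical in the source frame and that the merge introduced no values outside the original categories (any such values would become NaN).

```python
import numpy as np
import pandas as pd


def restore_categories(merged, source, columns):
    """Return a copy of `merged` with `columns` re-cast to the categories of `source`.

    Hypothetical helper for illustration only; not a pandas API.
    """
    out = merged.copy()
    for col in columns:
        out[col] = pd.Categorical(
            out[col],
            categories=source[col].cat.categories,  # reuse the original category set
            ordered=source[col].cat.ordered,         # and the original ordering flag
        )
    return out


left = pd.DataFrame({"id": range(10), "cat": pd.Categorical(["a"] * 5 + ["b"] * 5)})
right = pd.DataFrame({"id": range(10), "other": np.random.randint(1, 24, 10)})

merged = pd.merge(left, right, how="left", on="id")
restored = restore_categories(merged, left, ["cat"])
print(restored["cat"].dtype)
```

Passing the categories explicitly, rather than calling `astype("category")`, keeps the category set and ordering identical to the source column even when some categories do not appear in the merged result.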

Output of pd.show_versions()

INSTALLED VERSIONS

commit: 742d4a5
python: 2.7.10.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-431.11.2.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_GB.utf8
LANG: en_GB.utf8
LOCALE: None.None

pandas: 0.19.0+346.g742d4a5
nose: 1.3.7
pip: 9.0.1
setuptools: 33.1.1
Cython: 0.25.2
numpy: 1.12.0
scipy: None
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
pandas_datareader: None

Metadata

Labels

Categorical (Categorical Data Type)
Duplicate Report (Duplicate issue or pull request)
Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
