[BUG] Categorify combo doesn't work on list columns #1676

bschifferer · 2022-09-09T07:24:18Z

Describe the bug
As a user, I want to jointly Categorify two columns, one a list column and one a regular (scalar) column. Use case: I have item interactions, where the scalar column holds the current item to predict and the list column holds the historical items.
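
To make the expected behavior concrete, here is a minimal pure-Python sketch of a joint (shared-vocabulary) encoding over a scalar column and a list column. It is not NVTabular's implementation: the helper names `fit_joint_vocab`, `encode_scalar`, and `encode_list` are hypothetical, and starting ids at 1 with 0 reserved for unknown values is an assumption that may not match the ids Categorify actually assigns.

```python
# Pure-Python illustration of a joint categorical encoding.
# Assumption: ids start at 1 and 0 is reserved for unknown values;
# NVTabular's real id assignment may differ.

col1 = [0, 1, 2, 3, 4, 5]
col2 = [[0, 1], [1, 2], [2, 3], [3, 4], [4], [5]]


def fit_joint_vocab(scalar_col, list_col, start_index=1):
    """Build a single value -> id mapping from both columns."""
    values = set(scalar_col)
    for row in list_col:
        values.update(row)
    return {value: idx for idx, value in enumerate(sorted(values), start=start_index)}


def encode_scalar(col, vocab, unknown=0):
    """Encode a scalar column with the shared vocabulary."""
    return [vocab.get(value, unknown) for value in col]


def encode_list(col, vocab, unknown=0):
    """Encode every element of every list with the same vocabulary."""
    return [[vocab.get(value, unknown) for value in row] for row in col]


vocab = fit_joint_vocab(col1, col2)
encoded_col1 = encode_scalar(col1, vocab)
encoded_col2 = encode_list(col2, vocab)
print(encoded_col1)
print(encoded_col2)
```

The point of the joint encoding is that the same item receives the same id whether it appears as the current item in `col1` or inside the history list in `col2`.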

Code to reproduce:

import cudf
import nvtabular as nvt

df = cudf.DataFrame({
    'col1': [0,1,2,3,4,5],
    'col2': [[0,1],[1,2],[2,3],[3,4],[4],[5]]
})
dataset = nvt.Dataset(df)
# The nested list requests a joint encoding of col1 and the list column col2
cols = [['col1', 'col2']] >> nvt.ops.Categorify()
workflow = nvt.Workflow(cols)
workflow.fit(dataset)
workflow.transform(dataset).to_ddf().compute()

Error:

File /usr/local/lib/python3.8/dist-packages/pandas/core/dtypes/common.py:1619, in _is_dtype_type(arr_or_dtype, condition)
   1615         return condition(type(None))
   1617     return False
-> 1619 return condition(tipo)

File /usr/local/lib/python3.8/dist-packages/pandas/core/dtypes/common.py:146, in classes.<locals>.<lambda>(tipo)
    144 def classes(*klasses) -> Callable:
    145     """evaluate if the tipo is a subclass of the klasses"""
--> 146     return lambda tipo: issubclass(tipo, klasses)

TypeError: issubclass() arg 1 must be a class
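
The final frame of the traceback can be reproduced in isolation: `issubclass` raises this `TypeError` whenever its first argument is not a class. The sketch below assumes that an object which is not a class (for example, a dtype instance for the list column) reached the pandas check; the traceback does not show which object it was, and `FakeListDtype` is a hypothetical stand-in, not a cudf or pandas class.

```python
class FakeListDtype:
    """Hypothetical stand-in for a dtype object; not a real cudf/pandas class."""


def is_subclass_of_int(tipo):
    # Same shape as the lambda in the last traceback frame:
    # it only works when `tipo` is a class.
    return issubclass(tipo, (int,))


# Passing a class works as intended.
result_for_class = is_subclass_of_int(bool)
print(result_for_class)

# Passing an instance reproduces the error from the traceback.
try:
    is_subclass_of_int(FakeListDtype())
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```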

What works:

Categorify applied to each column separately (no joint encoding):

import cudf
import nvtabular as nvt

df = cudf.DataFrame({
    'col1': [0,1,2,3,4,5],
    'col2': [[0,1],[1,2],[2,3],[3,4],[4],[5]]
})
dataset = nvt.Dataset(df)
cols = ['col1', 'col2'] >> nvt.ops.Categorify()
workflow = nvt.Workflow(cols)
workflow.fit(dataset)
workflow.transform(dataset).to_ddf().compute()

Joint Categorify with non-list columns:

import cudf
import nvtabular as nvt

df = cudf.DataFrame({
    'col1': [0,1,2,3,4,5],
    'col2':  [1,2,3,4,4,5],
})
dataset = nvt.Dataset(df)
cols = [['col1', 'col2']] >> nvt.ops.Categorify()
workflow = nvt.Workflow(cols)
workflow.fit(dataset)
workflow.transform(dataset).to_ddf().compute()

rnyak · 2022-09-12T16:22:51Z

@rjzamora Hello, is this something you can take a look at? Thanks.

rjzamora · 2022-09-30T19:23:21Z

Sorry for the delay - I can look into this.

bschifferer added bug, P0, P1 and removed P0 labels Sep 9, 2022

rnyak assigned rjzamora Sep 12, 2022

rjzamora mentioned this issue Sep 30, 2022

Fix joint Categorify with list columns #1685

Merged

karlhigley closed this as completed in #1685 Oct 1, 2022
