Handle pandas categorical types for categorical columns in _causal_analysis.py #602

Merged: 6 commits, Jun 13, 2022

@gaugup (Contributor) commented Apr 5, 2022

If a treatment column is explicitly given a pandas categorical dtype, the `CausalAnalysis` class fails in `individualized_policy`:

```
~\AppData\Local\Continuum\miniconda3\envs\nhs-hips\lib\site-packages\econml\solutions\causal_analysis\_causal_analysis.py in individualized_policy(self, Xtest, feature_index, n_rows, treatment_costs, alpha)
   1714                 all_costs = np.array([0] + [treatment_costs] * (len(treatment_arr) - 1))
   1715                 # construct index of current treatment
-> 1716                 current_ind = (current_treatment.reshape(-1, 1) ==
   1717                                treatment_arr.reshape(1, -1)) @ np.arange(len(treatment_arr))
   1718                 current_cost = all_costs[current_ind]

~\AppData\Local\Continuum\miniconda3\envs\nhs-hips\lib\site-packages\pandas\core\ops\common.py in new_method(self, other)
     67         other = item_from_zerodim(other)
     68 
---> 69         return method(self, other)
     70 
     71     return new_method

~\AppData\Local\Continuum\miniconda3\envs\nhs-hips\lib\site-packages\pandas\core\arrays\categorical.py in func(self, other)
    131         if is_list_like(other) and len(other) != len(self) and not hashable:
    132             # in hashable case we may have a tuple that is itself a category
--> 133             raise ValueError("Lengths must match.")
    134 
    135         if not self.ordered:
```
The solution is to check whether the categorical column is of type `pd.core.arrays.categorical.Categorical` and, if so, extract the NumPy array using its `to_numpy()` method.
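
For illustration, here is a minimal sketch of that check, not the actual patch. The helper name `as_numpy` and the sample treatment values are assumptions made up for this example; only the broadcast comparison is taken from the traceback above. `pd.Categorical` is the public alias for `pd.core.arrays.categorical.Categorical`.

```python
import numpy as np
import pandas as pd


def as_numpy(values):
    """Hypothetical helper: unwrap a pandas Categorical into a plain ndarray."""
    if isinstance(values, pd.Categorical):
        return values.to_numpy()
    return np.asarray(values)


# Hypothetical treatment column with the categorical dtype set explicitly.
df = pd.DataFrame({"treatment": pd.Categorical([10, 20, 30, 20])})

# `.values` on a categorical Series yields a Categorical, so unwrap it first.
current_treatment = as_numpy(df["treatment"].values)
treatment_arr = np.array([10, 20, 30])

# Same broadcast comparison as in the traceback, now on plain ndarrays:
# each row of the boolean matrix is one-hot at the matching treatment.
current_ind = (current_treatment.reshape(-1, 1) ==
               treatment_arr.reshape(1, -1)) @ np.arange(len(treatment_arr))
# current_ind -> array([0, 1, 2, 1])
```

Passing non-categorical input through `np.asarray` should leave existing callers that already supply NumPy arrays unaffected.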

@gaugup changed the title from "Handle pandas categorical types for categorical columns in _causal_analsis.py" to "Handle pandas categorical types for categorical columns in _causal_analysis.py" on Apr 5, 2022

@kbattocchi (Collaborator) commented

Please add a test that only passes with the new code.

@kbattocchi (Collaborator) left a comment

Please add a test that ensures that we don't regress this.

@kbattocchi (Collaborator) left a comment

I've added a test that verifies that the change fixes the behavior.
