Error running simpleimputer_intro.py in the example #118

Closed
CLWcynthia opened this issue Mar 14, 2020 · 11 comments

@CLWcynthia

When I ran simpleimputer_intro.py from the examples, the following error occurred:

Traceback (most recent call last):
  File "/Users/chen/PycharmProjects/test2/datamissing/examples/simpleimputer_intro.py", line 41, in <module>
    predictions = imputer.predict(df_test)
  File "/usr/local/lib/python3.7/site-packages/datawig/simple_imputer.py", line 420, in predict
    score_suffix, inplace=inplace)
  File "/usr/local/lib/python3.7/site-packages/datawig/imputer.py", line 822, in predict
    if data_frame.columns.contains(imputation_col):
AttributeError: 'Index' object has no attribute 'contains'

It could be a data processing error in the predict function.

@felixbiessmann
Contributor

This was due to a backwards-incompatible API change in pandas 1.0 and was addressed in this PR. Sorry that we have not released a new version yet.
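
For context, here is a minimal sketch of the API change, assuming only that pandas is installed. `Index.contains` was deprecated in pandas 0.25 and removed in 1.0, while the `in` operator works on both old and new versions. The example frame and the helper name `column_exists` are made up for illustration and are not part of datawig.

```python
import pandas as pd


def column_exists(data_frame, name):
    """Exact-name column check that does not depend on Index.contains."""
    return name in data_frame.columns


df = pd.DataFrame({"a": [1], "b": [2]})

# Expected to be False on pandas >= 1.0, where the method was removed.
has_legacy_method = hasattr(df.columns, "contains")
print("pandas", pd.__version__, "has Index.contains:", has_legacy_method)

print(column_exists(df, "a"))  # True
print(column_exists(df, "c"))  # False
```

If the first line prints `False` for `Index.contains`, the installed pandas is new enough to trigger the AttributeError in any code that still calls `columns.contains(...)`.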

A quick fix could be to pip install pandas==0.25.0 manually before installing datawig.

In a Jupyter notebook this could be done like this:

!pip install pandas==0.25.0
!pip install datawig

Does this solve the problem?

@CLWcynthia
Author

It works for the moment, but it raises warnings about future changes. I found that modifying the following two lines in imputer.py is also feasible:

if data_frame.columns.str.contains(imputation_col).any():  # line 822
if data_frame.columns.str.contains(imputation_proba_col).any():  # line 829
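
One caveat with the `.str.contains(...).any()` variant: it does substring (by default, regex) matching on the column names, so it is not quite the same as the exact-name check the removed `Index.contains` performed. A minimal sketch of the difference follows. The column names and the `_imputed` / `_imputed_proba` suffixes are assumptions chosen for illustration, not taken from the datawig source.

```python
import pandas as pd

# Hypothetical frame: only a probability column exists, no plain imputation column.
df = pd.DataFrame({"feature": ["x"], "label_imputed_proba": [0.9]})
imputation_col = "label_imputed"  # illustrative name, assumed for this sketch

# Substring/regex match: "label_imputed" is found inside "label_imputed_proba".
substring_hit = bool(df.columns.str.contains(imputation_col).any())

# Exact membership: there is no column named exactly "label_imputed".
exact_hit = imputation_col in df.columns

print(substring_hit)  # True
print(exact_hit)      # False
```

For an overwrite check like the one at line 822, `imputation_col in data_frame.columns` is probably the closer replacement, since it only fires when a column with exactly that name exists.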

@felixbiessmann
Contributor

You're right, and we fixed those lines in the PR I mentioned earlier. In particular, the lines you mentioned are now compliant with the new pandas API; see for instance here.

Although the source code has been fixed for some time, that commit has not been released on pip yet. We'll make sure that this and some other mxnet-related fixes are released asap.

Thanks for noticing this!

@felixbiessmann
Contributor

This should be solved with the latest release; please reopen if the problem persists.

@ziadzee

ziadzee commented Jun 30, 2021

Hey, @felixbiessmann

It seems that this issue still persists. I get the initial error when trying to do:

imputer = datawig.SimpleImputer(
    input_columns = ["advice", "reason", "reason_id"],
    output_column = "advice_id"
)

imputer.fit(train_df = df3_train)
predictions = imputer.predict(df3_test)

Followed by error:

AttributeError                            Traceback (most recent call last)
<ipython-input-56-89a3fce1d6ab> in <module>
----> 1 predictions = imputer.predict(df3_test)

~/opt/anaconda3/lib/python3.8/site-packages/datawig/simple_imputer.py in predict(self, data_frame, precision_threshold, imputation_suffix, score_suffix, inplace)
    417         :return: data_frame original dataframe with imputations and likelihood in additional column
    418         """
--> 419         imputations = self.imputer.predict(data_frame, precision_threshold, imputation_suffix,
    420                                            score_suffix, inplace=inplace)
    421 

~/opt/anaconda3/lib/python3.8/site-packages/datawig/imputer.py in predict(self, data_frame, precision_threshold, imputation_suffix, score_suffix, inplace)
    820         for label, imputations in predictions:
    821             imputation_col = label + imputation_suffix
--> 822             if data_frame.columns.contains(imputation_col):
    823                 raise ColumnOverwriteException(
    824                     "DataFrame contains column {}; remove column and try again".format(

AttributeError: 'Index' object has no attribute 'contains'

Following the advice from the thread above, I tried the following locally in a notebook:

!pip install pandas==0.25.0
!pip install datawig

However, installing that pandas version seems to fail, apparently because it has been deprecated. Is there another way to get around this? Thanks!
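
One possible stopgap when pinning pandas is not an option is to add the missing method back at runtime before calling `predict`. The sketch below is an assumption-laden workaround, not something the maintainers have confirmed: the helper name `install_index_contains_shim` is hypothetical, it has not been tested against datawig itself, and other incompatibilities with newer pandas may remain.

```python
import pandas as pd


def install_index_contains_shim():
    """Add Index.contains back if the installed pandas lacks it.

    Returns True if the shim was installed, False if the method already existed.
    """
    if hasattr(pd.Index, "contains"):
        return False

    def contains(self, key):
        # Same meaning as the removed method: exact membership in the index.
        return key in self

    pd.Index.contains = contains
    return True


install_index_contains_shim()

# Column names below mirror the example above but are only illustrative.
df = pd.DataFrame({"advice": ["a"], "reason": ["b"]})
found = df.columns.contains("advice")
missing = df.columns.contains("advice_id_imputed")
print(found, missing)
```

The shim would need to run once, before `imputer.predict(...)` is called. Editing the installed imputer.py or installing datawig from the fixed source are alternatives that avoid monkeypatching pandas.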

@felixbiessmann
Contributor

Hey,

Thanks for the heads up. We're currently in the process of refactoring the package, and there's a pending PR that should solve some of these problems, but it's in a preliminary stage. Until the next release I'd recommend using the old package versions.

Thanks
Felix

@ziadzee

ziadzee commented Jun 30, 2021

Okay, thank you @felixbiessmann - in the meantime, do you recommend a particular old package version?

@maqboolkhan

Same problem. I am facing it too :(

@aabuabat

I am still facing the same issue. Any advice?

awslabs deleted two comments from CLWcynthia on Feb 16, 2022

@aagrawal357

Hi,

Is there an update on this issue? I am facing the same error:
AttributeError: 'Index' object has no attribute 'contains'

I have pandas 1.2.4 and going back to 0.25.0 is not an option since it's been deprecated.

@CLWcynthia
Author

CLWcynthia commented Oct 11, 2022 via email
