[BUG] Pandas dataframe has no attribute 'append' #2313

Open
philipwoods opened this issue Jul 26, 2024 · 4 comments

@philipwoods

Short description of the problem

anvi-analyze-synteny fails because pandas deprecated DataFrame.append() in version 1.4.0 in favor of pandas.concat(), and removed it entirely in version 2.0.

anvi'o version

Anvi'o .......................................: marie (v8)
Python .......................................: 3.10.13

Profile database .............................: 38
Contigs database .............................: 21
Pan database .................................: 16
Genome data storage ..........................: 7
Auxiliary data storage .......................: 2
Structure database ...........................: 2
Metabolic modules database ...................: 4
tRNA-seq database ............................: 2

System info

Operating system is RedHat enterprise Linux.
Anvi'o was installed in a conda environment.

Detailed description of the issue

I ran anvi-analyze-synteny on my pangenome and got the following error:

Traceback (most recent call last):
  File "/export/data1/sw/anaconda3-2019.07/envs/anvio-8/bin/anvi-analyze-synteny", line 75, in <module>
    ngram.report_ngrams_to_user()
  File "/export/data1/sw/anaconda3-2019.07/envs/anvio-8/lib/python3.10/site-packages/anvio/synteny.py", line 421, in report_ngrams_to_user
    df = self.convert_to_df()
  File "/export/data1/sw/anaconda3-2019.07/envs/anvio-8/lib/python3.10/site-packages/anvio/synteny.py", line 384, in convert_to_df
    df = df.append({'ngram': ngram,
  File "/export/data1/sw/anaconda3-2019.07/envs/anvio-8/lib/python3.10/site-packages/pandas/core/generic.py", line 6296, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'append'. Did you mean: '_append'?

Looking into it, I found that requirements.txt forces pandas==1.4.4, while DataFrame.append() has been deprecated since pandas version 1.4.0. Therefore I expect that this will be an issue in every part of anvi'o that currently uses the pandas DataFrame.append() method.
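A quick way to check which situation an environment is in is sketched below. DataFrame.append() still exists (with a FutureWarning) in pandas 1.4.x and is absent from 2.0 onward, so the AttributeError above suggests that the pandas actually installed is newer than the pin in requirements.txt. The helper name describe_append_support is hypothetical and not part of anvi'o.

```python
import pandas as pd


def describe_append_support():
    """Report the installed pandas version and whether DataFrame.append exists.

    Hypothetical diagnostic helper, not part of anvi'o.
    """
    return {
        "pandas_version": pd.__version__,
        "has_append": hasattr(pd.DataFrame, "append"),
    }


info = describe_append_support()
print(info)
```

If has_append is False, any code path that still calls DataFrame.append() will fail in that environment, regardless of what requirements.txt pins.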

@mschecht
Contributor

@philipwoods, thanks for posting this bug!

Weirdly, I was not able to reproduce it on my end, but I went ahead and refactored self.convert_to_df() to use pandas.concat(), which should fix this issue. If possible, could you post a gzipped tarball of the pangenome directory and the command you used? I want to reproduce it before I commit.
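For reference, a minimal sketch of this kind of refactor, assuming rows are accumulated as dictionaries inside a loop. This is not the actual anvi'o code: only the 'ngram' column name comes from the traceback, and the 'count' column and the record values are made up for illustration.

```python
import pandas as pd

# Hypothetical records; only the 'ngram' column name is taken from the traceback.
records = [
    {"ngram": "GC_1::GC_2::GC_3", "count": 4},
    {"ngram": "GC_2::GC_3::GC_4", "count": 2},
]

# Old pattern, which raises AttributeError on pandas >= 2.0:
#     df = pd.DataFrame()
#     for rec in records:
#         df = df.append(rec, ignore_index=True)

# Replacement 1: collect the rows first and build the frame once.
df_from_rows = pd.DataFrame(records)

# Replacement 2: build one single-row frame per record and concatenate once.
frames = [pd.DataFrame([rec]) for rec in records]
df_from_concat = pd.concat(frames, ignore_index=True)

print(df_from_rows.equals(df_from_concat))
```

Both replacements avoid growing a DataFrame row by row, which copies the whole frame on every iteration.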

@mschecht
Contributor

Here is the branch tracking this issue: master...deprecate-pandas-append-synteny

@philipwoods
Author

philipwoods commented Aug 7, 2024

Sorry for the delay! Here is the file and the command I used (I forget whether --annotation-source is necessary when using gene clusters as the ngram source, but if it is you can use --annotation-source COG20_FUNCTION).
pangenome.tar.gz
anvi-analyze-synteny --analyze-unknown-functions -n gene_clusters --ngram-window-range 3:15 -g ANME3EVO-revision-GENOMES.db -p pangenome/ANME3EVO-revision-PAN.db

@meren
Member

meren commented Aug 8, 2024

I ran your command in @mschecht's branch and got this error:

Functions found ..............................: EGGNOG_BEST_TAX, Pfam, COG20_CATEGORY, EGGNOG_BACT, COG20_FUNCTION, EGGNOG_PFAMs, EGGNOG_COG_CATEGORY, EGGNOG_BRITE, KEGG_BRITE, EGGNOG_KEGG_KO, KOfam, EGGNOG_GENE_FUNCTION_NAME,
                                                EGGNOG_KEGG_REACTION, EGGNOG_BiGG_REACTIONS, EGGNOG_KEGG_MODULE, EGGNOG_KEGG_PATHWAYS, EGGNOG_KEGG_TC, KEGG_Class, EGGNOG_EC_NUMBER, KEGG_Module, COG20_PATHWAY, EGGNOG_KEGG_RCLASS,
                                                EGGNOG_CAZy, EGGNOG_GO_TERMS
Genomes storage ..............................: Initialized (storage hash: hash45b805d1)
Num genomes in storage .......................: 67
Num genomes will be used .....................: 67

WARNING
===============================================
Anvi'o is now looking for Ngrams in your contigs!


* What do we say to loci that appear to have no coherent synteny patterns...? Not
  today! ⚔️

Traceback (most recent call last):
  File "/Users/meren/github/anvio/bin/anvi-analyze-synteny", line 74, in <module>
    ngram.report_ngrams_to_user()
  File "/Users/meren/github/anvio/anvio/synteny.py", line 420, in report_ngrams_to_user
    df = self.convert_to_df()
  File "/Users/meren/github/anvio/anvio/synteny.py", line 408, in convert_to_df
    ngram_count_df_final = pd.concat(ngram_count_df_list, ignore_index=True)
  File "/Users/meren/miniconda3/envs/anvio-dev/lib/python3.10/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/Users/meren/miniconda3/envs/anvio-dev/lib/python3.10/site-packages/pandas/core/reshape/concat.py", line 347, in concat
    op = _Concatenator(
  File "/Users/meren/miniconda3/envs/anvio-dev/lib/python3.10/site-packages/pandas/core/reshape/concat.py", line 404, in __init__
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate

So it is some improvement, but clearly there are more things to fix :)
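pandas.concat() raises this ValueError whenever it is handed an empty list, so one possible guard is sketched below. The helper name concat_or_empty and the column names are hypothetical, and the sketch only avoids the crash: the thread does not establish whether an empty list is a legitimate outcome for this input or a sign of a bug earlier in convert_to_df().

```python
import pandas as pd


def concat_or_empty(frames, columns):
    """Concatenate frames, or return an empty frame with `columns` if none remain.

    Hypothetical helper, not part of anvi'o.
    """
    # Drop placeholders and empty frames so they cannot affect the result.
    frames = [f for f in frames if f is not None and not f.empty]
    if not frames:
        # pd.concat([]) would raise "ValueError: No objects to concatenate".
        return pd.DataFrame(columns=columns)
    return pd.concat(frames, ignore_index=True)


# Usage with made-up column names:
result = concat_or_empty([], ["ngram", "count"])
print(result.empty, list(result.columns))
```

Returning a frame with the expected columns lets downstream code that selects those columns keep working when nothing was found.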


3 participants