This repository has been archived by the owner on Apr 19, 2023. It is now read-only.

[BUG] loom with non-standard CellID and Gene attributes [SCENIC] #279

Open
cflerin opened this issue Dec 10, 2020 · 1 comment
Labels
bug Something isn't working

Comments

@cflerin
Member

cflerin commented Dec 10, 2020

Describe the bug
When the SCENIC workflow is run on a loom input whose cell and gene attributes use non-standard names (anything other than the default CellID/Gene), the workflow fails to complete.

To Reproduce
Steps to reproduce the behavior:
0. Use a loom with the following column and row attributes (as an example):

In [3]: lf.ca.keys()
Out[3]: ['CellID_renamed', 'nGene', 'nUMI']

In [4]: lf.ra.keys()
Out[4]: ['Gene_renamed']
  1. Configure with these options:
nextflow pull vib-singlecell-nf/vsn-pipelines -r v0.23.0
nextflow config vib-singlecell-nf/vsn-pipelines -profile scenic,test__scenic,singularity > test_scenic.config

The cell and gene attributes are set in the SCENIC config section:

cell_id_attribute = 'CellID_renamed'
gene_attribute = 'Gene_renamed'
  2. Run using this entry point:
nextflow -C test_scenic.config run vib-singlecell-nf/vsn-pipelines -entry scenic -r v0.23.0
  3. See error:
N E X T F L O W  ~  version 20.04.1
Launching `vib-singlecell-nf/vsn-pipelines` [cheesy_mcnulty] - revision: 0a585c246f [v0.23.0]
WARN: It appears you have never run this project before -- Option `-resume` is ignored
WARN: DSL 2 IS AN EXPERIMENTAL FEATURE UNDER DEVELOPMENT -- SYNTAX MAY CHANGE IN FUTURE RELEASE
executor >  local (5)
[27/feab5a] process > scenic:SCENIC:ARBORETO_WITH_MULTIPROCESSING (1) [100%] 1 of 1 ✔
[2d/a61d5c] process > scenic:SCENIC:ADD_PEARSON_CORRELATION (1)       [100%] 1 of 1 ✔
[1c/7488bd] process > scenic:SCENIC:CISTARGET__MOTIF (1)              [100%] 1 of 1 ✔
[b3/7d308b] process > scenic:SCENIC:AUCELL__MOTIF (1)                 [100%] 1 of 1 ✔
[0f/08d00c] process > scenic:SCENIC:VISUALIZE (1)                     [100%] 1 of 1, failed: 1 ✘
[-        ] process > scenic:SCENIC:PUBLISH_LOOM                      -
[-        ] process > scenic:PUBLISH_SCENIC:COMPRESS_HDF5             -
[-        ] process > scenic:PUBLISH_SCENIC:SC__PUBLISH               -
[-        ] process > scenic:PUBLISH_SCENIC:SC__PUBLISH_PROXY         -
WARN: To render the execution DAG in the required format it is required to install Graphviz -- See http://www.graphviz.org for more info.
Error executing process > 'scenic:SCENIC:VISUALIZE (1)'

Caused by:
  Process `scenic:SCENIC:VISUALIZE (1)` terminated with an error exit status (1)

Command executed:

  /user/leuven/325/vsc32528/.nextflow/assets/vib-singlecell-nf/vsn-pipelines/src/scenic/bin/add_visualization.py             --loom_input scenic_CI__auc_mtf.loom             --loom_output scenic_visualize.loom             --num_workers 4

Command exit status:
  1

Command output:
  (empty)

Command error:
  Traceback (most recent call last):
    File "/usr/local/lib/python3.7/site-packages/loompy/attribute_manager.py", line 115, in __getattr__
      vals = self.__dict__["storage"][name]
  KeyError: 'CellID'

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "/user/leuven/325/vsc32528/.nextflow/assets/vib-singlecell-nf/vsn-pipelines/src/scenic/bin/add_visualization.py", line 86, in <module>
      visualize_AUCell(args)
    File "/user/leuven/325/vsc32528/.nextflow/assets/vib-singlecell-nf/vsn-pipelines/src/scenic/bin/add_visualization.py", line 53, in visualize_AUCell
      auc_mtx = pd.DataFrame(lf.ca.RegulonsAUC, index=lf.ca.CellID)
    File "/usr/local/lib/python3.7/site-packages/loompy/attribute_manager.py", line 123, in __getattr__
      raise AttributeError(f"'{type(self)}' object has no attribute '{name}'")
  AttributeError: '<class 'loompy.attribute_manager.AttributeManager'>' object has no attribute 'CellID'

Work dir:
  /ddn1/vol1/staging/leuven/stg_00002/lcb/cflerin/testruns/scenic-nf_testing/cellid_attr/work/0f/08d00c9f4029a57c19516d122be1a6

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

Expected behavior
Pipeline should be able to run with arbitrary cell/gene attribute labels.

Screenshots
NA

Please complete the following information:

  • OS: CentOS Linux release 7.8.2003 (Core)
  • Nextflow Version: 20.04.1
  • vsn-pipelines Version: v0.23.0

Additional context
This particular error is caused by:

auc_mtx = pd.DataFrame(lf.ca.RegulonsAUC, index=lf.ca.CellID)

But there are also a few other places where the cell and gene attributes are hard-coded, and these will cause problems as well:


Also important to note: this is related to aertslab/pySCENIC/issues/235, which caused a failure in the AUCell step when using pySCENIC 0.10.4. After fixing that bug in pySCENIC and using the pySCENIC dev version here (container = 'aertslab/pyscenic:dev'), we get the above problem.
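
For illustration, a minimal sketch of how the hard-coded `lf.ca.CellID` lookup quoted above could be replaced by a lookup keyed on a configurable name. This is not the fix that was applied in vsn-pipelines. The helper `get_attr` is hypothetical, and a plain dict stands in for `lf.ca` so the snippet runs without a loom file. It relies only on the attribute container exposing `keys()` (as shown in the reproduction steps) and item access by name. Whether `add_visualization.py` receives the attribute name as an argument is an assumption.

```python
def get_attr(attrs, name):
    """Fetch a loom attribute by a configurable name (hypothetical helper).

    `attrs` is any container with keys() and item access by name, such as
    lf.ca or lf.ra; a plain dict is used below as a stand-in.
    """
    keys = list(attrs.keys())
    if name not in keys:
        # Fail early with a message that lists what the file actually contains.
        raise KeyError(f"attribute {name!r} not found; available: {keys}")
    return attrs[name]


# Stand-in for lf.ca from the example loom in this report.
col_attrs = {
    "CellID_renamed": ["c1", "c2"],
    "nGene": [10, 20],
    "nUMI": [100, 200],
}

# Name taken from the config (cell_id_attribute = 'CellID_renamed').
cell_ids = get_attr(col_attrs, "CellID_renamed")

# In add_visualization.py the failing line might then become something like
# (args.cell_id_attribute is an assumed, hypothetical argument):
#   auc_mtx = pd.DataFrame(lf.ca.RegulonsAUC,
#                          index=get_attr(lf.ca, args.cell_id_attribute))
```

The same pattern would apply to the gene attribute (`gene_attribute = 'Gene_renamed'`) on `lf.ra`.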

@GreyRockIQ

Hello @cflerin
I have come across the same error at scenic:SCENIC:VISUALIZE (1).
Is there a way to proceed past this step in the pipeline? Or can the output of the previous steps be used for further analysis in R or Python?
Thanks
GreyRock
