Pandas error on aucell step when using loom input/output #51

Closed
cflerin opened this issue Feb 7, 2019 · 5 comments
Labels: aucell, bug (Something isn't working)

@cflerin
Contributor

cflerin commented Feb 7, 2019

This issue appears to be caused by a major update to pandas. Using pandas version 0.23.4, this step completes with no problems. Using the latest pandas (Jan 25, version 0.24.1), I get the following error:

$ pyscenic aucell expr_mat_converted.loom reg.csv -o auc.loom
2019-02-07 13:34:38,393 - pyscenic.cli.pyscenic - INFO - Loading expression matrix.
2019-02-07 13:34:38,397 - pyscenic.cli.pyscenic - INFO - Loading gene signatures.
2019-02-07 13:34:38,413 - pyscenic.cli.pyscenic - INFO - Calculating cellular enrichment.
2019-02-07 13:34:40,077 - pyscenic.cli.pyscenic - INFO - Writing results to file.
Traceback (most recent call last):
  File "/home/luna.kuleuven.be/u0125489/chris/envs/pyscenictest/bin/pyscenic", line 10, in <module>
    sys.exit(main())
  File "/home/luna.kuleuven.be/u0125489/chris/envs/pyscenictest/lib/python3.6/site-packages/pyscenic/cli/pyscenic.py", line 402, in main
    args.func(args)
  File "/home/luna.kuleuven.be/u0125489/chris/envs/pyscenictest/lib/python3.6/site-packages/pyscenic/cli/pyscenic.py", line 196, in aucell_command
    append_auc_mtx(args.output.name, auc_mtx, signatures)
  File "/home/luna.kuleuven.be/u0125489/chris/envs/pyscenictest/lib/python3.6/site-packages/pyscenic/cli/utils.py", line 241, in append_auc_mtx
    _, auc_thresholds = binarize(auc_mtx)
  File "/home/luna.kuleuven.be/u0125489/chris/envs/pyscenictest/lib/python3.6/site-packages/pyscenic/binarization.py", line 45, in binarize
    return (auc_mtx > thresholds).astype(int), thresholds
  File "/home/luna.kuleuven.be/u0125489/chris/envs/pyscenictest/lib/python3.6/site-packages/pandas/core/ops.py", line 2103, in f
    level=None)
  File "/home/luna.kuleuven.be/u0125489/chris/envs/pyscenictest/lib/python3.6/site-packages/pandas/core/ops.py", line 1930, in _combine_series_frame
    return self._combine_match_columns(other, func, level=level)
  File "/home/luna.kuleuven.be/u0125489/chris/envs/pyscenictest/lib/python3.6/site-packages/pandas/core/frame.py", line 5116, in _combine_match_columns
    return ops.dispatch_to_series(left, right, func, axis="columns")
  File "/home/luna.kuleuven.be/u0125489/chris/envs/pyscenictest/lib/python3.6/site-packages/pandas/core/ops.py", line 1157, in dispatch_to_series
    new_data = expressions.evaluate(column_op, str_rep, left, right)
  File "/home/luna.kuleuven.be/u0125489/chris/envs/pyscenictest/lib/python3.6/site-packages/pandas/core/computation/expressions.py", line 208, in evaluate
    return _evaluate(op, op_str, a, b, **eval_kwargs)
  File "/home/luna.kuleuven.be/u0125489/chris/envs/pyscenictest/lib/python3.6/site-packages/pandas/core/computation/expressions.py", line 68, in _evaluate_standard
    return op(a, b)
  File "/home/luna.kuleuven.be/u0125489/chris/envs/pyscenictest/lib/python3.6/site-packages/pandas/core/ops.py", line 1144, in column_op
    for i in range(len(a.columns))}
  File "/home/luna.kuleuven.be/u0125489/chris/envs/pyscenictest/lib/python3.6/site-packages/pandas/core/ops.py", line 1144, in <dictcomp>
    for i in range(len(a.columns))}
  File "/home/luna.kuleuven.be/u0125489/chris/envs/pyscenictest/lib/python3.6/site-packages/pandas/core/ops.py", line 1745, in wrapper
    raise ValueError('Lengths must match to compare')
ValueError: Lengths must match to compare

However, this step completes correctly when an expression matrix text file is used as input (this works on both pandas versions). There appear to be differences in either the AUC matrix or the binarization step between loom input and text-file input that cause this.
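
One way to narrow this down would be to compare the AUC matrix obtained from the loom input with the one obtained from the text input. The following is only a minimal sketch, not part of pySCENIC: `compare_auc_matrices` is a hypothetical helper, and the two small frames are made-up stand-ins for the real matrices. The duplicated column label is invented purely to show what the report would look like; it is not a claim about the actual cause.

```python
import pandas as pd


def compare_auc_matrices(a: pd.DataFrame, b: pd.DataFrame) -> dict:
    """Summarise structural differences between two AUC matrices (hypothetical helper)."""

    def duplicated_labels(index: pd.Index) -> list:
        # Labels that occur more than once in the index.
        return sorted(set(index[index.duplicated()]))

    return {
        "shape_a": a.shape,
        "shape_b": b.shape,
        "columns_only_in_a": sorted(set(a.columns) - set(b.columns)),
        "columns_only_in_b": sorted(set(b.columns) - set(a.columns)),
        "duplicate_columns_a": duplicated_labels(a.columns),
        "duplicate_columns_b": duplicated_labels(b.columns),
        "same_column_order": list(a.columns) == list(b.columns),
    }


# Toy stand-ins for the matrices obtained from text and loom input (made-up values).
from_text = pd.DataFrame(
    [[0.1, 0.4], [0.6, 0.2]], index=["cell1", "cell2"], columns=["regA", "regB"]
)
from_loom = pd.DataFrame(
    [[0.1, 0.4], [0.6, 0.2]], index=["cell1", "cell2"], columns=["regA", "regA"]
)

report = compare_auc_matrices(from_text, from_loom)
for key, value in report.items():
    print(f"{key}: {value}")
```

Any non-empty entry in the report would point at a structural difference worth checking before looking at the binarization code itself.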

@bramvds
Contributor

bramvds commented Feb 7, 2019

Hi Chris,

It definitely has something to do with the binarization, as this is only done when creating a loom file as output. After looking at the stack trace and the code base, it probably has something to do with line 45 ( return (auc_mtx > thresholds).astype(int), thresholds ) and the way pandas broadcasts operations when the shapes of the compared DataFrames don't match. I'll have to investigate further though.
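
To illustrate the kind of comparison involved, here is a minimal sketch of a more defensive variant that aligns the thresholds to the matrix columns explicitly before comparing. It is an assumption-laden illustration, not the fix that went into pySCENIC, and the root cause is not established at this point: `binarize_aligned` is a hypothetical name, and the toy data is made up.

```python
import pandas as pd


def binarize_aligned(auc_mtx: pd.DataFrame, thresholds: pd.Series) -> pd.DataFrame:
    """Compare each column of auc_mtx with its own threshold (hypothetical helper)."""
    # Duplicate labels make label-based alignment ambiguous, so reject them up front.
    if not auc_mtx.columns.is_unique:
        raise ValueError("auc_mtx has duplicate column labels")
    if not thresholds.index.is_unique:
        raise ValueError("thresholds has duplicate index labels")

    # Every column needs a threshold.
    missing = sorted(set(auc_mtx.columns) - set(thresholds.index))
    if missing:
        raise ValueError(f"no threshold for columns: {missing}")

    # Put the thresholds in the same order as the matrix columns,
    # then compare column-wise with the flexible comparison method.
    aligned = thresholds.reindex(auc_mtx.columns)
    return auc_mtx.gt(aligned, axis="columns").astype(int)


# Toy data (made-up values); the thresholds are deliberately in a different order.
auc_mtx = pd.DataFrame(
    {"regA": [0.1, 0.6], "regB": [0.4, 0.2]}, index=["cell1", "cell2"]
)
thresholds = pd.Series({"regB": 0.3, "regA": 0.5})

result = binarize_aligned(auc_mtx, thresholds)
print(result)
```

Raising on missing or duplicate labels turns a shape or label mismatch into an explicit error message instead of leaving it to whatever pandas does internally in a given version.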

Kr,
Bram

@bramvds bramvds added the bug Something isn't working label Feb 7, 2019
@bramvds bramvds added the aucell label Mar 8, 2019
@bramvds bramvds self-assigned this Mar 8, 2019
@bramvds
Contributor

bramvds commented Jul 7, 2019

Hi Chris,

When using the latest version of pandas (0.24.2) I'm not able to replicate this error.

Kindest regards,
Bram

@gokceneraslan

@bramvds is pandas==0.23.4 in requirements.txt still required then?

@gokceneraslan

It'd be great if you can replace pandas==0.23.4 with something more permissive like pandas!=0.24.1 or so.
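
The effect of such an exclusion can be sketched in plain Python. This is a simplified illustration only: real pip specifier matching follows PEP 440 and is richer than the exact string match used here, and `is_allowed` and `EXCLUDED_PANDAS_VERSIONS` are hypothetical names.

```python
# Assumption: only the release named in this thread is excluded.
EXCLUDED_PANDAS_VERSIONS = frozenset({"0.24.1"})


def is_allowed(installed_version: str, excluded=EXCLUDED_PANDAS_VERSIONS) -> bool:
    """Mimic an exclusion such as pandas!=0.24.1 by exact string match (simplified)."""
    return installed_version.strip() not in excluded


for version in ("0.23.4", "0.24.1", "0.24.2"):
    print(version, "allowed" if is_allowed(version) else "excluded")
```

Unlike the exact pin, this leaves both the older 0.23.4 and the newer 0.24.2 installable.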

bramvds added a commit that referenced this issue Oct 9, 2019
@bramvds
Contributor

bramvds commented Oct 9, 2019

Fixed in latest release.

@bramvds bramvds closed this as completed Oct 9, 2019