pVACvector failing #54

Closed
yang-yangfeng opened this issue Dec 13, 2017 · 13 comments

@yang-yangfeng
Contributor

Reporting 2 runs with different inputs and different failures:

I first tried pVACvector with a list of the best peptide per variant from my pVACseq results, which contained 24 peptides. It errored out with "OSError: [Errno 2] No such file or directory", so I'm wondering if this was a cluster issue?

Command:
pvacvector run -k --iedb-install-directory /gscmnt/gc2502/griffithlab/yafeng -e 8,9,10,11 -v /gscmnt/gc2142/griffithlab/yafeng/cwl_toil_runs/somatic_variants/annotated_somatic_pick_87.vcf /gscmnt/gc2142/griffithlab/yafeng/cwl_toil_runs/pvacseq_output/hcc1395_pick/MHC_Class_I/best_variant.tsv sample HLA-A*29:02,HLA-A*29:02,HLA-B*08:01,HLA-B*45:01,HLA-C*07:01,HLA-C*06:02 NNalign NetMHC NetMHCIIpan NetMHCcons NetMHCpan PickPocket SMM SMMPMBEC SMMalign hcc1395_vector

Output:

ID: TLN2.ENST00000561311.missense.2265E/K_pos9_len8, sequence: VLVILQKPTPKFKQQLAAFSKRVAG
ID: ZBTB3.ENST00000394807.missense.455S/F_pos3_len11, sequence: PLPAPASLHEPLYLSFEYEAAPGSF
ID: TESK1.ENST00000620767.missense.539H/Y_pos11_len9, sequence: GEPWNRAQYSLPRAAALERTEPSPP
ID: MAP7D3.ENST00000316077.missense.628Q/R_pos5_len9, sequence: REKEEEERQREEMQRRVIKKSKDMA
ID: SLC35E1.ENST00000595753.missense.26E/Q_pos10_len9, sequence: AASSSGGARQGARVAALCLLWYALS
ID: HLA-C.ENST00000376228.missense.90K/N_pos4_len9, sequence: WVEQEGPEYWDRETQNYKRQAQADR
ID: VPS54.ENST00000272322.missense.453D/H_pos11_len9, sequence: PQWFDLLKHIFSKFTIFLQRVKATL
ID: KNL1.ENST00000346991.missense.865P/L_pos10_len8, sequence: DESVQKPKFLKEKQNVKIWGRKSVG
ID: PRKX.ENST00000262848.missense.43V/A_pos4_len9, sequence: PALCPSPEALSPEPPAYSLQDFDTL
ID: DDX3X.ENST00000399959.missense.294R/T_pos8_len9, sequence: YEEARKFSYRSTVRPCVVYGGADIG
ID: CD83.ENST00000379153.missense.86N/S_pos6_len10, sequence: HQKGQNGSFDAPSERPYSLKIRNTT
ID: TSPAN4.ENST00000397404.missense.67I/M_pos9_len9, sequence: LIITGAFVMAMGFVGCLGAIKENKC
ID: SIX4.ENST00000216513.missense.23E/Q_pos6_len8, sequence: IASAADIKQENGMQSASEGQEAHRE
ID: STX12.ENST00000373943.missense.88P/R_pos6_len9, sequence: KETNELLKELGSLRLPLSTSEQRQQ
ID: TPM4.ENST00000344824.missense.240E/Q_pos4_len10, sequence: KLLSDKLKEAETRAQFAERTVAKLE
ID: TUBGCP6.ENST00000248846.missense.220H/R_pos9_len9, sequence: TRVSLFGALVRSRTYDMDVRLGLPP
ID: SURF1.ENST00000371974.missense.89N/K_pos4_len9, sequence: FGLGTWQVQRRKWKLKLIAELESRV
ID: PLCD3.ENST00000619929.inframe_del.233-238SNNDRL/S_pos6_len9, sequence: DMYAYLLFKECDHSEGAEIEEFLRR
ID: TDP2.ENST00000378198.missense.249Q/E_pos11_len9, sequence: LKMVLKKMEEAPESATVIFAGDTNL
ID: RFWD3.ENST00000361070.missense.564I/V_pos5_len10, sequence: EANYIYAGLANGSVLVYDVRNTSSH
ID: TRPM7.ENST00000313478.missense.153K/T_pos5_len9, sequence: VHGGMQKFELHPRITQLLGKGLIKA
ID: ZNF25.ENST00000302609.missense.21E/K_pos7_len8, sequence: TLKDVIVEFTKEKWKLLTPAQRTLY
ID: MAP7D3.ENST00000316077.missense.502E/A_pos1_len11, sequence: KKRLSSYTECYKWSSSPANACGLPS
ID: ATRX.ENST00000373344.missense.929E/Q_pos9_len10, sequence: GVDKLSGKEQSFTSLEVRKVAETKE
FASTA file written
Executing MHC Class I predictions
Generating Variant Peptide FASTA and Key Files
Generating Variant Peptide FASTA and Key Files - Entries 1-2
Completed
Processing entries for Allele HLA-A*29:02 and Epitope Length 8 - Entries 1-2
Running IEDB on Allele HLA-A*29:02 and Epitope Length 8 with Method NetMHC - Entries 1-2
Completed
Running IEDB on Allele HLA-A*29:02 and Epitope Length 8 with Method NetMHCcons - Entries 1-2
Completed
Running IEDB on Allele HLA-A*29:02 and Epitope Length 8 with Method NetMHCpan - Entries 1-2
Completed
Running IEDB on Allele HLA-A*29:02 and Epitope Length 8 with Method PickPocket - Entries 1-2
Completed
Running IEDB on Allele HLA-A*29:02 and Epitope Length 8 with Method SMM - Entries 1-2
Completed
Running IEDB on Allele HLA-A*29:02 and Epitope Length 8 with Method SMMPMBEC - Entries 1-2
Completed
Parsing IEDB Output for Allele HLA-A*29:02 and Epitope Length 8 - Entries 1-2
Completed
Processing entries for Allele HLA-A*29:02 and Epitope Length 9 - Entries 1-2
Running IEDB on Allele HLA-A*29:02 and Epitope Length 9 with Method NetMHC - Entries 1-2
Completed
Running IEDB on Allele HLA-A*29:02 and Epitope Length 9 with Method NetMHCcons - Entries 1-2
Completed
Running IEDB on Allele HLA-A*29:02 and Epitope Length 9 with Method NetMHCpan - Entries 1-2
Completed
Running IEDB on Allele HLA-A*29:02 and Epitope Length 9 with Method PickPocket - Entries 1-2
Completed
Running IEDB on Allele HLA-A*29:02 and Epitope Length 9 with Method SMM - Entries 1-2
Completed
Running IEDB on Allele HLA-A*29:02 and Epitope Length 9 with Method SMMPMBEC - Entries 1-2
Completed
Parsing IEDB Output for Allele HLA-A*29:02 and Epitope Length 9 - Entries 1-2
Completed
Processing entries for Allele HLA-A*29:02 and Epitope Length 10 - Entries 1-2
Running IEDB on Allele HLA-A*29:02 and Epitope Length 10 with Method NetMHC - Entries 1-2
Traceback (most recent call last):
  File "/gscmnt/gc2502/griffithlab/yafeng/mhc_i/src/predict_binding.py", line 383, in <module>
    Prediction().main()
  File "/gscmnt/gc2502/griffithlab/yafeng/mhc_i/src/predict_binding.py", line 375, in main
    elif (len(args)  == 4):                            self.commandline_input(args)  # args=[method, mhc, length, fname]
  File "/gscmnt/gc2502/griffithlab/yafeng/mhc_i/src/predict_binding.py", line 107, in commandline_input
    mhc_scores = mhc_predictor.predict(input.input_protein.sequences)
  File "/gscmnt/gc2502/griffithlab/yafeng/mhc_i/src/seqpredictor.py", line 1040, in predict
    scores.append(predictor.predict_sequence(sequence,pred))
  File "/gscmnt/gc2502/griffithlab/yafeng/mhc_i/src/seqpredictor.py", line 255, in predict_sequence
    scores = netmhc_prediction(allele_name, str(self.length), sequence)
  File "/gscmnt/gc2502/griffithlab/yafeng/mhc_i/method/netmhc_4_0_executable/__init__.py", line 32, in predict
    process = Popen(cmd, stdout=PIPE)
  File "/gscuser/yafeng/miniconda2/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/gscuser/yafeng/miniconda2/lib/python2.7/subprocess.py", line 1343, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
Traceback (most recent call last):
  File "/gscuser/yafeng/miniconda2/envs/python3/bin/pvacvector", line 11, in <module>
    load_entry_point('pvacseq', 'console_scripts', 'pvacvector')()
  File "/gscmnt/gc2142/griffithlab/yafeng/pVACtools/tools/pvacvector/main.py", line 33, in main
    args[0].func.main(args[1])
  File "/gscmnt/gc2142/griffithlab/yafeng/pVACtools/tools/pvacvector/run.py", line 268, in main
    parsed_output_files = run_pipelines(input_file, base_output_dir, args)
  File "/gscmnt/gc2142/griffithlab/yafeng/pVACtools/tools/pvacvector/run.py", line 79, in run_pipelines
    parsed_output_files.extend(pipeline_i.call_iedb_and_parse_outputs([[1, 1]]))
  File "/gscmnt/gc2142/griffithlab/yafeng/pVACtools/lib/pipeline.py", line 470, in call_iedb_and_parse_outputs
    '-e', self.iedb_executable,
  File "/gscmnt/gc2142/griffithlab/yafeng/pVACtools/lib/call_iedb.py", line 56, in main
    response = run(prediction_class_object.iedb_executable_params(args), stdout=PIPE, check=True)
  File "/gscuser/yafeng/miniconda2/envs/python3/lib/python3.5/subprocess.py", line 398, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['python2.7', '/gscmnt/gc2502/griffithlab/yafeng/mhc_i/src/predict_binding.py', 'ann', 'HLA-A*29:02', '10', '/gscmnt/gc2142/griffithlab/yafeng/cwl_toil_runs/pvacseq_output/hcc1395_vector/MHC_Class_I/tmp/sample_.fa.split_1-2']' returned non-zero exit status 1

I then tried running with a smaller set of inputs (8 snvs + 6 fusions). The earliest error complained about "UnboundLocalError: local variable 'plen' referenced before assignment"

Command:
pvacvector run -k --iedb-install-directory /gscmnt/gc2502/griffithlab/yafeng -e 8,9,10,11 /gscmnt/gc2142/griffithlab/yafeng/cwl_toil_runs/pvacseq_output/hcc1395_vector_chosen/vector_input.fa sample HLA-A*29:02,HLA-A*29:02,HLA-B*08:01,HLA-B*45:01,HLA-C*07:01,HLA-C*06:02 NNalign NetMHC NetMHCIIpan NetMHCcons NetMHCpan PickPocket SMM SMMPMBEC SMMalign hcc1395_vector_chosen

See log here: /gscuser/yafeng/pvacvector_chosen.lsf

@susannasiebert
Contributor

  1. This error is from IEDB itself. I'm not sure what is going on. I reduced the command to just the allele/length/prediction method combination where the error occurred and was not able to reproduce it. Can you try rerunning it and see if the error persists? I haven't tried it myself, but you might be able to rerun with the same output directory, and it might not try to recreate the existing outputs.

  2. I see the error you mentioned, and that is inside of IEDB as well. It doesn't seem to cause a failure of the processing. The real reason it is failing is _tkinter.TclError: no display name and no $DISPLAY environment variable. This seems to be related to X forwarding (https://stackoverflow.com/questions/37604289/tkinter-tclerror-no-display-name-and-no-display-environment-variable). We should try to set matplotlib.use('Agg') inside of the program and see if that fixes it.
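
For context on the proposed fix: matplotlib.use('Agg') selects a non-interactive backend for matplotlib only, so it would help only if matplotlib is the library that tries to open a window. A minimal sketch of the underlying idea, checking for $DISPLAY before picking a backend, might look like the following. The helper names has_display and choose_backend are hypothetical and not part of pVACtools, and the check assumes a Linux/X11 environment.

```python
import os


def has_display(env=None):
    """Return True if an X display appears to be configured."""
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY"))


def choose_backend(env=None):
    """Pick a matplotlib backend name: TkAgg with a display, Agg without one."""
    return "TkAgg" if has_display(env) else "Agg"


if __name__ == "__main__":
    backend = choose_backend()
    print("DISPLAY set:", has_display(), "-> backend:", backend)
    # With matplotlib installed, the choice would be applied before
    # importing pyplot:
    #   import matplotlib
    #   matplotlib.use(backend)
```

On a cluster node without X forwarding, choose_backend() would return "Agg", which renders to files instead of a window.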

@yang-yangfeng
Contributor Author

Still getting the same error:

Traceback (most recent call last):
  File "/gscuser/yafeng/miniconda2/envs/python3/bin/pvacvector", line 11, in <module>
    load_entry_point('pvacseq', 'console_scripts', 'pvacvector')()
  File "/gscmnt/gc2142/griffithlab/yafeng/pVACtools/tools/pvacvector/main.py", line 33, in main
    args[0].func.main(args[1])
  File "/gscmnt/gc2142/griffithlab/yafeng/pVACtools/tools/pvacvector/run.py", line 273, in main
    VectorVisualization(results_file, base_output_dir).draw()
  File "/gscmnt/gc2142/griffithlab/yafeng/pVACtools/lib/vector_visualization.py", line 25, in __init__
    self.turtle = turtle.Turtle()
  File "/gscuser/yafeng/miniconda2/envs/python3/lib/python3.5/turtle.py", line 3812, in __init__
    Turtle._screen = Screen()
  File "/gscuser/yafeng/miniconda2/envs/python3/lib/python3.5/turtle.py", line 3662, in Screen
    Turtle._screen = _Screen()
  File "/gscuser/yafeng/miniconda2/envs/python3/lib/python3.5/turtle.py", line 3678, in __init__
    _Screen._root = self._root = _Root()
  File "/gscuser/yafeng/miniconda2/envs/python3/lib/python3.5/turtle.py", line 434, in __init__
    TK.Tk.__init__(self)
  File "/gscuser/yafeng/miniconda2/envs/python3/lib/python3.5/tkinter/__init__.py", line 1876, in __init__
    self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: no display name and no $DISPLAY environment variable

@yang-yangfeng
Contributor Author

Already discussed with @susannasiebert, but for the sake of documenting:

works with ssh -X, both bsub and locally (this is without the matplotlib change)

@susannasiebert
Contributor

Great. Were you able to get a successful run for (1)?

@yang-yangfeng
Contributor Author

I restarted it on Dec 14th (as in ran the same command in the same directory without clearing any output) and it is still running. There were quite a few peptides so seems like everything is going as expected so far?

@susannasiebert
Contributor

Did it already process Allele HLA-A*29:02 and Epitope Length 10 with Method NetMHC?

@yang-yangfeng
Contributor Author

yang-yangfeng commented Dec 19, 2017

Yes, correct, it made it past the break point from last time. It just failed again with the same OS error, but it made progress.

See log here: /gscuser/yafeng/pvacvector_OS_error.lsf

@susannasiebert
Contributor

The error message and the fact that it happens at different points in the process make me think that this is a filesystem problem where the intermediate file seems to disappear intermittently. All of these "Running IEDB on" steps run on the same fasta file, and "No such file or directory" makes me think that file disappears or isn't accessible intermittently.

When you ran it the second time, did you run it on the same, existing output directory? Did it try to recreate the existing files or did it use the existing ones?
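
If the failures really are transient filesystem hiccups, one possible workaround is to retry the external call a few times before giving up. The sketch below is only an illustration of that idea under that assumption; run_with_retries and its parameters are hypothetical names, not something pVACtools or IEDB provides. Note that when Popen itself raises "No such file or directory", it is the program being launched (or the working directory) that could not be found, so the sketch retries on OSError as well as on a non-zero exit status.

```python
import subprocess
import sys
import time


def run_with_retries(cmd, attempts=3, delay=1.0):
    """Run cmd, retrying when it fails to launch or exits non-zero."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # check=True raises CalledProcessError on a non-zero exit status.
            return subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
        except (OSError, subprocess.CalledProcessError) as error:
            last_error = error
            if attempt < attempts:
                # Linear backoff: wait a little longer after each failure.
                time.sleep(delay * attempt)
    # Every attempt failed; surface the most recent error to the caller.
    raise last_error


if __name__ == "__main__":
    result = run_with_retries([sys.executable, "-c", "print('ok')"])
    print(result.stdout.decode().strip())
```

A retry only hides the symptom; if the error persists across all attempts, it is re-raised unchanged so the original traceback is still visible.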

@yang-yangfeng
Contributor Author

I ran it on the same, existing output dir and it used existing files.
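
The resume behaviour described here can be pictured as a "skip what already exists" check. This is a rough sketch of that pattern, not the actual pVACtools logic; pending_outputs and the file names are made up for illustration.

```python
import os
import tempfile


def pending_outputs(expected_paths):
    """Return the expected output files that are missing or empty."""
    return [
        path for path in expected_paths
        if not (os.path.isfile(path) and os.path.getsize(path) > 0)
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as out_dir:
        done = os.path.join(out_dir, "done.tsv")  # hypothetical finished output
        todo = os.path.join(out_dir, "todo.tsv")  # hypothetical missing output
        with open(done, "w") as handle:
            handle.write("allele\tlength\n")
        # Only the missing file should be reported as still pending.
        print(pending_outputs([done, todo]) == [todo])
```

Treating empty files as pending is a design choice in this sketch: a step that crashed mid-write may leave a zero-byte file behind, and that file should be regenerated rather than reused.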

@susannasiebert
Contributor

@yang-yangfeng can this issue be resolved?

@yang-yangfeng
Contributor Author

@susannasiebert so I just restarted (1) again. This is the one that kept failing periodically but made progress, so it seems like perhaps the files are just becoming inaccessible intermittently. Not sure if this is a problem that needs to be addressed, because it may just be internal and will only occur with a large number of peptides (I used 24).

For (2), we were able to fix this with "ssh -X", so if we're ok with just making a note about this in the docs, I don't know that we need to continue pursuing a solution for this.

@yang-yangfeng
Contributor Author

Oops didn't mean to close this before others had a chance to chime in

@susannasiebert
Contributor

This issue is over a year old. With the refactor/pVACvector bugfix and the new multithreading option, this will hopefully not be an issue anymore.
