nohup not helping in Helixer.py, run is stopped when connection is lost #128

Open
alexandrosbousios opened this issue May 12, 2024 · 5 comments

Comments

@alexandrosbousios

Hi Team,

Great tool, thank you so much! Unless I am doing something wrong, when I run Helixer.py with nohup in the terminal, the process is killed when the connection with the server freezes, so nohup is not doing what it is supposed to do in this case. The genome is big and it will take >1 day to finish. I have tried with A. lyrata chr8, and with other small genomes, and they all finish fine in a short period of time.

Is there any way around this?

Thank you
Alex

@nhartwic

I'm not the dev.

I've been running helixer using singularity and have encountered no issues backgrounding and nohuping. Example execution below...

nohup singularity run --bind /data1:/data1 --nv helixer-docker_helixer_v0.3.2_cuda_11.8.0-cudnn8.sif  Helixer.py \
    --fasta-path Taestivum.Chinese_Spring.HPIv02.fasta \
    --gff-output-path Taestivum.Chinese_Spring.HPIv02.gff3 \
    --lineage land_plant \
    --species Taestivum.Chinese_Spring.HPIv02 \
    > Taestivum.Chinese_Spring.HPIv02.log 2>&1 &

If I were you, I'd double check the commands you are using, then double check your environment and relevant bashrc/login/whatever. And I guess also try the containerized version of Helixer if you haven't?

IDK. Just trying to be helpful.

@colindaven

Nohup should work, but try using screen or tmux as an alternative. I haven't seen any problems like this running helixer via a nextflow pipeline using a singularity container.
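A minimal sketch of the tmux route, for anyone who has not used it. The session name `helixer_run` is a made-up label and `sleep 30` is only a stand-in for the real Helixer.py command line; the block is guarded so it does nothing harmful on a machine without tmux.

```shell
# Hypothetical session name; any label works.
session=helixer_run

if command -v tmux >/dev/null 2>&1; then
    # Start a detached session running a stand-in job.
    # Replace 'sleep 30' with the actual Helixer.py invocation.
    tmux new-session -d -s "$session" 'sleep 30'

    # Confirm the session exists; it keeps running if the SSH link drops.
    tmux has-session -t "$session" && echo "session $session is running"
else
    # screen can do the same thing with: screen -dmS NAME COMMAND
    echo "tmux is not installed on this machine"
fi

# After logging back in, reattach with:
#   tmux attach -t helixer_run
```

The job lives inside the tmux server on the remote machine rather than in the SSH login shell, so a frozen or dropped connection should not take it down.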

@alisandra
Collaborator

Hi, yeah, I am unaware of any interaction between Helixer and nohup, and such an interaction would be surprising.

I would hazard a guess that the run could be failing for a different reason on the larger genome, and that nohup is only obscuring the output.

Thus I agree with the comment from @nhartwic, which importantly includes an example on how to differentiate standard error and standard out with nohup. Were I running this, I might further change 2>&1 to 2>Taestivum.Chinese_Spring.HPIv02.err or similar, just for my own organization. Then the useful error messages would be in Taestivum.Chinese_Spring.HPIv02.err.
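The redirection described above can be tried on a harmless stand-in before committing to a day-long run. This is only a sketch: the `sh -c '...'` command stands in for the Helixer.py invocation, and `run.out` / `run.err` are placeholder file names, not files Helixer itself creates.

```shell
# Stand-in for the real job (assumption: swap in the Helixer.py command line).
# It writes one line to stdout and one line to stderr.
# stdin is taken from /dev/null so the job holds no reference to the terminal.
nohup sh -c 'echo "progress message"; echo "error message" >&2' \
    < /dev/null > run.out 2> run.err &

# $! is the PID of the background job; note it down so the job can be
# checked later with: ps -p <PID>
job_pid=$!

# Only for this demo: wait for the job so the files are complete before reading.
wait "$job_pid"

# Anything the job reported as an error is now isolated in run.err.
cat run.err
```

If a long run dies, `run.err` is the first place to look for a reason unrelated to the dropped connection, such as running out of memory.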

I can also only second @colindaven's advice on using tmux or screen. Either of those will make remote work easier, in general.

@colindaven

@alexandrosbousios you might need to also google ssh keepalive and look into associated answers if this issue persists, but it is related to your infrastructure and nothing to do with helixer.
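For reference, a sketch of what the keepalive setup usually looks like on the client side. `ServerAliveInterval` and `ServerAliveCountMax` are standard OpenSSH client options; the host alias `myserver`, the address `server.example.org`, and the values 60 and 3 are placeholders to adapt. The snippet writes to a scratch file instead of touching `~/.ssh/config`.

```shell
# Write an example client config to a scratch file.
# ServerAliveInterval: seconds of silence before the client sends a probe.
# ServerAliveCountMax: unanswered probes tolerated before the client disconnects.
cat > ssh_keepalive.example <<'EOF'
Host myserver
    HostName server.example.org
    ServerAliveInterval 60
    ServerAliveCountMax 3
EOF

# To try it without editing the main config (placeholder host, so this is
# left as a comment):
#   ssh -F ssh_keepalive.example myserver
```

Keepalives only make an idle connection less likely to be dropped. They do not protect a running job if the link fails anyway, so they complement nohup, tmux, or screen rather than replace them.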

@alexandrosbousios
Author

Thank you all for the advice and discussion. I know that nohup is expected to work; possibly my setup was not optimal. I will try what you suggest in future Helixer runs and come back to this thread with updates!

4 participants