Fix fast container job hangs #1962

Merged

kuanyili
Contributor

Starting from Podman 4.4.0, cidfile is removed along with the container.

Always return once the container has terminated, so we don't get stuck in the loop waiting for a file that has already been removed.

Fixes: #1961
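
The idea behind the fix can be illustrated with a minimal sketch. This is not the actual `docker_monitor` code from cwltool; the function name `wait_for_cid`, its parameters, and the polling interval are hypothetical and chosen only for illustration. The point is that the polling loop checks whether the container process has exited on every iteration, so a cidfile that never appears (or was already removed along with the container) cannot keep the loop spinning forever.

```python
import os
import subprocess
import sys
import tempfile
import time
from typing import Optional


def wait_for_cid(
    cidfile: str,
    process: "subprocess.Popen[bytes]",
    poll_interval: float = 0.05,
) -> Optional[str]:
    """Hypothetical helper: return the container ID, or None if the process ended first."""
    while True:
        try:
            with open(cidfile) as handle:
                cid = handle.read().strip()
        except FileNotFoundError:
            # Not written yet, or already removed together with the container.
            cid = ""
        if cid:
            return cid
        if process.poll() is not None:
            # The container process has terminated. Assumption from the PR text:
            # with Podman >= 4.4.0 the cidfile is gone by now, so stop waiting.
            return None
        time.sleep(poll_interval)


if __name__ == "__main__":
    # A very fast "job" whose cidfile never exists: the call returns instead of hanging.
    fast_job = subprocess.Popen([sys.executable, "-c", "pass"])
    missing = os.path.join(tempfile.mkdtemp(), "never-written.cid")
    print(wait_for_cid(missing, fast_job))
```

Checking the process state inside the loop, rather than only after the file shows up, is what keeps very short-lived containers from causing a hang in this sketch.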

@mr-c mr-c enabled auto-merge (rebase) January 2, 2024 11:56

codecov bot commented Jan 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (9b5a6ea) 83.80% compared to head (399b9db) 83.80%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1962   +/-   ##
=======================================
  Coverage   83.80%   83.80%           
=======================================
  Files          46       46           
  Lines        8221     8221           
  Branches     2182     2182           
=======================================
  Hits         6890     6890           
- Misses        852      854    +2     
+ Partials      479      477    -2     


@kuanyili
Contributor Author

kuanyili commented Jan 3, 2024

Hmm.... It's strange that such a small change triggers this many CI errors.

Some udocker jobs are failing, but the change is in docker_monitor, a function that udocker jobs do NOT use. On the other hand, this patch runs well on my machine and fixes the aforementioned issue beautifully.

The CI error pattern is very similar to that in the most recent PR #1960. Maybe they're related?

@mr-c mr-c force-pushed the fix-container-job-hangs branch from c3f239f to 399b9db Compare January 3, 2024 15:40
@mr-c
Member

mr-c commented Jan 3, 2024

Thanks for alerting me to that, @kuanyili; I think I fixed the udocker problem.

@mr-c mr-c merged commit 8ff1fd0 into common-workflow-language:main Jan 3, 2024
45 checks passed
@kuanyili kuanyili deleted the fix-container-job-hangs branch January 4, 2024 03:19
Development

Successfully merging this pull request may close these issues.

cwltool hangs running fast jobs in Podman
2 participants