is there a limit to the tree length? Some commands not executing despite similar ones are #38
Comments
Stefano,
I'm going through your logs now...
Ben
…On Sun, Oct 4, 2020 at 11:56 PM Stefano Mangiola ***@***.***> wrote:
As you can see, I have a few holes in my benchmark
[image: image]
<https://user-images.githubusercontent.com/7232890/95038823-d6401480-071a-11eb-8a41-694da25d81e7.png>
The workflow hangs and does not submit any more jobs, and if I interrupt
and start again it hangs on starting workflow
|
Stefano, which command line are you using to run the workflows? When you say you are changing parameters, are you also changing cores, memory, etc., or only parameters of your tasks? |
Each block of tests is run with different resources, depending on which algorithm is tested. Here is the command:
|
Could you send me the log.out file from: |
parsing dev/test_simulation_makeflow_pipeline/makefile_test_simulation.makeflow... and hangs forever |
I forgot to add the -dall debug flag, sorry about that: makeflow -dall -T slurm -j 100 --do-not-save-failed-output test_simulation_makeflow_pipeline/makefile_test_simulation.makeflow > log.out 2>&1 |
Stefano, could you also send me |
I don't have a batchlog. I have rerun the whole workflow. I think one of the issues (not consistently reproducible) is that I increased the number of combinations in the makefile after the workflow had completed, and some of the new benchmarks do not execute. It is common to execute the whole workflow and then try some more parameter combinations. |
Stefano, something that just occurred to me. Are you re-running the makeflow in place without a cleaning operation in between? It could be that makeflow is getting confused by a mismatch between the previous execution log and a newly modified makeflow. |
That is probably the case. But does cleaning delete the dependencies that have already completed? Of course, if I delete the log, everything gets deleted when makeflow is called again. |
Yes, they will be deleted. A safer mode of operation in this case is to not modify the original file, but instead write the updates to differently named makeflow files. Then you can execute each update in sequence. |
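A minimal sketch of the "differently named makeflow files" approach suggested above, assuming the workflow is generated from a parameter grid. The script name `run_benchmark.sh`, the `results/` paths, and the algorithm names are hypothetical placeholders, not taken from the actual workflow. The idea is to keep the original file untouched and write only the combinations that were not part of the first run into a separate update file:

```python
from itertools import product


def rule_for(algorithm, n_samples):
    """Render one Make-style rule: 'target: sources' plus a tab-indented command."""
    target = f"results/{algorithm}_{n_samples}.out"
    # run_benchmark.sh is a hypothetical script standing in for the real command.
    return (
        f"{target}: run_benchmark.sh\n"
        f"\t./run_benchmark.sh {algorithm} {n_samples} {target}\n"
    )


def new_combinations(full_grid, already_run):
    """Return combinations in full_grid that are absent from already_run, in order."""
    done = set(already_run)
    return [combo for combo in full_grid if combo not in done]


def render_update(full_grid, already_run):
    """Build the text of an update file containing rules for new combinations only."""
    return "\n".join(rule_for(*combo) for combo in new_combinations(full_grid, already_run))


# Grid used for the first, already completed run.
first_grid = list(product(["algoA", "algoB"], [10, 100]))

# Expanded grid: a new size in the middle and a new algorithm at the end.
expanded_grid = list(product(["algoA", "algoB", "algoC"], [10, 50, 100]))

# Text for a separately named file (for example, a hypothetical update_1.makeflow).
update_text = render_update(expanded_grid, first_grid)
```

Writing `update_text` to its own file and running it after the original workflow means each file keeps its own execution log, so new rules "in the middle" of the grid are not compared against a log produced by an older version of the same file.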
I understand, but this is not always possible in a combinatorial scenario.
I can extend the parameter space arbitrarily here with no effort. It would be great if makeflow could update the log file with the new dependencies and just add them to the tree. Otherwise makeflow would be suitable only for static workflows. |
I think that just appending new rules may be workable, with the understanding that removing a rule or changing a previously executed rule will result in failure. Would that be helpful for your use case? |
Yes. Usually when benchmarking we want to increase the number of combinations. We don't need to delete rules, as we can ignore dependencies that have already been executed, and we would eliminate rules in another run if needed. The issue is that if I now add rules to an existing makefile (with a log), the only ones that execute are the new ones at the bottom. The new ones in the middle are ignored. This mixed behaviour seems more unintended than designed. |
Stefano, thanks for your input! Let me discuss it with the team. |
I have a makeflow file with ~17K commands. Some of them, at the root of the tree, are not executed for some reason, while other combinations of parameters are. I don't understand why.