Error while executing distributed version of ngen #456
Comments
Can you add the partition file you generated? It looks like the issue is in parsing that file. |
Here are the contents of the file partition_config.json: |
I'm not sure of the root cause of the poorly formatted partition file, but the issue causing the parsing error is here:
There is a You can try removing that |
Ok, looking a little more at the input data used to build the partition file, there are only 3 catchments and 3 nexuses in the input data, but the partition generator tried making 4 partitions. So I think the serialization loop didn't terminate the strings correctly, as it adds a You should be able to generate a valid partitioning of the example data by using fewer than 4 partitions, but we should also fix this issue in the partitionGenerator and probably add a warning/message when trying to partition too few catchments. @stcui007 can you maybe look into catching this condition, where the number of possible partitions is less than the number of requested partitions, and make sure this serialization loop doesn't write invalid JSON? |
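For illustration, the guard and serialization fix suggested above could look roughly like the sketch below. This is not the partitionGenerator code: the function names and the `"partitions"` / `"id"` / `"cat-ids"` keys are assumptions made for the example. The idea is to refuse a request for more partitions than there are catchments, and to build the output with a JSON library instead of concatenating strings by hand, so a stray separator cannot end up in the file.

```python
import json


def split_catchments(catchment_ids, num_partitions):
    """Split catchment ids into num_partitions contiguous, non-empty groups.

    Hypothetical helper for illustration only.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if num_partitions > len(catchment_ids):
        # The condition discussed above: more partitions requested than catchments.
        raise ValueError(
            f"cannot make {num_partitions} partitions from "
            f"{len(catchment_ids)} catchments"
        )
    # Spread any remainder over the first `extra` groups.
    base, extra = divmod(len(catchment_ids), num_partitions)
    groups, start = [], 0
    for i in range(num_partitions):
        size = base + (1 if i < extra else 0)
        groups.append(catchment_ids[start:start + size])
        start += size
    return groups


def partitions_to_json(groups):
    """Serialize groups with the json module (key names are assumed)."""
    doc = {"partitions": [{"id": i, "cat-ids": g} for i, g in enumerate(groups)]}
    return json.dumps(doc, indent=2)
```

With the 3-catchment example data, `split_catchments(ids, 2)` yields two non-empty groups, while `split_catchments(ids, 4)` raises a `ValueError` instead of producing a malformed file.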
Noting that this partition file also falls afoul of the empty |
Yes, if using current HEAD that bug will show up. I should have a fix for that up shortly, though. |
I had been wondering about using 4 CPUs in the script for 3 catchments before
coming to Nels's comment. That is probably the cause of the problem.
|
See also #475 |
@pvbangalore Note that the fix in #474 basically prevents the creation of a partition file with more partitions than there are catchments in the catchment dataset, which it appears you did (3 catchments, 4 partitions). So you would continue to get this error with the same partition file, but that partitioning is not supported, basically. With only 3 catchments in the dataset, try creating only 2 partitions (and running only 2 processes), or try a larger dataset to run with > 3 partitions. If those configurations work, we can close this issue. |
Believe this is fully fixed in #475... I just ran a config with zero remote nexuses in a partition and it succeeded. Will go ahead and close this but it can be reopened if the issue persists on the latest version. |
Short description explaining the high-level reason for the new issue.
Current behavior
Program terminates with an exception in 'boost::wrapexcept<boost::property_tree::json_parser::json_parser_error>' due to unexpected value in partition_config.json (output.log attached has complete output)
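A quick way to confirm that a failure like this comes from the partition file itself is to parse the file with an independent JSON parser before launching the MPI run. The sketch below is only a rough pre-flight check: `check_partition_file` is a hypothetical helper, and Python's `json` module is not guaranteed to accept or reject exactly the same inputs as the Boost property_tree parser.

```python
import json


def check_partition_file(path):
    """Return None if the file parses as JSON, else a short error description.

    Hypothetical pre-flight helper; it only checks syntax, not the schema.
    """
    try:
        with open(path, encoding="utf-8") as fh:
            json.load(fh)
    except json.JSONDecodeError as err:
        # lineno and colno point at the position where parsing failed.
        return f"{path}: line {err.lineno}, column {err.colno}: {err.msg}"
    return None
```

Calling `check_partition_file("partition_config.json")` before starting the run would report where the file first stops being valid JSON, which can be easier to read than the exception raised at startup.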
Expected behavior
Program should execute without any runtime errors.
Steps to replicate behavior (include URLs)
Attached file (build_notes.txt) shows all the steps to build and execute ngen with MPI
Screenshots
build_notes.txt
output.log