Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: Auto rerun-from #1720

Merged
merged 4 commits into from
Nov 13, 2023
Merged

Feat: Auto rerun-from #1720

merged 4 commits into from
Nov 13, 2023

Conversation

pierotofy
Copy link
Member

Implements auto-rerun logic. 💥

E.g.

  • Run with ./run.sh /datasets/brighton [process completes]
  • Re-run with ./run.sh /datasets/brighton --orthophoto-resolution 2 (change resolution)
  • ODM automatically sets rerun-from: odm_orthophoto

If rerun-from, rerun-all or rerun is set, the flag is respected instead (does not automatically attempt to choose the rerun stage).

Please help test this, especially with split-merge and distributed split-merge? 🙏

#1718 #1719

@pierotofy pierotofy marked this pull request as ready for review November 8, 2023 16:39
@smathermather
Copy link
Contributor

Happy to test split-merge. Other than bombing out granularity for other flags, anything I should be looking for with split?

@pierotofy
Copy link
Member Author

pierotofy commented Nov 9, 2023

Nothing too specific, test the happy path cases (local split, remote split), try changing some parameters, see if the rerun selection behaves the way you would expect it to if you were to choose rerun-from by hand.

@smathermather
Copy link
Contributor

smathermather commented Nov 9, 2023

Sounds good. Local split is running. It's been blissfully long since I've had a project that needed split.

Sanity check on my command:
docker run -ti --rm -v /smallfast_00/kakuma:/smallfast_00/kakuma odm_rerun --project-path /smallfast_00/kakuma/ Kakuma_13B_Flight_01_2 --orthophoto-resolution 30 --split 400 --sm-cluster http://192.168.whateverman:3002

Edit:

Something's wrong:

[INFO]    sm_cluster: True
[INFO]    sm_no_align: False
....
[INFO]    split: 400
[INFO]    split_image_groups: None
[INFO]    split_overlap: 150

This seems fine, but nothing is happening on the ClusterODM node (task list returns nothing) and matching started on the machine this is run from as though it were a sequential split-merge. Am I missing something?

@jprates
Copy link

jprates commented Nov 9, 2023

I don't know if @pierotofy already included this validation, but one should only compare the current flag settings with the previous ones for the same project, please filter by project.

Unhappy path in question if not filtering by project:
user runs project 1 runs with flags A, B, C - he's now satisfied with the results
user goes back to project 2 (he's unhappy with this one's last output) and changes flag D to re-run
autorerun will consider flags ABC used on project 1 for comparison with project 2 flag D and make a poor decision about where to restart from

EDIT: Does this feature have persistence? Is it saved on database/disk?
Assuming ODM has a project table/entity, one extra column for the last flags used on json format might do the trick, just an idea.

@smathermather
Copy link
Contributor

This seems fine, but nothing is happening on the ClusterODM node (task list returns nothing) and matching started on the machine this is run from as though it were a sequential split-merge. Am I missing something?

Still not sure what's happening here, need to rerun and monitor for uploading to cluster which doesn't appear to be happening, but now running with sm-cluster set through webodm, and that's working fine.

@pierotofy
Copy link
Member Author

pierotofy commented Nov 9, 2023

Still not sure what's happening here

Not sure either, this PR has not modified the sm_cluster logic. Can you post the task output?

@smathermather
Copy link
Contributor

Will do after tests are complete.

@smathermather
Copy link
Contributor

Retraction, the above command is working fine. Should be done with testing this morning for distributed split-merge.

Just to verify: I don't think I can test yet in WebODM, correct? There's no way to re-run without setting a stage for rerun.

@pierotofy
Copy link
Member Author

I don't think I can test yet in WebODM, correct? There's no way to re-run without setting a stage for rerun.

Correct, WebODM will require some modification.

@smathermather
Copy link
Contributor

Ah, the brittlness of split-merge. Still waiting for a successful completion. Should have chosen a smaller dataset. Hopefully something today.

@smathermather
Copy link
Contributor

Still not sure what's happening here

Not sure either, this PR has not modified the sm_cluster logic. Can you post the task output?

Circling back, this was a dataset that required a geo.txt but I forgot to include that, so split couldn't happen based on image tags.

Now running a different (smaller) dataset anyway.

@smathermather
Copy link
Contributor

smathermather commented Nov 11, 2023

First command:
docker run -ti --rm -v /smallfast_00/kakuma:/smallfast_00/kakuma odm_rerun --project-path /smallfast_00/kakuma/ Kakuma_13B_Flight_03 --split 400 --sm-cluster http://192.168.20.24:3002

Second command run in sequence (no rerun triggered, addition of orthophoto flag ignored):
docker run -ti --rm -v /smallfast_00/kakuma:/smallfast_00/kakuma odm_rerun --project-path /smallfast_00/kakuma/ Kakuma_13B_Flight_03 --split 400 --sm-cluster http://192.168.20.24:3002 --orthophoto-resolution 1

Third command run in sequence (rerun gets triggered):
docker run -ti --rm -v /smallfast_00/kakuma:/smallfast_00/kakuma odm_rerun --project-path /smallfast_00/kakuma/ Kakuma_13B_Flight_03 --split 200 --sm-cluster http://192.168.20.24:3002

@pierotofy
Copy link
Member Author

pierotofy commented Nov 12, 2023

Second command run in sequence (no rerun triggered, addition of orthophoto flag ignored):

Mm, typically one would need to rerun the entire pipeline --rerun-from split here. But the fact that the rerun wasn't triggered is not bad either, it just matches the current ODM logic, which is good, nothing breaks.

This gets a bit tricky, I think most parameters in split-merge need to re-trigger the split stage, but there could be exceptions, e.g. if you run split-merge with --dsm and then without --dsm, then technically you should just rerun the merge stage, rerunning the entire pipeline would not be the expected behavior.

I'm starting to think that split-merge, as an advanced feature, might just need to continue having the manual --rerun-from input from users.

A good question is: what are the cases when changing a parameter and using split-merge should not result in the entire pipeline to be re-run from the beginning? (Most parameters will require a rerun)

@smathermather
Copy link
Contributor

As currently implemented, split-merge is an expert only tool, so I don't necessarily mind if it stays as-is, although defaulting to rerunning for most parameter changes would be preferable. If I do the split-merge re-write, I'll need help thinking through how it interacts with this pull request, though I probably won't have bandwidth for the at rewrite until I get the org through the first end of year reporting.

@pierotofy
Copy link
Member Author

pierotofy commented Nov 12, 2023

I'll change it so that if any parameter is modified and --split is set, the pipeline will rerun from split (unless explicitly overriden via --rerun-*). Thanks for helping with testing.

@smathermather
Copy link
Contributor

Sounds good.

@pierotofy pierotofy merged commit 5557038 into OpenDroneMap:master Nov 13, 2023
2 checks passed
@lupionpe
Copy link

I was on the forum and I saw this post that mentioned this PR.
Is this implemented in NodeODM?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants