Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Solves several issues with Nuth&Kaab coregistration #135

Merged
merged 9 commits into from
Jun 3, 2021

Conversation

adehecq
Copy link
Member

@adehecq adehecq commented May 28, 2021

No description provided.

@adehecq adehecq marked this pull request as draft May 28, 2021 15:01
@adehecq adehecq linked an issue May 28, 2021 that may be closed by this pull request
@adehecq
Copy link
Member Author

adehecq commented May 28, 2021

@erikmannerfelt Should I just get rid of the progress bar? It looks ugly when the iterations are being stopped, which is the case most of the time I would assume now. And it would make handling the verbose easier. Right now I allow two levels of verbose...
image

xdem/coreg.py Outdated Show resolved Hide resolved
@adehecq
Copy link
Member Author

adehecq commented May 28, 2021

Left to do:

  • add option to plot figures of intermediate results
  • check why coreg does not work on some own DEMs, while it works with old geoutils...

@rhugonnet
Copy link
Member

Should we think of a reproducible format of plotting function we can use for any coreg/bias correction (start/end at least, and during iteration for some)?
For saving plots, I'd recommend putting everything in a beautiful PDF (like this https://github.com/iamdonovan/pybob/blob/master/pybob/coreg_tools.py#L444)

@erikmannerfelt
Copy link
Contributor

@adehecq sure, the progress bar can go! I still recommend tqdm without specifying the total, as it nicely updates on the iteration count, number of iterations per second, the total elapsed time, and any other optional text (such as the current NMAD).

@adehecq
Copy link
Member Author

adehecq commented May 31, 2021

@adehecq sure, the progress bar can go! I still recommend tqdm without specifying the total, as it nicely updates on the iteration count, number of iterations per second, the total elapsed time, and any other optional text (such as the current NMAD).

What do you mean exactly?
Currently the loop is as follow:
for i in trange(self.max_iterations, disable=not progressbar):
If the progress bar is enabled and I print any statement within the loop, then the progress bar and prints intertwine and it looks ugly. But if there is a better way to do it, let me know!

@erikmannerfelt
Copy link
Contributor

@adehecq sure, the progress bar can go! I still recommend tqdm without specifying the total, as it nicely updates on the iteration count, number of iterations per second, the total elapsed time, and any other optional text (such as the current NMAD).

What do you mean exactly?
Currently the loop is as follow:
for i in trange(self.max_iterations, disable=not progressbar):
If the progress bar is enabled and I print any statement within the loop, then the progress bar and prints intertwine and it looks ugly. But if there is a better way to do it, let me know!

Use the tqdm class:

>>> from tqdm import tqdm
>>> import time
>>> progress_bar = tqdm()
>>> for i in range(1000):
...     time.sleep(1)
...     progress_bar.update()
...     progress_bar.desc = f"Running iteration {i}"
... progress_bar.close()
Running iteration 0: 2it [00:35, 14.99s/it]

It could also be used in a with context:

with tqdm() as progress_bar:
    for i in range(1000):
        time.sleep(1)
        progress_bar.update()
        progress_bar.desc = f"Running iteration {i}"

@erikmannerfelt
Copy link
Contributor

It's great that you're finding these issues! Can you find a way to improve the tests, so that we know for certain that it works?

…mp, as it is currently too slow on large arrays
xdem/coreg.py Outdated
@@ -892,13 +892,15 @@ class Deramp(Coreg):
Estimates an n-D polynomial between the difference of two DEMs.
"""

def __init__(self, degree: int = 1):
def __init__(self, degree: int = 1, max_points: int = 5e5):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! It kind of reinvents the subsample argument for self.fit(). On the other hand, the subsample argument cannot currently subsample differently for different parts in the pipeline.

I'll add an issue for that, and we can have it like you implemented right now (but it would be nice to have automatically for all Coreg subclasses!)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea was that this argument would be used by the subsampling functionality directly. I could just rename "max_point" into "subsample" if that makes it more consistent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the Coreg implementation as an issue in #137

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The already existing subsample script solves this I think:

random_falses = np.random.choice(np.argwhere(full_mask.flatten()).squeeze(), int(subsample), replace=False)

It takes a random choice of row/col indices that are finite.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know this existed! I need to check that out. Anyway, this should be a separate function as we discussed.

The issue with how it is implemented now is that it masks pixels. This raises two issues:

  • it does not reduce the number of points, so we're possibly duplicating the data. Ideally, we should pass a flattened 1D array to the underlying _fit_func, but then they all need to accept 1d arrays. This will not work for Nuth&Kaab yet as we need a raster to calculate the slope/aspect.
  • masking random pixels will have a terrible effect on the Nuth & Kaab as the slope/aspect need to be calculated, which involve using neighbor pixels that might possibly be masked...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also explains my question to you regarding the inliers_mask @erikmannerfelt. Currently, by default subsample = 1, which means inlier_mask is never used!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an issue with your subsampling method... rows and cols are reversed!
This is why we should have a separate function for this and test it.
Also, for now it is only implemented for a 2D case. I would like it to be more generic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found an easy way to code this subsampling function. I will add it as an issue and work on it in a different PR.

@adehecq
Copy link
Member Author

adehecq commented Jun 2, 2021

Use the tqdm class:

>>> from tqdm import tqdm
>>> import time
>>> progress_bar = tqdm()
>>> for i in range(1000):
...     time.sleep(1)
...     progress_bar.update()
...     progress_bar.desc = f"Running iteration {i}"
... progress_bar.close()
Running iteration 0: 2it [00:35, 14.99s/it]

It could also be used in a with context:

with tqdm() as progress_bar:
    for i in range(1000):
        time.sleep(1)
        progress_bar.update()
        progress_bar.desc = f"Running iteration {i}"

It was even simpler than this. I found out that tqdm objects have a write method to print statements on screen. See dc93dfc.
But I'm not really convinced by the usefulness of the progressbar now that each iteration is quick and the algorithm usually converges in 2-3 iterations...

@adehecq
Copy link
Member Author

adehecq commented Jun 2, 2021

Left to do:

  • improve tests, in particular to include more NaNs, and test several subfunctions like get_horizontal_shift
  • add option for plotting some outputs
  • add a raster subsampling function
    This PR took a lot more time than I though. Maybe for the sake of fixing the major bugs, should we merge the PR as it is, @erikmannerfelt and @rhugonnet ? Then I can add the above point to the issue list.

@adehecq adehecq marked this pull request as ready for review June 2, 2021 17:05
xdem/coreg.py Outdated Show resolved Hide resolved
xdem/coreg.py Outdated Show resolved Hide resolved
@erikmannerfelt
Copy link
Contributor

Left to do:

* improve tests, in particular to include more NaNs, and test several subfunctions like get_horizontal_shift

* add option for plotting some outputs

* add a raster subsampling function
  This PR took a lot more time than I though. Maybe for the sake of fixing the major bugs, should we merge the PR as it is, @erikmannerfelt and @rhugonnet ? Then I can add the above point to the issue list.

I agree that your points can be issues for later instead! I just have some questions and then I am happy

Copy link
Contributor

@erikmannerfelt erikmannerfelt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants