WIP: unify Learner1D, Learner2D and LearnerND #220
Conversation
Force-pushed from cd579ae to 1e633cd.
The following are not yet supported:
Also, compared to the existing implementation, global rescaling is missing (e.g. computing the y-scale of the values and normalizing the data by it).
Should this perhaps be something that the loss function does itself? For example, the isosurface loss needs the unmodified data. I could imagine extending the loss function signature to include the x and y scales.
Indeed, but then the learner needs two extra hooks: one at each step to update the global metrics, and another to trigger loss recomputation for all subdomains once the global metrics change enough that the old losses become obsolete.
I added this now.
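To make the idea concrete, a loss signature extended with global scales might look as follows. This is only a sketch: the parameter names `scale_x`/`scale_y` and the rescaling itself are illustrative, not the signature this PR actually adds.

```python
import numpy as np

def default_loss(xs, ys, scale_x=1.0, scale_y=1.0):
    """Loss of one 1D subdomain: the length of the graph of the data,
    measured after normalizing both axes by globally tracked scales.

    The learner would update the scales on every `tell` and trigger a
    recomputation of all losses once the scales have drifted enough
    that previously computed losses are obsolete.
    """
    xs = np.asarray(xs, dtype=float) / scale_x
    ys = np.asarray(ys, dtype=float) / scale_y
    return np.hypot(np.diff(xs), np.diff(ys)).sum()
```

A loss that needs the unmodified data, like the isosurface loss mentioned above, can then simply ignore the scale arguments.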
TODO
All other learners implement […]. Would that change anything? Now I see you set […]
Now I believe the only 3 things left to do are:
There's also the question of how to review this MR. It's a lot of code. It may also be that this implementation is inferior to the current LearnerND, and we don't want to merge it at all. For example, the old LearnerND is 1180 lines, whereas this new implementation is 1350 (including the priority queue, the domain definitions, etc.).
This case is very common when inserting 1 point and then splitting at the same point. This change gives a ~25% speedup to the new LearnerND.
vectors = np.subtract(coords[1:], coords[0])
if np.linalg.matrix_rank(vectors) < dim:
    raise ValueError(
        "Initial simplex has zero volumes "
It's not necessarily a single simplex at this point; this should read "Hull has a zero volume", right?
Yes. This error message was already wrong before I touched it. Maybe I should update it.
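For reference, the updated check could then read as follows (wording taken from the comment above; just a suggestion, not a committed change):

```python
vectors = np.subtract(coords[1:], coords[0])
if np.linalg.matrix_rank(vectors) < dim:
    raise ValueError("Hull has a zero volume")
```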
__all__ = ["Domain", "Interval", "ConvexHull"]


class Domain(metaclass=abc.ABCMeta):
I think it would be useful to add a class docstring with a description of the important properties of this object.
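For example, something along these lines; this is a sketch only, and the method names are illustrative rather than the ones this PR actually defines:

```python
import abc

class Domain(metaclass=abc.ABCMeta):
    """A function's domain, tiled by a set of subdomains.

    Important properties:
    - the subdomains cover the whole domain and overlap only at
      their boundaries;
    - inserting a point splits the subdomain(s) containing it into
      smaller subdomains;
    - removing a point undoes the corresponding split.
    """

    @abc.abstractmethod
    def subdomains(self):
        "Return the current subdomains."

    @abc.abstractmethod
    def split_at(self, x):
        "Split at the point 'x'; return (removed, added) subdomains."
```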
# in 'sub_intervals' in a SortedList.
self.bounds = (a, b)
self.sub_intervals = dict()
self.points = SortedList([a, b])
It would in principle be sufficient to store only the boundaries of the interval and always query for pairs of neighboring points. On the other hand, the current implementation is quite OK.
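A sketch of that boundary-only alternative (a hypothetical helper, not code from this PR): with the points kept in a SortedList, every subinterval is just a pair of neighboring points:

```python
from sortedcontainers import SortedList

def subintervals(points):
    """Yield each subinterval as a pair of neighboring points."""
    yield from zip(points, points.islice(1))

points = SortedList([0.0, 0.25, 0.5, 1.0])
assert list(subintervals(points)) == [(0.0, 0.25), (0.25, 0.5), (0.5, 1.0)]
```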
Before this fix, when telling more than one point, it was possible that we would attempt to remove subdomains from the queue that had only been produced "temporarily" while adding points.
These are not catching an interesting class of bugs for now. We should reintroduce them when we come up with a firmer learner API.
This does not test very degenerate domains, but for the moment we just want to check that everything works for domains defined over hulls with different numbers of points. Adaptive will typically not be used on very degenerate domains.
We now use numpy.random for generating random points, which produces far less degenerate examples than Hypothesis' "floats" strategy. At this point we don't want to test super awkward cases.
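Concretely, the pattern is to let Hypothesis draw only a seed and have numpy generate the coordinates. A sketch of such a strategy (assuming Hypothesis and numpy; not necessarily the exact strategy used in the test suite):

```python
import numpy as np
from hypothesis import strategies as st

@st.composite
def point_clouds(draw, n=10, dim=2):
    """Draw only a seed with Hypothesis; generate well-behaved points
    with numpy instead of drawing each coordinate from st.floats()."""
    seed = draw(st.integers(min_value=0, max_value=2**32 - 1))
    rng = np.random.RandomState(seed)
    return rng.uniform(-1, 1, size=(n, dim))
```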
Re: loss function format. In ND we can pass the following data:
In 1D we probably should adopt the naturally ordered format.
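For 1D, the naturally ordered format could look like this (a sketch; the exact signature may well differ in the final version):

```python
def default_loss(xs, values):
    """Loss of one 1D subdomain.

    `xs` holds the subdomain's points in ascending order and `values`
    the corresponding function values, so neighboring points can be
    indexed directly without sorting.
    """
    dx = xs[-1] - xs[0]
    dy = values[-1] - values[0]
    # Euclidean distance spanned by the subdomain in the (x, y) plane.
    return (dx**2 + dy**2) ** 0.5
```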
While writing the adaptive paper, we managed to write down a simple algorithm formulated in terms of abstract domains and subdomains.
Implementing such an algorithm in a learner should allow us to unify Learner1D, Learner2D and LearnerND.
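In pseudocode the algorithm is roughly the following. This is a sketch under assumed names (`subdomains`, `split_at`, `choose_point`, and a `loss(subdomain, data)` signature), not the PR's literal code:

```python
import heapq
import itertools

def run(f, domain, loss, n_points):
    """Greedy refinement: evaluate `f` inside the subdomain with the
    largest loss, then split that subdomain at the evaluated point."""
    data = {}
    tiebreak = itertools.count()  # subdomains themselves aren't orderable
    # heapq is a min-heap, so store negated losses to pop the worst first.
    queue = [(-loss(sub, data), next(tiebreak), sub)
             for sub in domain.subdomains()]
    heapq.heapify(queue)
    for _ in range(n_points):
        _, _, worst = heapq.heappop(queue)
        x = worst.choose_point()  # e.g. the midpoint of the longest edge
        data[x] = f(x)
        # `x` is interior to `worst`, so only `worst` (already popped)
        # is replaced by newly created subdomains.
        _, new = domain.split_at(x)
        for sub in new:
            heapq.heappush(queue, (-loss(sub, data), next(tiebreak), sub))
    return data
```

In practice the queue must also support updating and removing entries, for example when global rescaling invalidates previously computed losses, which is why this implementation carries its own priority queue rather than using `heapq` directly.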