Add first draft of Bernstein polynomial flow #32
Conversation
Hello @oduerr, thank you for your PR and for contributing! I took a (very) quick look and could not find where the bug comes from, especially since you were able to train successfully. This part of the test ensures that all parameters are used and registered within the computation graph. I'll have to dig deeper, but I'll be flying to NeurIPS tomorrow, so I might not be able to before the end of the conference. Thanks again and sorry for the wait.
Hello @francois-rozet, thank you very much for your fast reply (and sorry for my late answer). Concerning training, I made a small notebook https://github.com/oduerr/zuko/blob/bernstein/oliver_tester/bern_tester.ipynb demonstrating successful training of an unconditional 1-D distribution. A side remark concerning training: if I understand the framework correctly, it is built so that during training the inverse is used (the flow goes from "latent to data"). Is it possible to specify the flow in a way that no inverse is used during training (the flow goes from "data to latent")? It would be great if one could choose the direction and trade off training vs. sampling. No pressure with your reply; enjoy NeurIPS first! Best,
Hi @francois-rozet, Thank you for your prompt response. Best regards,
Hello @oduerr and @MArpogaus, I will take a look at this now.
So it is actually the opposite. In Zuko, the forward is assumed to be "data to latent". This is because the inverse of some transformations is not differentiable.
It is already the case actually, and it is indeed related to that. For example, in the training-from-energy tutorial, I use it.
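To make the direction convention concrete: maximum-likelihood training only evaluates the forward (data-to-latent) transform and its log-determinant, while sampling is the only operation that needs the inverse. Here is a minimal, self-contained sketch of that split, using a hypothetical one-parameter affine flow rather than Zuko's actual API:

```python
import torch
import torch.nn as nn

class AffineFlow(nn.Module):
    """Toy flow f(x) = (x - mu) / sigma; forward goes from data to latent."""

    def __init__(self):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(1))
        self.log_sigma = nn.Parameter(torch.zeros(1))

    def log_prob(self, x):
        # Only the forward (data -> latent) transform is needed for training:
        # log p(x) = log N(f(x); 0, 1) + log |df/dx|
        z = (x - self.mu) * torch.exp(-self.log_sigma)
        base = -0.5 * (z**2 + torch.log(torch.tensor(2 * torch.pi)))
        return base - self.log_sigma  # second term: log-abs-det of the Jacobian

    def sample(self, n):
        # Sampling is the operation that requires the inverse (latent -> data).
        z = torch.randn(n, 1)
        return z * torch.exp(self.log_sigma) + self.mu

flow = AffineFlow()
optimizer = torch.optim.Adam(flow.parameters(), lr=1e-2)
data = torch.randn(256, 1) * 2.0 + 3.0  # toy data drawn from N(3, 2)

for _ in range(200):
    loss = -flow.log_prob(data).mean()  # maximum likelihood, forward pass only
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The inverse never appears in the training loop; it is only exercised when calling `sample`, which is exactly the trade-off discussed above.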
@oduerr I have an issue. I cannot pull your PR on my computer because the branch name contains an emoji 😆 Would it be possible to rename your branch (without the emoji) and re-submit a PR if necessary?
```python
    bound = 10,
    **kwargs,
):
    super().__init__(self.f, phi=theta_un, bound=bound, **kwargs)
```
Here is the bug 🐛: `phi` should be a list of parameters; in this case it should be `(theta_un,)`. Otherwise I think everything works, good job! I will refactor the code a bit to make it compliant with Zuko's conventions.
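To see why passing the bare tensor breaks, note that iterating over a 2-D tensor yields its rows, so any downstream code that treats `phi` as an iterable of parameters would see row views instead of the single registered coefficient tensor. A small illustration (the shape `(4, 8)` is hypothetical):

```python
import torch

theta_un = torch.randn(4, 8, requires_grad=True)  # hypothetical coefficient tensor

# Passing the tensor directly: iterating over `phi` yields its 4 rows,
# which are views and not the registered parameter itself.
as_bare = list(theta_un)
print(len(as_bare))  # 4

# Wrapping it in a one-element tuple, as in phi=(theta_un,):
# iterating yields the single parameter tensor, as intended.
as_tuple = list((theta_un,))
print(len(as_tuple))  # 1
assert as_tuple[0] is theta_un
```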
Hello again @oduerr and @MArpogaus. As I was unable to pull this PR, I copied the code and created a branch (#33) on the Zuko repo. In addition to fixing the bug with `phi`, I have made some changes. Otherwise, it's the same. Great job! Tell me if you see a mistake in #33 or want something to be changed.
Hello Francois,

thank you very much for taking care of this! I guess I was overdoing it with the emojis, sorry ;-)

I went over the code and had a look at your changes. They all look fine. Concerning point 3, the sigmoid is a good idea; for the very same reason, we had this in our original paper. But we sometimes found it beneficial to linearly map $[-B, B]$ to $[0, 1]$ (especially when the bounds are known). Would it be possible to have the sigmoid as a safe default setting but also have the option to linearly map $[-B, B]$ to $[0, 1]$?

Along these lines, would it be possible to linearly extrapolate bounded transformations, using the values and derivatives at the bounds? This would also be an option for other bounded transformations.

I am impressed with the speed of zuko; none of our own implementations of these flows were as fast. Thanks!
…On Fri, 12 Jan 2024 at 23:44, François Rozet wrote:

> Hello again @oduerr and @MArpogaus, since I was unable to pull this PR, I copied the code and created a branch (#33) on the Zuko repo. In addition to fixing the bug with `phi`, I have made some changes:
>
> 1. As a polynomial flow, I moved the flow to the `zuko.flows.polynomial` module. I also renamed it `BPF` (Bernstein polynomial flow) to follow Zuko's conventions.
> 2. I used `torch.arange(1, degree + 1)` instead of `torch.tensor(range(1, degree))` to create $\alpha$ and $\beta$.
> 3. I used the sigmoid function $\sigma(x) = \frac{1}{1 + \exp(-x)}$ to map the inputs to $[0, 1]$ instead of $\frac{x + B}{2B}$. This ensures that the domain is $\mathbb{R}$. Note that the transform is only invertible on the domain $[-B, B]$ because these are the initial bounds of the bisection method.
>
> Otherwise, it's the same. Great job! Tell me if you see a mistake in #33.
Yes, it is possible to add the option to switch to a linear map. However, because the output of the transformation is unbounded, stacking several Bernstein transformations would likely fail during training. I like your proposition of linearly extrapolating outside of the bounds better. This is what spline transformations do, actually. However, for splines it is trivial to compute the values and derivatives at the bounds; for your case, I don't see how to do it cheaply/easily.

I propose to add the option to switch to a linear map instead of a sigmoid (+ assert that inputs are in bounds) in #33 and merge the PR. Then you can submit a new PR (or an issue if you don't know how to tackle this either) that adds the linear extrapolation feature later on. WDYT?
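The trade-off under discussion can be sketched in a few lines (with a hypothetical bound `B = 10`, matching `bound=10` in the PR snippet): the linear map must assert that inputs are in bounds, while the sigmoid accepts all of $\mathbb{R}$ but saturates far from the bounds.

```python
import math

B = 10.0  # hypothetical bound, matching bound=10 in the PR

def to_unit_linear(x, B=B):
    """Linearly map [-B, B] to [0, 1]; only valid for in-bound inputs."""
    assert -B <= x <= B, "linear map requires inputs within the bounds"
    return (x + B) / (2 * B)

def to_unit_sigmoid(x):
    """Map all of R to (0, 1); saturates far outside [-B, B]."""
    return 1 / (1 + math.exp(-x))

# Both maps send 0 to 0.5, but only the sigmoid accepts unbounded inputs.
print(to_unit_linear(0.0))   # 0.5
print(to_unit_sigmoid(0.0))  # 0.5
```

With the linear map, the output spans the full $[0, 1]$ exactly at the bounds, which is why it can be preferable when the bounds are known; the sigmoid never quite reaches 0 or 1.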
Thanks!
Extrapolating the Bernstein polynomial is trivial when exploiting its properties. @oduerr: maybe we can have a call at the beginning of next week to discuss the matter?
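The property presumably alluded to here is the closed-form endpoint behaviour of Bernstein polynomials: with coefficients $\theta_0, \dots, \theta_M$, one has $B(0) = \theta_0$, $B(1) = \theta_M$, $B'(0) = M(\theta_1 - \theta_0)$, and $B'(1) = M(\theta_M - \theta_{M-1})$, so a $C^1$ linear extrapolation costs almost nothing. A plain-Python sketch (not the PR's implementation):

```python
import math

def bernstein(theta, t):
    """Evaluate a Bernstein polynomial with coefficients theta at t in [0, 1]."""
    M = len(theta) - 1
    return sum(
        c * math.comb(M, k) * t**k * (1 - t) ** (M - k)
        for k, c in enumerate(theta)
    )

def bernstein_extrapolated(theta, t):
    """Linearly extrapolate outside [0, 1] using the closed-form endpoint
    values and derivatives:
      B(0) = theta[0],  B'(0) = M * (theta[1] - theta[0])
      B(1) = theta[-1], B'(1) = M * (theta[-1] - theta[-2])
    """
    M = len(theta) - 1
    if t < 0:
        return theta[0] + M * (theta[1] - theta[0]) * t
    if t > 1:
        return theta[-1] + M * (theta[-1] - theta[-2]) * (t - 1)
    return bernstein(theta, t)
```

Because the extrapolation matches both the value and the derivative at each endpoint, the extended transform stays continuously differentiable, which is the same trick spline transformations use at their tails.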
Sounds good to me. Thanks :) One more general question: is there code available where you evaluate typical flows on established benchmark data, like the datasets provided by @gpapamak here? If not, here is a self-contained script to download, verify, and prepare the data provided above. I will use this as a starting point to do some comparisons.

# removed for readability
Can I merge the PR #33 then?
You can check https://github.com/francois-rozet/uci-datasets to download the data, and the benchmark branch for benchmarks on toy problems. It's a work in progress, though.
Sorry for the late reply. Yes, you can merge PR #33. We will work on the extrapolation on Monday, but this could become another pull request. Thanks!
Hello @oduerr and @MArpogaus, I have merged #33 🥳 and added you as authors. I will soon release a new version. I think we can now close the present PR, and you are welcome to open a new issue/PR for the additional features we discussed.
Hello,
I am a co-author of the Bernstein flows paper (https://arxiv.org/abs/2004.00464), in which we constructed a one-dimensional flow based on Bernstein polynomials. We already have a TensorFlow implementation.
This flow would fit nicely into the zuko framework as an alternative to NSF. I am not an expert in software development and PyTorch, but I managed to write a transformation and also a flow. The `test_transform` test passes and I have used the flow successfully for training, but I still get an exception in the unit test (`test_flows`) at the following position.
I tried hard but could not find the reason why this is failing. Possibly I lack knowledge of the bells and whistles of your package. Could you please consider the request?
Best,
Oliver and Marcel (@MArpogaus)
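The failing `test_flows` check described above verifies that every parameter is used and registered within the computation graph. A generic sketch of how such a check can work (not Zuko's actual test suite; `DanglingParams` is a made-up example module):

```python
import torch
import torch.nn as nn

def unused_parameters(module, x):
    """Return names of parameters that received no gradient after a backward
    pass, i.e. parameters not registered in the computation graph."""
    module.zero_grad()
    module(x).sum().backward()
    return [name for name, p in module.named_parameters() if p.grad is None]

class DanglingParams(nn.Module):
    """Example module with one parameter the forward pass never touches."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(3, 2)
        self.extra = nn.Parameter(torch.zeros(1))  # never used in forward

    def forward(self, x):
        return self.linear(x)

print(unused_parameters(nn.Linear(3, 2), torch.randn(5, 3)))   # []
print(unused_parameters(DanglingParams(), torch.randn(5, 3)))  # ['extra']
```

A parameter such as `theta_un` passed outside the expected `phi` container would show up exactly like `extra` here: registered on the module but absent from the gradients.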