Add first draft of Bernstein polynomial flow #32
Conversation
Hello @oduerr, thank you for your PR and for contributing! I took a (very) quick look and could not find where the bug comes from, especially since you were able to train successfully. This part of the test ensures that all parameters are used and registered within the computation graph. I'll have to dig deeper, but I'll be flying to NeurIPS tomorrow, so I might not be able to before the end of the conference. Thanks again and sorry for the wait.
Hello @francois-rozet, thank you very much for your fast reply (and sorry for my late answer). Concerning training, I made a small notebook https://github.com/oduerr/zuko/blob/bernstein/oliver_tester/bern_tester.ipynb demonstrating successful training of an unconditional 1-D distribution. A side remark concerning training: if I understand the framework correctly, it is built so that during training the inverse is used (the flow goes from "latent to data"). Is it possible to specify the flow in a way that no inverse is used during training (the flow goes from "data to latent")? It would be great if one could choose the direction and trade off training vs. sampling. No pressure with your reply; enjoy NeurIPS first! Best,
Hi @francois-rozet, Thank you for your prompt response. Best regards,
Hello @oduerr and @MArpogaus, I will take a look at this now.
So it is actually the opposite. In Zuko, the forward is assumed to be "data to latent". This is because the inverse of some transformations is not differentiable.
It is already the case actually, and it is indeed related to that. For example, in the training-from-energy tutorial, I use it.
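To make the direction convention concrete: maximum-likelihood training only evaluates the forward (data-to-latent) transform and its log-determinant, while sampling is the only operation that needs the inverse. Here is a minimal, self-contained sketch of that split, using a hypothetical one-parameter affine flow rather than Zuko's actual API:

```python
import torch
import torch.nn as nn

class AffineFlow(nn.Module):
    """Toy flow f(x) = (x - mu) / sigma; forward goes from data to latent."""

    def __init__(self):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(1))
        self.log_sigma = nn.Parameter(torch.zeros(1))

    def log_prob(self, x):
        # Only the forward (data -> latent) transform is needed for training:
        # log p(x) = log N(f(x); 0, 1) + log |df/dx|
        z = (x - self.mu) * torch.exp(-self.log_sigma)
        base = -0.5 * (z**2 + torch.log(torch.tensor(2 * torch.pi)))
        return base - self.log_sigma  # second term: log-abs-det of the Jacobian

    def sample(self, n):
        # Sampling is the operation that requires the inverse (latent -> data).
        z = torch.randn(n, 1)
        return z * torch.exp(self.log_sigma) + self.mu

flow = AffineFlow()
optimizer = torch.optim.Adam(flow.parameters(), lr=1e-2)
data = torch.randn(256, 1) * 2.0 + 3.0  # toy data drawn from N(3, 2)

for _ in range(200):
    loss = -flow.log_prob(data).mean()  # maximum likelihood, forward pass only
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The inverse never appears in the training loop; it is only exercised when calling `sample`, which is exactly the trade-off discussed above.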
@oduerr I have an issue. I cannot pull your PR on my computer because the branch name contains an emoji 😆 Would it be possible to rename your branch (without the emoji) and re-submit a PR if necessary?
```python
    bound = 10,
    **kwargs,
):
    super().__init__(self.f, phi=theta_un, bound=bound, **kwargs)
```
Here is the bug 🐛: `phi` should be a list of parameters; in this case it should be `(theta_un,)`. Otherwise I think everything works, good job! I will refactor the code a bit to make it compliant with Zuko's conventions.
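To see why passing the bare tensor breaks, note that iterating over a 2-D tensor yields its rows, so any downstream code that treats `phi` as an iterable of parameters would see row views instead of the single registered coefficient tensor. A small illustration (the shape `(4, 8)` is hypothetical):

```python
import torch

theta_un = torch.randn(4, 8, requires_grad=True)  # hypothetical coefficient tensor

# Passing the tensor directly: iterating over `phi` yields its 4 rows,
# which are views and not the registered parameter itself.
as_bare = list(theta_un)
print(len(as_bare))  # 4

# Wrapping it in a one-element tuple, as in phi=(theta_un,):
# iterating yields the single parameter tensor, as intended.
as_tuple = list((theta_un,))
print(len(as_tuple))  # 1
assert as_tuple[0] is theta_un
```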
Hello again @oduerr and @MArpogaus. As I was unable to pull this PR, I copied the code and created a branch (#33) on the Zuko repo. In addition to fixing the bug with `phi`, I have made some changes. Otherwise, it's the same. Great job! Tell me if you see a mistake in #33 or want something to be changed.
Hello Francois,

thank you very much for taking care of this! I guess I was overdoing it with the emojis, sorry ;-)

I went over the code and had a look at your changes. They all look fine. Concerning point 3, the sigmoid is a good idea; for the very same reason, we had this in our original paper. But we sometimes found it beneficial to linearly map $[-B, B]$ to $[0, 1]$ (especially when the bounds are known). Would it be possible to have the sigmoid as a safe default setting but also have the option to linearly map $[-B, B]$ to $[0, 1]$?

Along these lines, would it be possible to linearly extrapolate bounded transformations, using the values and derivatives at the bounds? This would also be an option for other bounded transformations.

I am impressed with the speed of zuko; none of our own implementations of these flows were as fast. Thanks!
…On Fri, 12 Jan 2024 at 23:44, François Rozet wrote:

> Hello again @oduerr and @MArpogaus, since I was unable to pull this PR, I copied the code and created a branch (#33) on the Zuko repo. In addition to fixing the bug with `phi`, I have made some changes:
>
> 1. As a polynomial flow, I moved the flow to the `zuko.flows.polynomial` module. I also renamed it `BPF` (Bernstein polynomial flow) to follow Zuko's conventions.
> 2. I used `torch.arange(1, degree + 1)` instead of `torch.tensor(range(1, degree))` to create $\alpha$ and $\beta$.
> 3. I used the sigmoid function $\sigma(x) = \frac{1}{1 + \exp(-x)}$ to map the inputs to $[0, 1]$ instead of $\frac{x + B}{2B}$. This ensures that the domain is $\mathbb{R}$. Note that the transform is only invertible on the domain $[-B, B]$ because these are the initial bounds of the bisection method.
>
> Otherwise, it's the same. Great job! Tell me if you see a mistake in #33.
Yes, it is possible to add the option to switch to a linear map. However, because the output of the transformation is unbounded, stacking several Bernstein transformations would likely fail during training. I like your proposition of linearly extrapolating outside of the bounds better. This is what spline transformations do, actually. However, for splines it is trivial to compute the values and derivatives at the bounds; for your case, I don't see how to do it cheaply/easily.

I propose to add the option to switch to a linear map instead of a sigmoid (+ assert that inputs are in bounds) in #33 and merge the PR. Then you can submit a new PR (or an issue if you don't know how to tackle this either) that adds the linear extrapolation feature later on. WDYT?
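The trade-off under discussion can be sketched in a few lines (with a hypothetical bound `B = 10`, matching `bound=10` in the PR snippet): the linear map must assert that inputs are in bounds, while the sigmoid accepts all of $\mathbb{R}$ but saturates far from the bounds.

```python
import math

B = 10.0  # hypothetical bound, matching bound=10 in the PR

def to_unit_linear(x, B=B):
    """Linearly map [-B, B] to [0, 1]; only valid for in-bound inputs."""
    assert -B <= x <= B, "linear map requires inputs within the bounds"
    return (x + B) / (2 * B)

def to_unit_sigmoid(x):
    """Map all of R to (0, 1); saturates far outside [-B, B]."""
    return 1 / (1 + math.exp(-x))

# Both maps send 0 to 0.5, but only the sigmoid accepts unbounded inputs.
print(to_unit_linear(0.0))   # 0.5
print(to_unit_sigmoid(0.0))  # 0.5
```

With the linear map, the output spans the full $[0, 1]$ exactly at the bounds, which is why it can be preferable when the bounds are known; the sigmoid never quite reaches 0 or 1.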
Thanks!
Extrapolating the Bernstein polynomial is trivial when exploiting its properties. @oduerr: maybe we can have a call at the beginning of next week to discuss the matter?
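The property presumably alluded to here is the closed-form endpoint behaviour of Bernstein polynomials: with coefficients $\theta_0, \dots, \theta_M$, one has $B(0) = \theta_0$, $B(1) = \theta_M$, $B'(0) = M(\theta_1 - \theta_0)$, and $B'(1) = M(\theta_M - \theta_{M-1})$, so a $C^1$ linear extrapolation costs almost nothing. A plain-Python sketch (not the PR's implementation):

```python
import math

def bernstein(theta, t):
    """Evaluate a Bernstein polynomial with coefficients theta at t in [0, 1]."""
    M = len(theta) - 1
    return sum(
        c * math.comb(M, k) * t**k * (1 - t) ** (M - k)
        for k, c in enumerate(theta)
    )

def bernstein_extrapolated(theta, t):
    """Linearly extrapolate outside [0, 1] using the closed-form endpoint
    values and derivatives:
      B(0) = theta[0],  B'(0) = M * (theta[1] - theta[0])
      B(1) = theta[-1], B'(1) = M * (theta[-1] - theta[-2])
    """
    M = len(theta) - 1
    if t < 0:
        return theta[0] + M * (theta[1] - theta[0]) * t
    if t > 1:
        return theta[-1] + M * (theta[-1] - theta[-2]) * (t - 1)
    return bernstein(theta, t)
```

Because the extrapolation matches both the value and the derivative at each endpoint, the extended transform stays continuously differentiable, which is the same trick spline transformations use at their tails.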
Sounds good to me. Thanks :) One more general question: is there code available where you evaluate typical flows on established benchmark data, like the datasets provided by @gpapamak here? If not, here is a self-contained script to download, verify, and prepare the data provided above. I will use this as a starting point to do some comparisons.

# removed for readability
Can I merge the PR #33 then?
You can check https://github.com/francois-rozet/uci-datasets to download the data, and the benchmark branch for benchmarks on toy problems. It's a work in progress, though.
Sorry for the late reply. Yes, you can merge PR #33. We will work on the extrapolation on Monday, but this could become another pull request. Thanks!
Hello @oduerr and @MArpogaus, I have merged #33 🥳 and added you as authors. I will soon release a new version. I think we can now close the present PR, and you are welcome to open a new issue/PR for the additional features we discussed.
Hello,
I am a co-author of the Bernstein flows paper (https://arxiv.org/abs/2004.00464), in which we constructed a one-dimensional flow based on Bernstein polynomials. We already have a TensorFlow implementation.
This flow would fit nicely into the zuko framework as an alternative to NSF. I am not an expert in software development and PyTorch, but I managed to write a transformation and also a flow. The `test_transform` test passes and I have used the flow successfully for training, but I still get an exception in the unit test (`test_flows`) at the following position.
I tried hard but could not find the reason why this is failing. Possibly I lack knowledge of the bells and whistles of your package. Could you please consider the request?
Best,
Oliver and Marcel (@MArpogaus)
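The failing `test_flows` check described above verifies that every parameter is used and registered within the computation graph. A generic sketch of how such a check can work (not Zuko's actual test suite; `DanglingParams` is a made-up example module):

```python
import torch
import torch.nn as nn

def unused_parameters(module, x):
    """Return names of parameters that received no gradient after a backward
    pass, i.e. parameters not registered in the computation graph."""
    module.zero_grad()
    module(x).sum().backward()
    return [name for name, p in module.named_parameters() if p.grad is None]

class DanglingParams(nn.Module):
    """Example module with one parameter the forward pass never touches."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(3, 2)
        self.extra = nn.Parameter(torch.zeros(1))  # never used in forward

    def forward(self, x):
        return self.linear(x)

print(unused_parameters(nn.Linear(3, 2), torch.randn(5, 3)))   # []
print(unused_parameters(DanglingParams(), torch.randn(5, 3)))  # ['extra']
```

A parameter such as `theta_un` passed outside the expected `phi` container would show up exactly like `extra` here: registered on the module but absent from the gradients.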