PiBO Pull Request 2 - Beta Hyperparameters #222
Conversation
sampled outside its permitted range.
…king pdf regardless of input
hyperparameters.pyx, where I changed typing.
beta and normal distributions
parameters - now, they are generated the same way as the uniform (but scaled, so that it makes sense even though they are not normalized)
Changed back to the standard local search procedure for the normal hyperparameters
class down to the subclasses Float, Constant, Integer, Categorical and Ordinal Hyperparameters (as these almost certainly have a well-defined pdf, or it would at least make sense to consider a pdf for these HP types).
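As a sketch of what a pdf on a bounded hyperparameter could look like once pushed down to the subclasses: the value is mapped onto the unit interval and the beta density is rescaled to the parameter's own bounds. This is a hypothetical illustration, not the actual ConfigSpace API; the function name and signature are made up.

```python
import math

# Hypothetical sketch (not the actual ConfigSpace API): evaluate a beta
# pdf on a bounded hyperparameter by mapping the value to [0, 1] and
# rescaling the density so it integrates to 1 over [lower, upper].
def beta_hp_pdf(value, lower, upper, alpha, b):
    unit = (value - lower) / (upper - lower)
    if not 0.0 <= unit <= 1.0:
        return 0.0  # outside the permitted range the density is zero
    # normalization constant of Beta(alpha, b) via the gamma function
    norm = math.gamma(alpha + b) / (math.gamma(alpha) * math.gamma(b))
    density = norm * unit ** (alpha - 1) * (1.0 - unit) ** (b - 1)
    return density / (upper - lower)
```

For example, Beta(2, 2) on [0, 10] has its mode at 5, where the unit-interval density 1.5 rescales to 0.15.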
all the classes that implement a pdf
… betaparams_merge
for both (identical to Uniform parameters
the other parameter types - follows as closely as possible
Done. Yeah, I think it turned out better than I first expected. Those small errors are now addressed. Hopefully BetaIntHP doesn't hide any unexpected surprises.
Alright, I made my way through the tests. They look good, but I have some further comments because I think they can be simplified quite a bit.
Please also pull the latest changes before making any changes yourself because I just fixed a merge conflict with the Web UI.
test/test_hyperparameters.py (outdated)

    # Testing that no error is thrown
    BetaFloatHyperparameter("param", lower=1, upper=1000.0, alpha=2.0, beta=3.0, q=3.0)

    self.assertAlmostEqual(f5_legal_nolog.default_value, 1)
Why would the default value be one in both settings? Shouldn't the log move the default value?
Maybe we could have a separate test for the default values that checks for log, quantization, and alpha being 1 and beta being 1.
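To make concrete what such a default-value test could assert, here is an illustrative sketch (not ConfigSpace code) of the expected quantity: the mode of Beta(alpha, beta) scaled to [lower, upper], with the midpoint as a fallback when alpha == beta == 1, since the distribution is then flat and has no unique mode.

```python
# Illustrative sketch, not ConfigSpace code: the default one would expect
# a beta hyperparameter to have, namely the distribution's mode scaled to
# the parameter's bounds.
def beta_default(lower, upper, alpha, b):
    if alpha > 1 and b > 1:
        mode = (alpha - 1) / (alpha + b - 2)  # mode of Beta(alpha, b) on [0, 1]
    else:
        mode = 0.5  # flat or edge-peaked case: fall back to the midpoint
    return lower + mode * (upper - lower)
```

For Beta(3, 2) on [0, 10] this gives 10 * 2/3 ≈ 6.67, while alpha = beta = 1 gives the midpoint 5.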
The same holds for the integer version. Looking at the code, this is really the only thing that differs from the parent class and needs to be tested thoroughly.
I think it's up to us to decide whether log should move the default value or not. Personally, I think it shouldn't.
Yeah sure, I can split them up and do some more explicit tests for default values.
> I think it's up to us to decide whether log should move the default value or not. Personally, I think it shouldn't.

I think I disagree on this one. Let's take the example you gave below, with a hyperparameter from 1 to 100. In the non-log case the default (in my opinion) should be 50, while in the log case it should be 10. In both cases it would be the middle of the distribution.
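The two conventions being debated here can be sketched in a few lines: the arithmetic midpoint on a linear scale versus the geometric midpoint (the middle in log space). The function name is illustrative only; for [1, 100] the linear midpoint is 50.5 (≈50 as stated above) and the log-space midpoint is exactly 10.

```python
import math

# Illustrative sketch of the two default conventions under discussion.
def default_midpoint(lower, upper, log=False):
    if log:
        # geometric midpoint: exp of the mean of the log bounds, e.g. 10 for [1, 100]
        return math.exp((math.log(lower) + math.log(upper)) / 2.0)
    # arithmetic midpoint on a linear scale, e.g. 50.5 for [1, 100]
    return (lower + upper) / 2.0
```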
Oh, on that one I completely agree! I was strictly talking about the case where the user specifies the default, which is the case in the example you brought up.
test/test_hyperparameters.py (outdated)

    f2_actual = f2.to_uniform()
    self.assertEqual(f2_expected, f2_actual)

    def test_betafloat_is_legal(self):
Don't we inherit is_legal now? Then we wouldn't have to test it, right?
True.
test/test_hyperparameters.py (outdated)

    f1_actual = f1.to_uniform()
    self.assertEqual(f1_expected, f1_actual)

    def test_betaint_is_legal(self):
Same as above, don't we inherit is_legal?
Yep.
test/test_hyperparameters.py (outdated)

    with self.assertRaises(ValueError):
        BetaIntegerHyperparameter("param", lower=-1, upper=10.0, alpha=6.0, beta=2.0, log=True)

    with self.assertRaisesRegex(ValueError, "Illegal default value 0"):
Not sure we need to test things like this that we inherit from the superclass.
Sure, can remove!
test/test_hyperparameters.py (outdated)

    f4_legal_log = BetaIntegerHyperparameter(
        "param", lower=1, upper=10.0, alpha=3.0, beta=2.0, default_value=1, log=False)

    self.assertAlmostEqual(f4_legal_nolog.default_value, 1)
Same as above, I'm surprised about this.
Same response - I think it's up to us to decide what makes sense here. My reasoning is the following:
If you design a parameter on a log scale - with a range from 1 to 100 and the default in the middle (at 10) - you want to pass in 10, not 1, as the default value (which would be the alternative). I simply think it is the more intuitive way to do it.
So, what do we still want done? Here's my interpretation:
Is that all?
That is an excellent question; I think yes.
👍
👍 However, I currently see a different issue with the sampling, and I'm sorry for not seeing that earlier. We need to sample uniformly in [0,1] and then transform into the beta distribution in the transform function. Similarly, the inverse_transform must map into a uniformly distributed [0,1].
> However, I currently see a different issue with the sampling, and I'm sorry for not seeing that earlier. We need to sample uniformly in [0,1] and then transform into the beta distribution in the transform function. Similarly, the inverse_transform must map into a uniformly distributed [0,1].

I'm not sure I get it. From my own tests, it seems to sample as intended. Can you be more specific? I'm on the rest; it should be done in a couple of hours.
Yes, it samples correctly, but I think the behavior is still incorrect. In the unit hypercube everything should be uniformly distributed, and we then transform into the desired distribution, similar to the idea of inverse transform sampling. By guaranteeing that the unit hypercube is uniform, all models can work with the data out of the box, and procedures like the neighborhood generation work the same for every algorithm. As before, I'm sorry that this is not yet documented and that it didn't come up before; we need to improve on this.
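The scheme described here can be sketched in a few lines: values live uniformly in [0, 1], and only the transform pair knows about the beta shape. For the sketch I use the special case Beta(alpha, 1), whose CDF x**alpha has a closed-form inverse; the general case would need the incomplete beta function (e.g. scipy.stats.beta.ppf). Function names are illustrative, not ConfigSpace's actual API.

```python
# Sketch of inverse transform sampling for a Beta(alpha, 1) hyperparameter:
# the internal representation is uniform on [0, 1]; transform maps it
# through the inverse CDF and scales to the bounds, inverse_transform maps
# back to a uniformly distributed [0, 1] via the CDF.
def transform(unit_value, lower, upper, alpha):
    # uniform [0, 1] -> Beta(alpha, 1) via inverse CDF u**(1/alpha), then scale
    return lower + (unit_value ** (1.0 / alpha)) * (upper - lower)

def inverse_transform(value, lower, upper, alpha):
    # CDF of Beta(alpha, 1) applied to the rescaled value
    return ((value - lower) / (upper - lower)) ** alpha
```

The two functions are exact inverses, which is what guarantees that sampling uniformly internally yields beta-distributed values externally.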
Sorry, but I still don't really follow what behavior you are referring to!
Unfortunately, such behavior is not implemented yet. I think the required changes would be:
The outcome of sampling from the distribution should remain the same, but the meaning of the transformed values would change.
No, I disagree. This would equate to warping the entire search space as in this relatively famous paper: That is not what I set out to do here. Nor is this the case for the normal HPs that are already implemented.
If we wanted to leave the door open for that possibility, we would have to reconsider the transforms. In that case, however, the Beta cannot inherit from the uniform in any way whatsoever. Maybe you pointed this out some time ago, but if not, I'm doing it now!
So, all in all: I personally am not looking to warp the search space, but only to have a parameter which one can sample from, and whose accompanying beta distribution's pdf one can evaluate. These two things do not really go hand in hand as things are currently designed.
Yes, I fully agree on that.
They are unbounded, so we cannot really normalize them. For the bounded ones this is, in my opinion, not implemented correctly at the moment and should be changed to the warped version, too.
Hm, maybe we should follow the rather pragmatic approach of getting this merged. Right now it does the right thing, so having it in is definitely a huge step forward. Formalizing whether we should be in a uniformly distributed hypercube or just a hypercube is something we can do later, and that should involve the other stakeholders and users of the ConfigSpace, too.
default values. Notably, quantization plus logging does not yield correct default values (i.e., not the mode of the distribution) all the time, as there is some undesired adjustment of the bounds ._lower and ._upper in UniformFloatHyperparameter which skews it.
I pulled the latest changes and got quite a lot of failed tests (3 of them). Do you have this, too?
Tests appear to work in CI. Documentation building will be fixed in another PR. However, the formatting doesn't look good yet; could you please apply the
Absolutely. The Integer default tests are still to go (though they rely almost entirely on the float ones, so they can be a lot shorter). I think that's a good approach, let's do that!
Okay, please just give me a heads-up when I should have a final look at the tests (preferably with pre-commit passing, too).
@mfeurer Okay, now I am fairly optimistic that we're done with this one. There was one more issue that arose: when doing both quantization and logging simultaneously, the default values of the beta are not computed correctly. This is because the internal bounds ._lower and ._upper, which the transformation relies on, get expanded by 0.5 in each direction in non-log space. This is harmless when the default is in the center of the search space (as with uniforms), but not when the default is somewhere else in the interior. The issue is likely insignificant, and it is an unintended consequence of inheriting from UniformFHP. The easy solution is for the user to specify a default when using BetaHP + log + quant + no default. I added a warning in BetaFloat specifically for this case.
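A toy calculation (the numbers and q = 1 are illustrative, not ConfigSpace internals) shows how widening the bounds by 0.5 on each side skews a log-space midpoint away from the intended default:

```python
import math

# Toy numbers only: widening the bounds by q/2 = 0.5 on each side before
# the log transform moves the log-space midpoint away from the geometric
# mean of the original [lower, upper].
lower, upper, q = 1.0, 100.0, 1.0
intended = math.exp((math.log(lower) + math.log(upper)) / 2.0)  # geometric mean: 10.0
_lower = math.log(lower - q / 2.0)   # log(0.5): widened lower bound
_upper = math.log(upper + q / 2.0)   # log(100.5): widened upper bound
skewed = math.exp((_lower + _upper) / 2.0)  # sqrt(0.5 * 100.5), noticeably below 10
```

The skew depends on how close the default is to a bound, which is consistent with the observation that only interior, off-center defaults are affected.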
Notably, I do not test for that niche case either, as any test values would simply be incorrect in terms of the actual computation. If you want, I can elaborate on this more, but just know that it is only relevant in this specific setting of log+quant (or int) + no default. |
Looks good to me. I think I only found one minor thing: a comment which was accidentally copied over.
Co-authored-by: Matthias Feurer <lists@matthiasfeurer.de>
Awesome! It is now changed, too. I will jump on the last PR and integrate all that's been discussed. However, most of it has already been kept in sync with the rest.
Great, just merged! @eddiebergman as this PR no longer includes the neighborhood generation, we should follow up on that in a separate thread/PR. I'll also post an issue to discuss the hypercube scaling/warping.
Added BetaFloat and BetaInteger HPs