feat(tuner): allow adjustment of optimizer #128
Conversation
Personally I think stopping at the optimizer level is sufficient, what do you think? I mean, even when I'm training a PyTorch model myself, I almost never change parameters such as beta1, beta2, nesterov etc., let alone when using the finetuner.
Yes, I agree. However, as you can see, the 3 frameworks sometimes have different default values, and I think it is in our interest to ensure that training is the same across all 3 frameworks. Since this already involves creating some default params, we might as well make them available to users, if they want to use them.
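For context, a quick check of two of the frameworks shows the kind of default mismatch being discussed (a small illustration I added; Paddle is left out, and the printed values are the library defaults as I recall them):

import torch
from tensorflow import keras

# Adam's epsilon differs between the two frameworks' defaults.
torch_adam = torch.optim.Adam([torch.nn.Parameter(torch.zeros(1))])
keras_adam = keras.optimizers.Adam()

print(torch_adam.defaults['eps'])  # 1e-08 in PyTorch
print(keras_adam.epsilon)          # 1e-07 in Keras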
Understood, if we decided to keep optimizer_kwargs, then let's keep it.
)
elif optimizer == 'sgd':
    return keras.optimizers.SGD(learning_rate=learning_rate, **optimizer_kwargs)
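Judging from the visible diff lines, the Keras-side factory presumably looks something like the sketch below. This is my reconstruction: only the SGD branch appears in the diff, the other branches and the helper name are assumptions, and optimizer_kwargs is assumed to arrive already merged/validated from the base class (see the base-class sketch further down).

from tensorflow import keras


def _get_keras_optimizer(optimizer: str, optimizer_kwargs: dict, learning_rate: float):
    # Hypothetical reconstruction of the tuner's Keras optimizer factory;
    # optimizer_kwargs is assumed to be pre-validated by the base class.
    if optimizer == 'adam':
        return keras.optimizers.Adam(learning_rate=learning_rate, **optimizer_kwargs)
    elif optimizer == 'rmsprop':
        return keras.optimizers.RMSprop(learning_rate=learning_rate, **optimizer_kwargs)
    elif optimizer == 'sgd':
        return keras.optimizers.SGD(learning_rate=learning_rate, **optimizer_kwargs)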
Raise a ValueError when the optimizer name is not recognised.
This is already done in _get_optimizer_kwargs
The reason for this is that I try to factor out as much code as possible into the base class, so that it is not repeated for each framework.
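To illustrate the point, here is a minimal sketch of what such a shared base-class helper could look like, assuming a _get_optimizer_kwargs method that both validates the name and supplies the cross-framework defaults. The concrete default values below are my assumptions, not necessarily what the PR ships.

from typing import Optional


class BaseTuner:
    # Minimal sketch of the shared base class; the defaults below are assumed values.

    def _get_optimizer_kwargs(self, optimizer: str, custom_kwargs: Optional[dict]) -> dict:
        """Validate the optimizer name and merge user kwargs with shared defaults."""
        defaults = {
            'adam': {'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08},
            'rmsprop': {'rho': 0.99, 'momentum': 0.0, 'epsilon': 1e-08, 'centered': False},
            'sgd': {'momentum': 0.0, 'nesterov': False},
        }
        if optimizer not in defaults:
            raise ValueError(
                f'Optimizer "{optimizer}" is not supported, '
                f'must be one of {list(defaults)}'
            )
        kwargs = dict(defaults[optimizer])  # copy so the shared defaults stay untouched
        kwargs.update(custom_kwargs or {})
        return kwargs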
@@ -39,6 +44,32 @@ def _get_data_loader(self, inputs, batch_size: int, shuffle: bool):
        shuffle=shuffle,
    )

    def _get_optimizer(
Personally I'm not a big fan of get_something: in Java and other OOP worlds it reads like a getter or setter. If it is a getter, then in Python I would turn it into a property; otherwise I'd rename it. What do you think?
It's a factory function, not a getter or setter. And we already have this in the current base class (see _get_data_loader), so I would stick with this convention for now. You can raise a separate PR to change this if you want.
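To make the distinction concrete, a toy example (the names here are made up, not taken from the PR):

class Example:
    def __init__(self):
        self._batch_size = 32

    @property
    def batch_size(self):
        # A getter: no arguments, cheap attribute access, so a property fits.
        return self._batch_size

    def _get_optimizer(self, name: str, learning_rate: float):
        # A factory: builds a new object from its arguments on every call,
        # so the _get_* naming (as with _get_data_loader) still reads fine.
        return {'name': name, 'learning_rate': learning_rate}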
Btw, speaking of get naming and Java: https://www.informit.com/articles/article.aspx?p=1216151#:~:text=getType%E2%80%94Like%20getInstance%2C%20but%20used%20when%20the%20factory%20method%20is%20in%20a%20different%20class.%20Type%20indicates%20the%20type%20of%20object%20returned%20by%20the%20factory%20method.
…er into feat-configure-optimizer
LGTM!
Adds 3 different optimizers; they are customizable (I had to find the intersection of options allowed under all 3 frameworks).
Missing docstrings/documentation, but let's add that once we have everything (currently there are no docstrings in place, and this PR does not feel like the right place to add them).
Also, you can now adjust the learning rate. The new default is 1e-3, a magnitude lower, as @maximilianwerk observed problems with the high learning rate (and in general, in applications the lr is low, even lower than the new default).
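For reference, a hypothetical usage sketch of the new knobs. The exact finetuner.fit signature and the data handling are assumptions on my part; only the optimizer, optimizer_kwargs and learning_rate parameters come from this PR, and the set of optimizer names is my guess based on the diff.

import finetuner
from tensorflow import keras

embed_model = keras.Sequential([keras.layers.Dense(32)])
my_train_docs = ...  # placeholder: labeled training data in whatever format finetuner expects

finetuner.fit(
    embed_model,
    train_data=my_train_docs,
    optimizer='sgd',  # presumably one of 'adam', 'rmsprop', 'sgd'
    optimizer_kwargs={'momentum': 0.9, 'nesterov': True},
    learning_rate=1e-3,  # the new, lower default
)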