Update field descriptions for ludwig-docs #3123
Conversation
This is an awesome enhancement. Not only does this make things more standardized for the repeated parameters, but the new info you've added is very insightful and helpful. I just added in a few thoughts I had while reading through the new metadata but overall love it. Thank you!
on the computational load of the model and might require further hyperparameter
tuning
expected_impact: 2
suggested_values: The default value will work well in the majority of the
Thinking maybe this field should be empty or None and then we put this text under `suggested_value_reasoning`. WDYT?
Good call, done.
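For context, a rough sketch of what the updated entry could look like; the field names (`expected_impact`, `suggested_values`, `suggested_values_reasoning`) are taken from this thread, and the dict below is only an illustrative stand-in for the real metadata structure, not the actual `common.yaml` content:

```python
# Illustrative only: not the actual common.yaml / ParameterMetadata definition.
dropout_metadata = {
    "expected_impact": 2,
    "suggested_values": None,  # left empty instead of free-form prose
    "suggested_values_reasoning": (
        "The default value will work well in the majority of cases."
    ),
}
```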
ludwig/schema/common_fields.py
Outdated
def DropoutField(default: float = 0.0, description: str = None, parameter_metadata: ParameterMetadata = None) -> Field:
    description = description or (
        "Default dropout rate applied to fully connected layers. "
        "Increasing dropout is a common form of regularization to combat overfitting."
Could be useful to add in here or in `common.yaml` what exactly the percentage value is doing - i.e. whether 0.85 means that 85% of the nodes in that layer will be kept or that 85% will be dropped - or something along those lines. The reason I say that is that when I've looked into dropout in the past, it's a fairly easy idea to conceptualize, but I've seen different tools interpret the input value in different ways: 0.85 means drop 85% in one tool and keep 85% in another. So specifying this somewhere - probably in `common.yaml` under `suggested_values_reasoning`, as I'm writing this out - could be valuable, since the user will likely look this up and have a similar question. WDYT?
Very good callout. Will put this in the description.
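As a reference point for the wording, here is a minimal sketch of the convention assumed in this thread, namely that the fully connected layers ultimately rely on `torch.nn.Dropout`, where the argument is the probability of an element being dropped (zeroed), not kept:

```python
import torch

# In torch.nn.Dropout, p is the probability of dropping (zeroing) each element,
# so dropout=0.85 would drop roughly 85% of activations and keep about 15%.
dropout = torch.nn.Dropout(p=0.85)
x = torch.ones(10_000)
kept_fraction = (dropout(x) != 0).float().mean().item()
print(f"fraction kept: {kept_fraction:.2f}")  # ~0.15 during training
```

If Ludwig wires dropout differently anywhere, the description should follow whatever the layers actually do; the snippet just shows the PyTorch convention.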
ludwig/schema/combiners/tabnet.py
Outdated
@@ -22,7 +22,9 @@ def module_name():
    )

    size: int = schema_utils.PositiveInteger(
-        default=32, description="`N_a` in the paper.", parameter_metadata=COMBINER_METADATA["TabNetCombiner"]["size"]
+        default=32,
+        description="Size of the hidden layers. `N_a` in the paper.",
No AI here, just a thought: I don't think our descriptions render text surrounded in backticks as a code snippet yet. This would be an improvement that makes the UI a little crisper and more meaningful.
Good point, it's mostly useful for the ludwig docs at the moment, which use markdown.
+1
- https://machinelearningmastery.com/batch-normalization-for-training-of-deep-neural-networks/
related_parameters:
- norm_params
suggested_values: '"batch" or "layer"'
Don't know if we should remove these necessarily, but I've always thought it a little unhelpful that the suggested values here are just both of the available values. I get that they are better than None, but I feel like 85% of users will see this and think it's not very helpful, since it's essentially suggesting that if the user is going to use this parameter they should use it. There are a couple of other instances similar to this one throughout the metadata, so do you think it would make sense to remove the suggested value here, or just leave it in and gloss over it?
Agreed, will remove.
…into up-descriptions
for more information, see https://pre-commit.ci
Thanks for the great comments, @connor-mccorm!
…into up-descriptions
👌🏽
)


INITIALIZER_SUFFIX = """
Note that this won't be possible after #3075
Yes, true, we will need to refactor this.
@@ -12,6 +14,13 @@
class ComparatorCombinerConfig(BaseCombinerConfig):
    """Parameters for comparator combiner."""

    def __post_init__(self):
Perhaps better placed in the `config_validation.checks` suite. Or we should decentralize that suite when you have schema-specific requirements (e.g. see `check_sequence_concat_combiner_requirements`).
I originally had it there, but it was very ugly and hacky. I actually much prefer having it here, to better encapsulate the validation logic. In the future, we can look to remove this by combining `num_fc_layers` and `fc_layers` into a single `oneOf` option.
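A minimal sketch of the kind of encapsulated check being discussed; the class body, condition, and error message below are assumptions for illustration, not the PR's actual code:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ComparatorCombinerConfigSketch:
    """Illustrative stand-in for the real ComparatorCombinerConfig."""

    fc_layers: Optional[List[dict]] = None
    num_fc_layers: int = 1

    def __post_init__(self):
        # Hypothetical consistency check between the two overlapping parameters.
        if self.fc_layers is not None and len(self.fc_layers) != self.num_fc_layers:
            raise ValueError(
                "num_fc_layers must match the number of entries in fc_layers."
            )
```

A `oneOf` schema would make a check like this unnecessary by only allowing one of the two parameters in the first place.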
ludwig/schema/combiners/tabnet.py
Outdated
@@ -22,7 +22,9 @@ def module_name():
    )

    size: int = schema_utils.PositiveInteger(
-        default=32, description="`N_a` in the paper.", parameter_metadata=COMBINER_METADATA["TabNetCombiner"]["size"]
+        default=32,
+        description="Size of the hidden layers. `N_a` in the paper.",
+1
ludwig/schema/combiners/tabnet.py
Outdated
        parameter_metadata=COMBINER_METADATA["TabNetCombiner"]["output_size"],
    )

    num_steps: int = schema_utils.NonNegativeInteger(
        default=3,
        description="Number of steps / repetitions of the attentive transformer and feature transformer "
-        "computations. `N_steps` in the paper ",
+        "computations. `N_steps` in the paper.",
Nit: I know it's a seminal paper, but perhaps we should say `(Arik and Pfister, 2019)` instead of literally `the paper`.
Good point, I agree that's a good change.
"Requires all fully connected layers to have the same `output_size`." | ||
) | ||
parameter_metadata = parameter_metadata or COMMON_METADATA["residual"] | ||
return schema_utils.Boolean( |
idk if this is that much more readable/efficient, but since we're passing all local variables you could say e.g. `Boolean(**locals())`?
That's true, but a little dangerous if someone ever adds a new local var.
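A quick illustration of that failure mode with hypothetical stand-ins (neither function below is the real `schema_utils` API): once any extra local variable exists before the call, `**locals()` forwards it as an unexpected keyword argument.

```python
def Boolean(default=False, description=None, parameter_metadata=None):
    # Hypothetical stand-in for schema_utils.Boolean's keyword signature.
    return {"default": default, "description": description, "parameter_metadata": parameter_metadata}


def ResidualField(default=False, description=None, parameter_metadata=None):
    description = description or "Whether to add residual connections."
    note = "a later edit adds a helper variable"  # new local var
    return Boolean(**locals())  # now also passes note=..., which Boolean rejects


try:
    ResidualField()
except TypeError as err:
    print(err)  # Boolean() got an unexpected keyword argument 'note'
```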
@@ -502,15 +502,25 @@ class GradientClippingConfig(schema_utils.BaseMarshmallowConfig):
    """Dataclass that holds gradient clipping parameters."""
Would you mind removing `desciption = TODO` from `OptimizerDataclassField` above? :D