
Update field descriptions for ludwig-docs #3123

Merged: 14 commits merged into master from up-descriptions on Feb 21, 2023

Conversation

tgaddair (Collaborator)

No description provided.

@connor-mccorm (Contributor) left a comment

This is an awesome enhancement. Not only does this make things more standardized for the repeated parameters, but the new info you've added is very insightful and helpful. I just added in a few thoughts I had while reading through the new metadata but overall love it. Thank you!

on the computational load of the model and might require further hyperparameter
tuning
expected_impact: 2
suggested_values: The default value will work well in the majority of the
Contributor:

Thinking maybe this field should be empty or None and then we put this text under suggested_value_reasoning. WDYT?

Collaborator Author:

Good call, done.
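
Purely as an illustration of the change being agreed to here (hypothetical entry; field names are taken from this thread and the exact metadata schema is assumed):

# Before: prose guidance crammed into suggested_values.
before = {
    "expected_impact": 2,
    "suggested_values": "prose explaining that the default works well in most cases",
}

# After: suggested_values left empty and the prose moved to the reasoning field.
after = {
    "expected_impact": 2,
    "suggested_values": None,
    "suggested_values_reasoning": "prose explaining that the default works well in most cases",
}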

def DropoutField(default: float = 0.0, description: str = None, parameter_metadata: ParameterMetadata = None) -> Field:
    description = description or (
        "Default dropout rate applied to fully connected layers. "
        "Increasing dropout is a common form of regularization to combat overfitting."
Contributor:

Could be useful to add in here, or in common.yaml, exactly what the percentage value is doing, i.e. whether 0.85 means that 85% of the nodes in that layer will be kept or that 85% will be dropped. The reason I say that is that when I've looked into dropout in the past, it's a fairly easy idea to conceptualize, but I've seen different tools interpret the input value in different ways; for instance, 0.85 means drop 85% in one tool and keep 85% in another. So specifying this somewhere (probably common.yaml, under suggested_values_reasoning) could be valuable, since users will likely look this up and have the same question. WDYT?

Collaborator Author:

Very good callout. Will put this in the description.
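
For reference (not part of the diff): a minimal sketch of the convention, assuming the configured value is passed straight through to PyTorch's torch.nn.Dropout, where it is the probability that a unit is zeroed, so 0.85 means roughly 85% of activations are dropped during training:

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.85)  # p is the drop probability, not the keep probability
drop.train()               # dropout is only active in training mode

x = torch.ones(1, 10_000)
y = drop(x)
print(f"fraction of units kept: {(y != 0).float().mean().item():.2f}")  # roughly 0.15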

@@ -22,7 +22,9 @@ def module_name():
    )

    size: int = schema_utils.PositiveInteger(
-        default=32, description="`N_a` in the paper.", parameter_metadata=COMBINER_METADATA["TabNetCombiner"]["size"]
+        default=32,
+        description="Size of the hidden layers. `N_a` in the paper.",
Contributor:

No AI here, just a thought: I don't think our descriptions render text surrounded in backticks as a code snippet yet. This would be an improvement that makes the UI a little crisper and more meaningful.

Collaborator Author:

Good point. It's mostly useful for the ludwig docs at the moment, which use markdown.
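
A tiny illustration of the point (hypothetical snippet): the backticks only become inline code once the description is rendered through markdown, e.g. in the generated ludwig-docs pages; a UI that does not parse markdown shows them literally.

description = "Size of the hidden layers. `N_a` in the paper."
# Markdown rendering: N_a appears as inline code.
# Plain-text rendering: the literal backticks are shown.
print(description)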

Collaborator:

+1

- https://machinelearningmastery.com/batch-normalization-for-training-of-deep-neural-networks/
related_parameters:
- norm_params
suggested_values: '"batch" or "layer"'
Contributor:

I don't know if we should necessarily remove these, but I've always thought it a little unhelpful that the suggested values here are just both of the available values. I get that they're better than None, but I feel like 85% of users will see this and think it isn't very helpful, since it's essentially suggesting that if the user is going to use this parameter, they should use it. There are a couple of other instances like this throughout the metadata, so do you think it would make sense to remove the suggested value here, or just leave it in and gloss over it?

Collaborator Author:

Agreed, will remove.

github-actions bot commented Feb 20, 2023

Unit Test Results

     6 files ±0        6 suites ±0        5h 46m 7s ⏱️ (-14m 42s)
 3,961 tests +5     3,924 passed ✔️ +5      37 skipped 💤 ±0     0 failed ±0
11,839 runs  +15   11,729 passed ✔️ +15    110 skipped 💤 ±0     0 failed ±0

Results for commit d5e2bac. ± Comparison against base commit c638c23.

♻️ This comment has been updated with latest results.

@tgaddair (Collaborator Author)

Thanks for the great comments, @connor-mccorm!

@ksbrar (Collaborator) left a comment

👌🏽

)


INITIALIZER_SUFFIX = """
Collaborator:

Note that this won't be possible after #3075

Collaborator Author:

Yes, true, we will need to refactor this.

@@ -12,6 +14,13 @@
class ComparatorCombinerConfig(BaseCombinerConfig):
    """Parameters for comparator combiner."""

    def __post_init__(self):
Collaborator:

Perhaps better placed in the config_validation.checks suite. Or we should decentralize that suite when you have schema-specific requirements (e.g. see check_sequence_concat_combiner_requirements).

Collaborator Author:

I originally had it there, but it was very ugly and hacky. I actually much prefer having it here, to better encapsulate the validation logic. In the future, we can look to remove this by combining num_fc_layers and fc_layers into a single oneOf option.
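
For illustration (hypothetical names and condition, not the actual ComparatorCombinerConfig check), the pattern being defended is schema-local validation in __post_init__, roughly:

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExampleCombinerConfig:
    # Hypothetical fields, named after the discussion above.
    num_fc_layers: int = 0
    fc_layers: Optional[List[dict]] = None

    def __post_init__(self):
        # Schema-local validation: runs as soon as the config is constructed,
        # without routing through a centralized config_validation.checks suite.
        if self.fc_layers is not None and self.num_fc_layers != len(self.fc_layers):
            raise ValueError("num_fc_layers does not match the number of entries in fc_layers.")


# ExampleCombinerConfig(num_fc_layers=2, fc_layers=[{}])  # raises ValueError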

@@ -22,7 +22,9 @@ def module_name():
    )

    size: int = schema_utils.PositiveInteger(
-        default=32, description="`N_a` in the paper.", parameter_metadata=COMBINER_METADATA["TabNetCombiner"]["size"]
+        default=32,
+        description="Size of the hidden layers. `N_a` in the paper.",
Collaborator:

+1

        parameter_metadata=COMBINER_METADATA["TabNetCombiner"]["output_size"],
    )

    num_steps: int = schema_utils.NonNegativeInteger(
        default=3,
        description="Number of steps / repetitions of the the attentive transformer and feature transformer "
-        "computations. `N_steps` in the paper ",
+        "computations. `N_steps` in the paper.",
Collaborator:

Nit: I know it's a seminal paper but perhaps we should say (Arik and Pfister, 2019) instead of literally the paper.

Collaborator Author:

Good point, I agree that's a good change.

"Requires all fully connected layers to have the same `output_size`."
)
parameter_metadata = parameter_metadata or COMMON_METADATA["residual"]
return schema_utils.Boolean(
Collaborator:

idk if this is that much more readable/efficient but since we're passing all local variables you could say e.g. Boolean(**locals())?

Collaborator Author:

That's true, but a little dangerous if someone ever adds a new local var.
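
To illustrate the concern (hypothetical helpers, not Ludwig code): **locals() forwards every local name, so the call only stays valid while the function body introduces no new variables.

def build_field(default: bool, description: str, required: bool) -> dict:
    return {"default": default, "description": description, "required": required}


def make_field(default: bool = False, description: str = "", required: bool = False) -> dict:
    # Works today because the only locals are the three parameters.
    # Adding any temporary variable here, e.g. `note = "debug"`, would also be
    # forwarded by **locals() and build_field() would raise a TypeError.
    return build_field(**locals())


print(make_field(description="Whether to add residual connections."))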

@@ -502,15 +502,25 @@ class GradientClippingConfig(schema_utils.BaseMarshmallowConfig):
"""Dataclass that holds gradient clipping parameters."""
Collaborator:

Would you mind removing description = TODO from OptimizerDataclassField above? :D

tgaddair merged commit 0b8765b into master on Feb 21, 2023
tgaddair deleted the up-descriptions branch on February 21, 2023 at 07:44