Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: update list of reserved parameter names #1736

Merged
merged 4 commits into from
Feb 23, 2024

Conversation

saikonen
Copy link
Collaborator

@saikonen saikonen commented Feb 15, 2024

updated list of reserved parameter names.

closes #1735

"max-log-size",
"max_log_size",
"tags",
"user_namespace",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also have user-namespace? it might be easier if reserved_params is a set that adds the dashed versions of all its member by itself so that it is easy to maintain longer term.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed we should standardize _ and -

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. Lifted the list construction to a separate function

Copy link
Contributor

@romain-intel romain-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a bit more discussion since the issue comes from non-core plugins and this is modifying a core component based on that.

"decospecs",
"max-log-size",
"max_log_size",
"tags",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hum -- we should maybe discuss this. The others were parameters to run but now these are parameters to other parts of the CLI (argo in this case). I haven't checked if this is used but it is not broken right now internally and so I am a bit leary of removing a perfectly functional name because argo has this restriction.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which restriction?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean that "--tags" is not something that is part of https://github.com/Netflix/metaflow/blob/master/metaflow/cli.py. My understanding of this issue is that it only happens because argo has a --tags option. The others are typically here because they appear in the aforementioned cli.py.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

argo also doesn't have a --tags option. actually, I don't think we need to list tags and decospecs here in this list.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here is a test flow to try out, simply comment out any of the parameters and run it locally:
python TagsParamFlow.py --metadata local --datastore local run

from metaflow import step, FlowSpec, Parameter

class TagsParamFlow(FlowSpec):
    # Covered
    # run_id_file = Parameter("run-id-file")
    # maxworkers = Parameter("max-workers")
    # maxnumsplits = Parameter("max-num-splits")
    # user_namespace = Parameter("user-namespace")
    
    # Not Covered
    # tags = Parameter("tags")
    # decospecs = Parameter("decospecs")
    # run_id_file = Parameter("run_id_file")
    # maxworkers = Parameter("max_workers")
    # maxnumsplits = Parameter("max_num_splits")
    # maxlogsize = Parameter("max_log_size")
    # maxlogsize = Parameter("max-log-size")
    # user_namespace = Parameter("user_namespace")
    # obj = Parameter("obj")

    @step
    def start(self):
        print("Starting 👋")
        self.next(self.end)

    @step
    def end(self):
        print("Done! 🏁")


if __name__ == "__main__":
    TagsParamFlow()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK -- I take it back and I shouldn't do fly-by reviews of error messages. Anyways, @saikonen is right. "tags" is the problem but not because it is passed to anything but because it is a named argument in def run in cli.py. That said, I would be more in favor of changing those internal names than restricting here (clearly --tags proved useful). So maybe we can make the names in cli.py (the ones not passed to kwargs) a bit more exotic (maybe even start them with a _) and that way we can reduce the number of names that are excluded.

Thoughts?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the issue is coming from cli.py#common_run_options that maps click options to specific args. Note that we are not having issues with a cli arg --tags (which does not exist) but that multiple --tag are mapped to a tags argument for functions using the @common_run_options

these functions then try to call obj.flow._set_constants(obj.graph, kwargs) with their kwargs, which do not include the explicit args anymore

Copy link
Collaborator Author

@saikonen saikonen Feb 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah @romain-intel beat me to it ;)

I'm all for prefixing the internal args with _ to avoid collisions, though it seems like a bit of a heavy change across the board given that we've only had collisions with --tags, and the others seem less likely to be used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes should be limited to cli.py so may not be too heavy a lift. Am fine either ways since it seems it was already "broken" before with a slight preference for prefixing to make them less colliding.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

been scratching my head over the UX of this a bit. We can prefix the internal args to avoid collisions, but this only applies to the args of the functions, not cli options. This can lead to some funky UX:

User defines a custom parameter for their flow called tags. Now they can deploy a flow supplying the options

--tag some_internal_flow_tag --tags values,for,parameter

which is why I'm starting to lean towards simply updating the reserved params list for now, and relaxing the restrictions in a future change if this kind of use case starts popping up more.

"max-log-size",
"max_log_size",
"tags",
"user_namespace",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed we should standardize _ and -

@nflx-mf-bot
Copy link
Collaborator

Testing[635] @ 57debc3

@nflx-mf-bot
Copy link
Collaborator

Testing[635] @ 57debc3 had 9 FAILUREs.

@@ -374,6 +365,28 @@ def __getitem__(self, x):
pass


def _reserved_params():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need to be a function?

@saikonen saikonen dismissed romain-intel’s stale review February 23, 2024 11:55

comments covered. Opened a follow up issue for discussing the relaxing of restricted parameter names

@saikonen saikonen merged commit cb7e945 into master Feb 23, 2024
34 checks passed
@saikonen saikonen deleted the fix/update-restricted-parameter-names-list branch February 23, 2024 11:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Using tags as a Parameter name breaks flow.
4 participants