Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate Stages and introduce Labels instead #218

Merged
merged 10 commits into from
Jul 25, 2022
Merged

Conversation

aguschin
Copy link
Contributor

@aguschin aguschin commented Jul 20, 2022

This is a continuation of the long discussion re adding Labels. In fact, this still enable Stages workflow, although "Stages" as a term will be deprecated (maybe it will be easier to understand now).
close #191

Starting with DDD, I'm updating the README to get some feedback first.

@aguschin aguschin requested a review from omesser July 20, 2022 04:09
@aguschin aguschin self-assigned this Jul 20, 2022
@codecov-commenter
Copy link

codecov-commenter commented Jul 20, 2022

Codecov Report

Merging #218 (4a32fcf) into main (6a49b25) will increase coverage by 0.80%.
The diff coverage is 95.45%.

@@            Coverage Diff             @@
##             main     #218      +/-   ##
==========================================
+ Coverage   82.72%   83.52%   +0.80%     
==========================================
  Files          16       16              
  Lines        1534     1536       +2     
==========================================
+ Hits         1269     1283      +14     
+ Misses        265      253      -12     
Impacted Files Coverage Δ
gto/api.py 80.00% <85.71%> (+7.61%) ⬆️
gto/cli.py 69.53% <87.50%> (+0.24%) ⬆️
gto/registry.py 91.79% <92.30%> (ø)
gto/base.py 88.00% <100.00%> (+1.90%) ⬆️
gto/constants.py 100.00% <100.00%> (ø)
gto/tag.py 89.93% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6a49b25...4a32fcf. Read the comment docs.

README.md Outdated
from one column ("label1") to another ("label2"). This is how MLFlow and some
other Model Registries works.

To achieve this, you can use `--last-label-for-version` flag (or `--last` for
Copy link
Contributor

@dberenbaum dberenbaum Jul 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just my take since we discussed this: I don't think the long --last-label-for-version flag is needed, and the help description for the flag is probably enough to specify that --last means the last label for the version (not the last version for the label).

Also, we use latest in other places, so it might be worth considering whether last and latest have different meanings (in which case we might want to clarify each), or else use one of those words consistently.

Edit: I see latest and last do have different meanings. Still not sure about the flag name 🤔. Some ideas:

  • --collapse-versions
  • --current-labels

Not sure those are any better, and it doesn't need to block this PR, but it might be good to get more feedback on it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they have different meanings. I had an idea of introducing two flags: gto show --greatest (sort by semver) and gto show latest (sort by timestamp) with greatest as a default.

Re options: Maybe --last-stage or --latest-stage could work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about --hide-old-stages? Anyway, to be clear, it shouldn't block the PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aguschin @dberenbaum
Suggestion - How about we use a per-label prefix that will make this specific label behave like a stage?
We would be able to support behaviors in parallel:

  • free-form labels (user needs to assign and remove)
  • stages (assigning a stage overrides previous stages assignments per-artifact-version)
  • environments (assigning an environment to an artifact-version will cause this environment to be "unassigned" from other versions of that artifact)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That summary of the three different behaviors is great @omesser.

TBH I like the current approach by @aguschin where you can switch between those by choosing how to view the labels rather than having it coded into the tag names. It makes it much easier to play around with the different approaches without being locked into one of them. The only part I don't like is that naming/explaining is not that easy, but I don't think it would be easy no matter what approach we take.

Copy link
Contributor

@omesser omesser Jul 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main downside of having this determined by the "querying"/"viewing" command is that you cannot mix and match. This prevents from using GTO to manage environments if you already use it for kanban-like stages and vice versa.
To put it another way - if I build my workflow and CI to treat gto labels (or specific labels) like kanban-stages, it will never make sense to view those as something else. If I use them as deployment environments - it will never make sense to query them as anything else. querying it the wrong way is only room for error, there is not upside to that flexibility after the label mechanism was decided on assignment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@omesser, I agree with your argument, but I think flexibility is more important currently for us to see what people want. Also, an important question you raised is how this will behave in Studio when a team collaborates. A misunderstanding may appear when different users interpret the labels with different mechanics (e.g. one with free-form labels and one with stages).

On the second point, I don't think I'll be able to implement this before Studio release, and I doubt it will be handy without Studio BE/FE support cause it just won't be shown there. So let's continue to discuss this and gather some users' feedback to see how they want this to work, and then let's implement this in a separate PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we don't have much choice in the short term. In the medium/long term, separate concepts of "stages" and "environments" seem useful.

README.md Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@aguschin aguschin mentioned this pull request Jul 21, 2022
@omesser
Copy link
Contributor

omesser commented Jul 24, 2022

@aguschin @dberenbaum
I keep coming back to this when re-reviewing this PR and discussions;
The free-form-labels (the base behavior in this implementation), while most flexible, seem to be almost useless on its own to the user, and can cause a lot of confusion. I can not think of a use case for this which makes sense, only compromises with env-like or stage-like behavior 🤔
To recap:

  1. env-like - 👍 but with slight overlap with mlem (manageable?)
  2. stage-like - 👍 but users not using mlem might still like the above in some scenarios.
  3. labels - While technically a base implementation for both the above, on its own not helpful. Also confusing because of the labels in artifacts.yaml, which, as a concept is "too close" for comfort.

@aguschin
Copy link
Contributor Author

aguschin commented Jul 25, 2022

Looks like unassign takes a bit more time that I expected. I’m going to merge this without unassign today, and going to add both unassign and deprecate a version at once, since they're very similar. deprecate an artifact is a bit different, but the approach should be similar, so I'm going to add it also I think.

README.md Outdated Show resolved Hide resolved
@aguschin aguschin merged commit 1702340 into main Jul 25, 2022
@aguschin aguschin deleted the readme/actionable-labels branch July 25, 2022 12:19
Copy link
Contributor

@shortcipher3 shortcipher3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the flexibility this introduces, it is a better fit for my use case. Thanks!

@@ -64,39 +64,73 @@ several versions in a given commit, ordered by their automatic version numbers.

</details>

### Promoting
### Assing a stage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo, should be Assign

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixing this in #219

Stages can be seen as the status of your artifact, signaling readiness for usage
by downstream systems, e.g. via CI/CD or web hooks. For example: redeploy an ML
model.
Assing an actionable stage for a specific artifact version with `gto assign`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo, should be Assign

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Labels versus Stages
5 participants