Validate full model name in the deprecate model flow #427

tapadipti · 2023-10-12T12:17:00Z

Earlier, the model name was not being validated for the deprecate model action. It was getting validated for all other actions including the unassign and deregister actions (which are variations of the deprecate action).

Resolves iterative/dvc#9821 (comment)

sweep-ai · 2023-10-12T12:17:06Z

Apply Sweep Rules to your PR?

Apply: Leftover TODOs in the code should be handled.
Apply: All new business logic should have corresponding unit tests in the tests/ directory.
Apply: Any clearly inefficient or repeated code should be optimized or refactored.

shcheklein · 2023-10-12T23:18:21Z

gto/registry.py

@@ -421,6 +421,7 @@ def deprecate(
        author: Optional[str] = None,
        author_email: Optional[str] = None,
    ) -> Optional[Deprecation]:
+        self._check_args(name=name, version=None, rev=rev, deprecate_model=True)


let's just do assert_fullname_is_valid(name) if need to validate the name and nothing else

shcheklein · 2023-10-12T23:20:34Z

Thanks @tapadipti ! I left a comment + I think we would need a test for this.

codecov-commenter · 2023-10-13T03:52:19Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Files	Coverage Δ
gto/constants.py	`100.00% <ø> (ø)`
gto/registry.py	`90.61% <100.00%> (+0.03%)`	⬆️

📢 Thoughts on this report? Let us know!.

tapadipti · 2023-10-13T07:08:48Z

Thanks @shcheklein. I've addressed your comments.

For the test, I ran it locally by calling it from a __main__ block. But I'm not sure how to run it without that.

There are more validations to be added. Eg, currently it accepts // in the name, and this is not allowed in the Git ref. I'll work on fixing this as part of iterative/dvc#9821 but it's not within the scope of this PR.

shcheklein · 2023-10-13T21:48:06Z

thanks @tapadipti ! it looks good, but it seems we need to fix the test. Please see the checks above.

This reverts commit 3aa3fde.

tapadipti · 2023-10-16T05:05:20Z

it seems we need to fix the test. Please see the checks above.

The error message returned by the name validation functions is quite long. So it looks like a new line character (\n) gets added automatically - see the extra \n between Value and must below:

❌ Invalid value 'a4!'. Only letters, numbers, '_', '-', '/' are allowed. Value \nmust be of len >= 2 and must start and end with a letter or a number.\n

Is there a way to turn off this automatic addition?

I was not adding this \n to my test (coz I wasn't expecting it to get added when running), and locally I didn't see this issue.

To resolve this, I've now added the \n to the error message and to my test. It's passing now.

However, I'm not sure if some other products with dependencies on GTO could break due to this (if they're expecting a string without this \n). cc @iterative/studio

shcheklein · 2023-10-16T18:47:01Z

@tapadipti I think it's better to not alter the behavior, but rather fix test to be more lax on comparison. E.g. remove all \n before comparing the result.

…evert name check error message

tapadipti · 2023-10-18T06:31:37Z

@tapadipti do you have sense on why is it escaped? what other symbols are escaped?

@shcheklein So this is failing for Windows only - so it must be something to do with Windows convention for line endings (not the first time Windows got me stuck for so long 😞). Still not sure how to resolve this - will try to figure out. But I think it's better to just add the line break at the end of the sentence (like I tried before).

No, I couldn't figure this out - for some reason the \ in \n is being escaped, so it becomes \\n. When I try locally, a string with \\n prints just \n. And I don't know how to replicate how the complete test suite is being run - I've run individual tests and it works fine. The only character causing me issue is the line break which is getting added in the middle of a line. Without this extra line break, there's no issue coz all the other line breaks are at the same position in both the actual and expected values.

pmrowla

The GTO CLI test framework is fragile (and IMO a lot of these tests are not actually useful either) since all of the tests are hard coded to check for fixed output.

The issue that shows up here is that the underlying test for CLI commands uses click.CliRunner which will implements line wrapping at a fixed maximum content width of 80. Wrapped lines will also be split using the platform specific line ending (which is CRLF on windows and not just LF). This was a non-issue for all of the other existing tests up to now (purely by coincidence) since none of them generate a line longer than 80 characters.

Adjusting the test framework to strip all newlines and replace them with spaces is probably not the right way to go since the newlines are explicitly expected output in all of the other existing tests (where the split lines are significant).

The fix here should be to just manually handle the line wrapping at 80 characters in the new test like

def test_deprecate_artifact(repo_with_commit: str):
    expected = os.linesep.join(
        [
            (
                "❌ Invalid value 'a4!'."
                " Only letters, numbers, '_', '-', '/' are allowed. Value "
            ),
            (
                "must be of len >= 2 and must start and end with a letter or"
                " a number.\n"
            ),
        ]
    )
    _check_failing_cmd(
        "deprecate",
        ["-r", repo_with_commit, "a4!"],
        expected,
    )

(this should fix the test failures and should not require modifying _check_output_exact_match at all in this PR)

pmrowla · 2023-10-18T07:46:53Z

ideally in the long term all of the tests should be updated to work off of a list of expected lines instead of a single hard coded expected string, and then the check functions would do something like

assert result.stdout.splitlines() == expected_stdout
assert result.stderr.splitlines() == expected_stderr

which will work regardless of platform/line ending, and it will account for commands where the newlines in output are significant

Should also note that this does not account for the fact that click will also automatically add indentation depending on content and context which may end up causing similar issues in the future (so exact string comparison in tests would break due to leading tabs or spaces instead of newlines)

tapadipti · 2023-10-18T08:01:09Z

@pmrowla Yes.

Wrapped lines will also be split using the platform specific line ending (which is CRLF on windows and not just LF)

I wish this comment was here yesterday or earlier today. Figured it out the hard way today. Have fixed it by doing replace differently for Windows. Which I don't like. So I'll revert to what you've suggested - which is pretty much what I had here and here. @shcheklein Can you pls confirm you're ok with this - coz you'd earlier suggested I think it's better to not alter the behavior, but rather fix test to be more lax on comparison. E.g. remove all \n before comparing the result.

pmrowla · 2023-10-18T08:09:03Z

I think it's better to not alter the behavior, but rather fix test to be more lax on comparison. E.g. remove all \n before comparing the result.

The problem with this is that this is applied to the tests for every GTO command and it strips the line breaks in places where the test behavior is actually significant, i.e. when the expected output is explicitly

To push the changes upstream, run:
    git push origin nn2@v0.0.1

Stripping/replacing all the line breaks means the test would now pass if the CLI command outputs

To push the changes upstream, run:     git push origin nn2@v0.0.1

(which is why I suggested that the long term solution should be that CLI output tests in GTO compare lists of expected lines and not a single expected string)

tapadipti · 2023-10-18T08:21:08Z

@pmrowla I agree with you - it's better not to strip the line breaks. I've made a new commit reverting to an earlier state where I added an explicit line break in the error message to handle the 80 character limit. The suggestion you provided in your comment works for the name a4! but if we change the name to a string of different length at any time, it won't work. So I added a line break after allowed., so that the line does not become >80 chars long in most cases and this issue does not arise.

@shcheklein pls confirm if this is ok.

pmrowla · 2023-10-18T12:45:47Z

The suggestion you provided in #427 (review) works for the name a4! but if we change the name to a string of different length at any time, it won't work.

This is true, but we would hit the issue even with the manual line break if we changed the test to use a longer name.

I would prefer to not have the manual line breaks in the exception message. The line wrapping is handled by the UI rendering library (click for GTO). The real issue here is with the test suite itself - the tests should not be affected by line wrapping in the UI library)

IMO these kinds of fragile UI tests for CLI output are not actually useful at all. It would be better to test that the expected exception is raised by the underlying call.

Basically, GTO tests should only be testing GTO functionality. But right now, the CLI tests are causing issues because in addition to testing GTO functionality they are also testing how click line wraps a given exception message. The click formatting behavior is not actually relevant to whether or not GTO works properly (and click's line wrapping is tested in click itself).

…d value

tapadipti · 2023-10-19T07:11:53Z

@pmrowla I tried removing the manual line breaks in the exception message, as you suggested. But it's not working - it looks like the space between Value and must is getting stripped in Windows. See the assertion errors in the test reports for the latest commits.

Summary of error:

When I use your suggested code for the test (in this commit), I get the following mismatch between the actual and expected values.

"\u274c Invalid value 'a4!'. Only letters, numbers, '_', '-', '/' are allowed. Value\\nmust be of len >= 2 and must start and end with a letter or a number.\\n"
"\u274c Invalid value 'a4!'. Only letters, numbers, '_', '-', '/' are allowed. Value \\r\\nmust be of len >= 2 and must start and end with a letter or a number.\\n"

If I use a \n in the test (see this commit), I get the following mismatch:

"\u274c Invalid value 'a4!'. Only letters, numbers, '_', '-', '/' are allowed. Value\\nmust be of len >= 2 and must start and end with a letter or a number.\\n"
"\u274c Invalid value 'a4!'. Only letters, numbers, '_', '-', '/' are allowed. Value \\nmust be of len >= 2 and must start and end with a letter or a number.\\n"

From what I saw yesterday, this (stripping) does not happen in ubuntu and macos, so I can't just remove the space from the expected value (else the test will fail in ubuntu and macos). I can treat Windows differently (similar to how I did here), but imo that's a terrible idea. So my only option now is to revert to adding the manual line break in the exception message (or to improve how the tests are written, which I don't want to make part/dependency of this PR).

dberenbaum · 2023-10-19T12:40:53Z

Hope this helps and isn't further derailing here, but can you strip only the \r clrf windows-style line endings from the actual output? I don't think we are ever going to intentionally add those into output.

Edit: I see there are also issues related to spacing, so I don't think this alone will help much. Disregard.

pmrowla · 2023-10-19T12:57:34Z

We should just remove this CLI test entirely and test the underlying behavior in test_api instead like

with pytest.raises(ValidationError):
    api.deregister(...)  # call deregister with an invalid name

tapadipti · 2023-10-20T05:53:03Z

For now, I'm removing the test. I've created an issue to improve tests, and we can add a test for the deprecate model action with invalid model name there. cc @pmrowla @shcheklein @dberenbaum

tapadipti · 2023-10-25T14:46:49Z

@pmrowla Can you ptal at this comment and re-review this PR. I can't merge the PR without you accepting the change.

tapadipti added 3 commits October 12, 2023 17:31

check args during deprecate model flow

8ef6100

Revert unnecessary changes

11c920d

Revert unnecessary changes

9e3c3f3

tapadipti requested review from pmrowla and shcheklein October 12, 2023 12:17

tapadipti temporarily deployed to pypi October 12, 2023 12:21 — with GitHub Actions Inactive

tapadipti mentioned this pull request Oct 12, 2023

artifacts: Review naming restrictions iterative/dvc#9821

Closed

shcheklein reviewed Oct 12, 2023

View reviewed changes

shcheklein added the bug Something isn't working label Oct 12, 2023

assert_fullname_is_valid instead of check_args

1ddc807

tapadipti changed the title ~~Check args in the deprecate model flow~~ Validate full model name in the deprecate model flow Oct 13, 2023

tapadipti temporarily deployed to pypi October 13, 2023 03:55 — with GitHub Actions Inactive

Add a missing space and add test

62b9988

tapadipti requested a review from shcheklein October 13, 2023 07:08

shcheklein approved these changes Oct 13, 2023

View reviewed changes

tapadipti and others added 4 commits October 16, 2023 10:08

Temporary change to debug test failure

3aa3fde

Revert "Temporary change to debug test failure"

224e1fe

This reverts commit 3aa3fde.

Add line break in error message

673a657

Added missing change

3fb7676

tapadipti temporarily deployed to pypi October 16, 2023 04:57 — with GitHub Actions Inactive

amritghimire approved these changes Oct 16, 2023

View reviewed changes

tapadipti added 2 commits October 17, 2023 11:57

Remove line breaks before comparing expected and actual values. And r…

0834b16

…evert name check error message

Lint fix

fbc4d2c

tapadipti force-pushed the check-args branch from 1d29b59 to 4f3ec19 Compare October 18, 2023 07:01

tapadipti added 2 commits October 18, 2023 13:14

Try by replacing \r\n

e0a00fc

Fix for last commit

10159c6

pmrowla suggested changes Oct 18, 2023

View reviewed changes

tapadipti added 2 commits October 18, 2023 13:35

Treat windows differently

7196ef8

Lint fix

ddf1dcb

Lint fix

3e0a93a

tapadipti temporarily deployed to pypi October 18, 2023 08:08 — with GitHub Actions Inactive

Revert to adding an explicit line break in error msg

4ba2633

tapadipti temporarily deployed to pypi October 18, 2023 08:19 — with GitHub Actions Inactive

tapadipti requested a review from pmrowla October 18, 2023 08:35

tapadipti added 2 commits October 19, 2023 12:04

Revert linebreak addition in err msg and add linebreak to the expecte…

371e120

…d value

Try Peter's way

ab1b866

Remove test for deeprecate

13fa103

tapadipti temporarily deployed to pypi October 20, 2023 05:49 — with GitHub Actions Inactive

tapadipti mentioned this pull request Oct 20, 2023

Improve tests #429

Open

pmrowla approved these changes Oct 26, 2023

View reviewed changes

tapadipti merged commit 96c6d9d into main Oct 26, 2023
17 checks passed

tapadipti deleted the check-args branch October 26, 2023 02:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Validate full model name in the deprecate model flow #427

Validate full model name in the deprecate model flow #427

tapadipti commented Oct 12, 2023 •

edited

Loading

sweep-ai bot commented Oct 12, 2023

shcheklein Oct 12, 2023

tapadipti Oct 13, 2023

shcheklein commented Oct 12, 2023

codecov-commenter commented Oct 13, 2023 •

edited

Loading

tapadipti commented Oct 13, 2023

shcheklein commented Oct 13, 2023

tapadipti commented Oct 16, 2023

shcheklein commented Oct 16, 2023

tapadipti commented Oct 18, 2023 •

edited

Loading

pmrowla left a comment •

edited

Loading

pmrowla commented Oct 18, 2023 •

edited

Loading

tapadipti commented Oct 18, 2023

pmrowla commented Oct 18, 2023 •

edited

Loading

tapadipti commented Oct 18, 2023

pmrowla commented Oct 18, 2023

tapadipti commented Oct 19, 2023

dberenbaum commented Oct 19, 2023 •

edited

Loading

pmrowla commented Oct 19, 2023

tapadipti commented Oct 20, 2023

tapadipti commented Oct 25, 2023

Validate full model name in the deprecate model flow #427

Validate full model name in the deprecate model flow #427

Conversation

tapadipti commented Oct 12, 2023 • edited Loading

sweep-ai bot commented Oct 12, 2023

Apply Sweep Rules to your PR?

shcheklein Oct 12, 2023

Choose a reason for hiding this comment

tapadipti Oct 13, 2023

Choose a reason for hiding this comment

shcheklein commented Oct 12, 2023

codecov-commenter commented Oct 13, 2023 • edited Loading

Codecov Report

tapadipti commented Oct 13, 2023

shcheklein commented Oct 13, 2023

tapadipti commented Oct 16, 2023

shcheklein commented Oct 16, 2023

tapadipti commented Oct 18, 2023 • edited Loading

pmrowla left a comment • edited Loading

Choose a reason for hiding this comment

pmrowla commented Oct 18, 2023 • edited Loading

tapadipti commented Oct 18, 2023

pmrowla commented Oct 18, 2023 • edited Loading

tapadipti commented Oct 18, 2023

pmrowla commented Oct 18, 2023

tapadipti commented Oct 19, 2023

dberenbaum commented Oct 19, 2023 • edited Loading

pmrowla commented Oct 19, 2023

tapadipti commented Oct 20, 2023

tapadipti commented Oct 25, 2023

tapadipti commented Oct 12, 2023 •

edited

Loading

codecov-commenter commented Oct 13, 2023 •

edited

Loading

tapadipti commented Oct 18, 2023 •

edited

Loading

pmrowla left a comment •

edited

Loading

pmrowla commented Oct 18, 2023 •

edited

Loading

pmrowla commented Oct 18, 2023 •

edited

Loading

dberenbaum commented Oct 19, 2023 •

edited

Loading