Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move validation to separate package, fix misleading errors on branch creation #2859

Merged
merged 6 commits into from
Feb 6, 2022

Conversation

ortz
Copy link
Contributor

@ortz ortz commented Jan 4, 2022

Closes #1634

Disclaimer: first pull request, feel free to comment and take a strict approach.

I started working on this PR in order to fix some branch-creation error messages that are misleading, due to the way it's currently modeled (validation logic resides within the catalog package, which is irrelevant for the CLI) I moved the validation logic into a separate package so both the CLI and the API would consume it.
I've also aligned the branch API calls arguments to match the same name in the validation error messages.

After I moved the logic to a separate, I've included the same validation for the CLI branch create command and aligned the error message.

Errors:
Branch creation with bad character (%) which the url.Parse returns an error for.
➜ lakeFS git:(fix/branch-creation-error-1634) lakectl branch create lakefs://test/testing-spark% --source lakefs://test/main Invalid 'branch': parsing lakefs://test/testing-spark%: malformed lakefs uri Error executing command.

Branch creation with bad character (.) which the regular expression returns an error for.
➜ lakeFS git:(fix/branch-creation-error-1634) lakectl branch create lakefs://test/testing-spark. --source lakefs://test/main Invalid branch: not a valid ref uri Error executing command.

The same, but directly against the API (bypass the url.Parse validation):
➜ lakeFS git:(fix/branch-creation-error-1634) curl -u ***:*** -XPOST -H "Content-Type: application/json" -d '{"name":"testing-spark%", "source": "main"}' http://localhost:8000/api/v1/repositories/test/branches {"message":"argument branch: invalid value: validation error"}

@ortz ortz added the bug Something isn't working label Jan 4, 2022
@ortz ortz requested review from nopcoder and talSofer January 4, 2022 15:27
@ortz ortz self-assigned this Jan 4, 2022
@ortz ortz added the include-changelog PR description should be included in next release changelog label Jan 4, 2022
@ortz ortz force-pushed the fix/branch-creation-error-1634 branch from 789697b to b55db55 Compare January 4, 2022 16:02
Copy link
Contributor

@nopcoder nopcoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial review - it is still a draft so I didn't know if it is final (move it to non draft if it is ready for review).
A lot of files changes as part of func rename - but I missed the part where the CLI changes - just point me to what I missed.

cmd/lakectl/cmd/abuse.go Outdated Show resolved Hide resolved
Comment on lines +37 to +45
return len(u.Repository) > 0 && len(u.Ref) == 0 && u.Path == nil && validator.ReValidRepositoryID.MatchString(u.Repository)
}

func (u *URI) IsRef() bool {
return len(u.Repository) > 0 && len(u.Ref) > 0 && u.Path == nil
return len(u.Repository) > 0 && len(u.Ref) > 0 && u.Path == nil && validator.ReValidRepositoryID.MatchString(u.Repository) && validator.ReValidBranchID.MatchString(u.Ref)
}

func (u *URI) IsFullyQualified() bool {
return len(u.Repository) > 0 && len(u.Ref) > 0 && u.Path != nil
return len(u.Repository) > 0 && len(u.Ref) > 0 && u.Path != nil && validator.ReValidRepositoryID.MatchString(u.Repository) && validator.ReValidBranchID.MatchString(u.Ref)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes the behavior of these methods - it was 'IsRepository' and not it is 'IsValidRepository' .
Did you verified that all the places we currently use this would like this behavior?

Copy link
Contributor Author

@ortz ortz Jan 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nopcoder thanks for the comment, you're right that the behaviour is due to change.

Here's a list of the commands using the Is* validation functions I modified (both directly and indirectly):

### IsRepository
lakefs import
lakectl branch
lakectl branch-protect - No validation
lakectl cat-hook-output
lakectl gc set-config - No validation
lakectl gc get-config - No validation
lakectl refs-restore
lakectl refs-dump
lakectl repo create
lakectl repo create-bare
lakectl repo delete
lakectl actions runs describe
lakectl actions runs list
lakectl show
lakectl tag list

### IsRef
lakefs loadtest entry
lakectl abuse random-read
lakectl abuse random-write
lakectl abuse create-branches
lakectl branch
lakectl commit
lakectl diff
lakectl log
lakectl merge
lakectl tag

### IsFullyQualified
lakectl fs


Most of the functions already validate the logic within the API, after we "wasted" resources (i.e file checks, scanning, querying the database, etc).
There are 3 areas that doesn't do validation, therefore, they are open to problems (i.e someone changing GC configuration for an invalid repository name).
This change will block it, at least from the CLI perspective.
Other than that, I think that the Nessie CLi automation could help a lot saving time checking it manually like it did this time.

As I see it, it doesn't break anything.

Let me know your thoughts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, I was more worried about backend/pkg code that was using these function to check for a structure of URI without validation.

pkg/validator/validate.go Outdated Show resolved Hide resolved
ErrInvalid = errors.New("validation error")
ErrInvalidType = fmt.Errorf("invalid type: %w", ErrInvalid)
ErrRequiredValue = fmt.Errorf("required value: %w", ErrInvalid)
ErrInvalidValue = fmt.Errorf("invalid value: %w", ErrInvalid)
ErrPathRequiredValue = fmt.Errorf("missing path: %w", ErrRequiredValue)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error make sense at this level as a general validation package, I think for specific types the caller can control the specific error and help by providing the context.
For example if catalog call a validation function from this package, the return error will be wrapped by catalog.ErrXXX error so a caller to the catalog will identify the specific package error.
Same for specific type - like Path - the catalog will get an error from the validation package - ErrInvalid and wrap it with catalog.ErrPathXXX.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nopcoder I thought about it, the problem with this approach is that it might mask some meaningful errors, for instance, the ValidatePath function returns 3 different errors (ErrInvalidType, ErrPathRequiredValue and a custom one wrapping the ErrInvalidValue) and I can't really handle it properly from the caller function as it fails fast (first validation error returns).
I think we can do some work here and rethink of error handling (there are some packages that handle it better compared to the built-in one).
If you think we can still achieve custom errors without loosing visibility, I'd appreciate to hear more on your thoughts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the package-specific validations into the packages scope, this way, the errors, types, etc would be managed by the package.
I left the common resources (regex, general functions, etc) in the validate package

pkg/catalog/validate.go Outdated Show resolved Hide resolved
@ortz ortz force-pushed the fix/branch-creation-error-1634 branch from b55db55 to 2378ff2 Compare January 13, 2022 09:46
@ortz ortz marked this pull request as ready for review January 13, 2022 09:48
@ortz ortz requested a review from nopcoder January 13, 2022 10:19
@ortz ortz force-pushed the fix/branch-creation-error-1634 branch from 20eb7b5 to e68ddb9 Compare January 18, 2022 10:11
@CLAassistant
Copy link

CLAassistant commented Jan 20, 2022

CLA assistant check
All committers have signed the CLA.

per package validations

update deleteobjects error location
@ortz ortz force-pushed the fix/branch-creation-error-1634 branch from a62491e to 418bf0c Compare January 20, 2022 15:12
Copy link
Contributor

@nopcoder nopcoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me - minor comments to consider.

pkg/catalog/errors.go Outdated Show resolved Hide resolved
Comment on lines +83 to +86
func isControlCodeOrSpace(r rune) bool {
const space = 0x20
return r <= space
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure its you code - but there are IsControl and IsSpace in unicode package

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not mine; yes - we can use these functions from the unicode package, I wonder if that should be included in this issue/pull request or it's out of scope.

Comment on lines +37 to +45
return len(u.Repository) > 0 && len(u.Ref) == 0 && u.Path == nil && validator.ReValidRepositoryID.MatchString(u.Repository)
}

func (u *URI) IsRef() bool {
return len(u.Repository) > 0 && len(u.Ref) > 0 && u.Path == nil
return len(u.Repository) > 0 && len(u.Ref) > 0 && u.Path == nil && validator.ReValidRepositoryID.MatchString(u.Repository) && validator.ReValidBranchID.MatchString(u.Ref)
}

func (u *URI) IsFullyQualified() bool {
return len(u.Repository) > 0 && len(u.Ref) > 0 && u.Path != nil
return len(u.Repository) > 0 && len(u.Ref) > 0 && u.Path != nil && validator.ReValidRepositoryID.MatchString(u.Repository) && validator.ReValidBranchID.MatchString(u.Ref)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, I was more worried about backend/pkg code that was using these function to check for a structure of URI without validation.

@ortz ortz merged commit f7012dd into master Feb 6, 2022
@ortz ortz deleted the fix/branch-creation-error-1634 branch February 9, 2022 10:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working include-changelog PR description should be included in next release changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Misleading error on branch creation with an invalid name
4 participants