-
Notifications
You must be signed in to change notification settings - Fork 360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lakectl doctor command to run a basic diagnose on lakeFS configuration #2948
Conversation
Idan Novogroder seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many comments -- the core is really solid and useful.
I am not sure that we need more tests for this level of diagnosis beyond calling LoginWithResponse
. So I'm a bit wary of this implementation, for multiple reasons:
- We could get the same details from analyzing the
LoginWithResponse
call carefully, with more accuracy. - Many of the tests are quite specific -- they can fail in possible (although unlikely) cases:
- If a user generates access keys manually, they will not be "valid" -- but they will still be good on lakeFS.
- The URL suffix is probably important, but even that can be rewritten by proxies.
- All the tests except
LoginWithResponse
can fail to detect some of the most common mixups -- e.g. mix up my AWS and lakeFS credentials.
nits:
Note that linting fails -- your probably want to
make gofmt
# (Look at `git diff`)
git commit -a -m 'make gofmt'
Co-authored-by: itai-david <90712874+itai-david@users.noreply.github.com>
nessie/golden/lakectl_doctor_with_long_length_access_key_id.golden
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments below. The suggestions for common_helpers.go
are mostly up to you. What I really miss are unit-tests for the common_helper functions added. These functions are totally independent and are very easy to cover with unit-tests. These tests will provide a future protection, in case of changes.
Another thing I think should be fixed is the duplication in the lakectl_tests golden files, as commented below
I think the command addition to lakectl should affect the help
command output and the lakectl_help.golden
file respectively, yet I don't see this file changed. I guess it should fail nessie tests
Co-authored-by: arielshaqed <ariels@treeverse.io>
@idanovo - Thanks for addressing all comments |
…the list repo command, and add a recommendation to use the doctor command when command is dying
0b0c970
to
c0c765a
Compare
c0c765a
to
8575f73
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Still one or two places where I could not follow the logic; noted below. And let's talk f2f about how we might simplify and improve message formatting by using templates.
cmd/lakectl/cmd/doctor.go
Outdated
} | ||
accessKeyID := cfg.Values.Credentials.AccessKeyID | ||
if !IsValidAccessKeyID(accessKeyID) { | ||
fmt.Println("access_key_id value looks suspicious: accessKeyID") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fmt.Println("access_key_id value looks suspicious: accessKeyID") | |
fmt.Println("access_key_id value looks suspicious, should look like AKIJ{20 characters}Q") |
(or whatever the format should be, and using an appropriate formatter from common_helpers).
Also print the access key ID: the user might not know which access key is being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I agree about giving the user details like how access key id should look like and what validation failed on it (e.g.- the access key id you provided is too long) will be really helpful. I mean, what would do about it? He just has to provide the right credentials. WDYT?
cmd/lakectl/cmd/doctor.go
Outdated
|
||
secretAccessKey := cfg.Values.Credentials.SecretAccessKey | ||
if !IsValidSecretAccessKey(secretAccessKey) { | ||
fmt.Println("secret_access_key value looks suspicious...") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, this could be a bit more precise if IsValidSecretAccessKey
returned more information. (That information could go into a "detail" field)
cmd/lakectl/cmd/doctor.go
Outdated
} | ||
// In case we get the "not found" HTML page (the status is 200 and not 404 in this case) | ||
if resp.HTTPResponse != nil && resp.HTTPResponse.StatusCode == 200 { | ||
return ErrWrongEndpointURI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error is the only one returned without calling Fmt
. Is it special, or did we miss it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just doesn't have an indicative message linked to the response so I preferred to let the outside method validate what's wrong
cmd/lakectl/cmd/doctor.go
Outdated
Fmt("It looks like you have a problem with your `%v` file.\n", configFileName) | ||
switch { | ||
case errors.Is(err, ErrCredential): | ||
Fmt("It is possible that the `access_key_id` or `secret_access_key` you supplied are wrong.\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand. Didn't we already report an error on l.78?
This is a bit odd. I would just return an error from ListRepositoriesAndAnalyze, and format everything here. E.g. if your error is a struct
with fields Message
and Detail
, you can format them really nicely with the utils code here. (It doesn't even need to be an error, I guess.) But formatting part of the output here and part there makes it harder to read (and maybe also write) things like l.83, where we don't emit a message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, do you think also the part of going over the file and trying to get what might be the problem should be in ListRepositoriesAndAnalyze
?
} | ||
|
||
func IsValidSecretAccessKey(secretAccessKey string) bool { | ||
return IsBase64(secretAccessKey) && len(secretAccessKey) == 40 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment thread gone because the line changed...
Cool. But then please return a string explaining to the user what is suspicious about the URL: is it impossible to parse (which is definitely wrong), or does it have a weird suffix (which is probably wrong). The user can and should take different steps in each case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might still want to do this :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Only required bits are to name error types *Error
; you might choose any, none or all of the remaining comments.
cmd/lakectl/cmd/doctor.go
Outdated
} | ||
|
||
func (e *ErrCredential) Error() string { | ||
return (e.Message + "\n" + e.Details + "\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error strings should not end in a newline. In this case a more Go-ish version would say e.g.
type Detailed interface {
Details() string
}
type CredentialsError struct {
Message string
Details string
}
func (e *CredentialsError) Error() string { return e.Message }
func (e *CredentialsError) Details() string {
return e.Message + "\n" + e.Details
}
and then when printing an error you could see whether it has details by saying e.g.
if detailedErr, ok := err.(Detailed); ok {
Write(DetailedErrorTemplate, detailedErr)
} else {
Write(ErrorTemplate, err)
}
Now DetailedErrorTemplate
can use the Details
field (even though it never calls the function, go figure...).
cmd/lakectl/cmd/doctor.go
Outdated
ErrWrongEndpointURI = errors.New("wrong endpoint_uri error") | ||
ErrUnknownConfig = errors.New("unknown configuratioin error") | ||
) | ||
type ErrCredential struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error types look like CredentialsError
; error base variables look like ErrCredentials
. Yes, Go errors have many layers and their implementation reflects their history.
Sorry.
cmd/lakectl/cmd/doctor.go
Outdated
@@ -47,46 +73,45 @@ var doctorCmd = &cobra.Command{ | |||
|
|||
serverEndpoint := cfg.Values.Server.EndpointURL | |||
if !strings.HasSuffix(serverEndpoint, api.BaseURL) { | |||
Fmt("Suspicious URI format for server.endpoint_url: %v, doesn't have `%v` suffix.\n", serverEndpoint, api.BaseURL) | |||
Fmt("Suspicious URI format for server.endpoint_url: %v, doesn't end with: `%v`.\n", serverEndpoint, api.BaseURL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also handle this as a write to the template? Otherwise we again have multiple places to control what the details look like.
func (e *ErrCredential) Error() string { | ||
return (e.Message + "\n" + e.Details + "\n") | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For future reference (none of this is required in this PR, but it would be more Go-ish if you did do these): these errors work, but are a bit dirty. They break the errors wrapping chain. You could have instead
type WrappingError struct {
Message string
DetailedErr error
}
func (e *WrappingError) Error() string { return e.Message }
func (e *WrappingError) Unwrap() error { return e.DetailedErr }
func (e *WrappingError) Details() string { return c.DetailedErr.Error() }
The Unwrap
call means the errors.Is
and errors.As
will continue to work on this error.
Now copy this type into every desired type, e.g.
type WrongEndpointURIError WrappingError
var ErrWrongEndpointURI = WrongEndpointERIError{}
and you get all the parts in one place.
cmd/lakectl/cmd/doctor.go
Outdated
} | ||
|
||
var configErrorTemplate = `{{ .Message |red }} | ||
{{ .Details | red }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{{ .Details | red }} | |
{{ .Details }} |
I think the details are less important, and templates let us say that!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
This covers everything required. I still have 2 requests, but neither is blocking.
} | ||
|
||
func IsValidSecretAccessKey(secretAccessKey string) bool { | ||
return IsBase64(secretAccessKey) && len(secretAccessKey) == 40 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might still want to do this :-)
close part of #2226