lakectl doctor command to run a basic diagnose on lakeFS configuration #2948

idanovo · 2022-02-14T13:58:45Z

close part of #2226

…guration.

CLAassistant · 2022-02-14T13:58:51Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ idanovo
❌ Idan Novogroder

Idan Novogroder seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

arielshaqed

Many comments -- the core is really solid and useful.

I am not sure that we need more tests for this level of diagnosis beyond calling LoginWithResponse. So I'm a bit wary of this implementation, for multiple reasons:

We could get the same details from analyzing the LoginWithResponse call carefully, with more accuracy.
Many of the tests are quite specific -- they can fail in possible (although unlikely) cases:
- If a user generates access keys manually, they will not be "valid" -- but they will still be good on lakeFS.
- The URL suffix is probably important, but even that can be rewritten by proxies.
All the tests except LoginWithResponse can fail to detect some of the most common mixups -- e.g. mix up my AWS and lakeFS credentials.

nits:

Note that linting fails -- your probably want to

make gofmt
# (Look at `git diff`)
git commit -a -m 'make gofmt'

cmd/lakectl/cmd/common_helpers.go

cmd/lakectl/cmd/doctor.go

cmd/lakectl/cmd/common_helpers.go

Co-authored-by: itai-david <90712874+itai-david@users.noreply.github.com>

cmd/lakectl/cmd/common_helpers.go

nessie/doctor_test.go

nessie/golden/lakectl_doctor_long_secret_access_key.golden

nessie/golden/lakectl_doctor_short_secret_access_key.golden

nessie/golden/lakectl_doctor_with_long_length_access_key_id.golden

nessie/lakectl_util.go

cmd/lakectl/cmd/doctor.go

itaidavid

I left some comments below. The suggestions for common_helpers.go are mostly up to you. What I really miss are unit-tests for the common_helper functions added. These functions are totally independent and are very easy to cover with unit-tests. These tests will provide a future protection, in case of changes.
Another thing I think should be fixed is the duplication in the lakectl_tests golden files, as commented below
I think the command addition to lakectl should affect the help command output and the lakectl_help.golden file respectively, yet I don't see this file changed. I guess it should fail nessie tests

Co-authored-by: arielshaqed <ariels@treeverse.io>

itaidavid · 2022-02-15T18:25:57Z

@idanovo - Thanks for addressing all comments
LGTM

…the list repo command, and add a recommendation to use the doctor command when command is dying

cmd/lakectl/cmd/doctor.go

arielshaqed

Thanks!

Still one or two places where I could not follow the logic; noted below. And let's talk f2f about how we might simplify and improve message formatting by using templates.

cmd/lakectl/cmd/doctor.go

arielshaqed · 2022-02-16T09:33:08Z

cmd/lakectl/cmd/doctor.go

+		}
+		accessKeyID := cfg.Values.Credentials.AccessKeyID
+		if !IsValidAccessKeyID(accessKeyID) {
+			fmt.Println("access_key_id value looks suspicious: accessKeyID")


Suggested change

fmt.Println("access_key_id value looks suspicious: accessKeyID")

fmt.Println("access_key_id value looks suspicious, should look like AKIJ{20 characters}Q")

(or whatever the format should be, and using an appropriate formatter from common_helpers).

Also print the access key ID: the user might not know which access key is being used.

I'm not sure I agree about giving the user details like how access key id should look like and what validation failed on it (e.g.- the access key id you provided is too long) will be really helpful. I mean, what would do about it? He just has to provide the right credentials. WDYT?

arielshaqed · 2022-02-16T09:35:05Z

cmd/lakectl/cmd/doctor.go

+
+		secretAccessKey := cfg.Values.Credentials.SecretAccessKey
+		if !IsValidSecretAccessKey(secretAccessKey) {
+			fmt.Println("secret_access_key value looks suspicious...")


Again, this could be a bit more precise if IsValidSecretAccessKey returned more information. (That information could go into a "detail" field)

cmd/lakectl/cmd/doctor.go

arielshaqed · 2022-02-17T08:41:42Z

cmd/lakectl/cmd/doctor.go

+		}
+		// In case we get the "not found" HTML page (the status is 200 and not 404 in this case)
+		if resp.HTTPResponse != nil && resp.HTTPResponse.StatusCode == 200 {
+			return ErrWrongEndpointURI


This error is the only one returned without calling Fmt. Is it special, or did we miss it here?

It just doesn't have an indicative message linked to the response so I preferred to let the outside method validate what's wrong

cmd/lakectl/cmd/doctor.go

arielshaqed · 2022-02-17T08:47:42Z

cmd/lakectl/cmd/doctor.go

+		Fmt("It looks like you have a problem with your `%v` file.\n", configFileName)
+		switch {
+		case errors.Is(err, ErrCredential):
+			Fmt("It is possible that the `access_key_id` or `secret_access_key` you supplied are wrong.\n")


I don't understand. Didn't we already report an error on l.78?

This is a bit odd. I would just return an error from ListRepositoriesAndAnalyze, and format everything here. E.g. if your error is a struct with fields Message and Detail, you can format them really nicely with the utils code here. (It doesn't even need to be an error, I guess.) But formatting part of the output here and part there makes it harder to read (and maybe also write) things like l.83, where we don't emit a message.

Agree, do you think also the part of going over the file and trying to get what might be the problem should be in ListRepositoriesAndAnalyze?

cmd/lakectl/cmd/doctor.go

arielshaqed · 2022-02-17T08:52:41Z

cmd/lakectl/cmd/common_helpers.go

+}
+
+func IsValidSecretAccessKey(secretAccessKey string) bool {
+	return IsBase64(secretAccessKey) && len(secretAccessKey) == 40


Comment thread gone because the line changed...

Cool. But then please return a string explaining to the user what is suspicious about the URL: is it impossible to parse (which is definitely wrong), or does it have a weird suffix (which is probably wrong). The user can and should take different steps in each case.

We might still want to do this :-)

arielshaqed

Thanks!

Only required bits are to name error types *Error; you might choose any, none or all of the remaining comments.

arielshaqed · 2022-02-20T08:06:58Z

cmd/lakectl/cmd/doctor.go

+}
+
+func (e *ErrCredential) Error() string {
+	return (e.Message + "\n" + e.Details + "\n")


Error strings should not end in a newline. In this case a more Go-ish version would say e.g.

type Detailed interface { Details() string } type CredentialsError struct { Message string Details string } func (e *CredentialsError) Error() string { return e.Message } func (e *CredentialsError) Details() string { return e.Message + "\n" + e.Details }

and then when printing an error you could see whether it has details by saying e.g.

if detailedErr, ok := err.(Detailed); ok { Write(DetailedErrorTemplate, detailedErr) } else { Write(ErrorTemplate, err) }

Now DetailedErrorTemplate can use the Details field (even though it never calls the function, go figure...).

arielshaqed · 2022-02-20T08:12:59Z

cmd/lakectl/cmd/doctor.go

-	ErrWrongEndpointURI = errors.New("wrong endpoint_uri error")
-	ErrUnknownConfig    = errors.New("unknown configuratioin error")
-)
+type ErrCredential struct {


Error types look like CredentialsError; error base variables look like ErrCredentials. Yes, Go errors have many layers and their implementation reflects their history.
Sorry.

arielshaqed · 2022-02-20T08:14:08Z

cmd/lakectl/cmd/doctor.go

@@ -47,46 +73,45 @@ var doctorCmd = &cobra.Command{

 		serverEndpoint := cfg.Values.Server.EndpointURL
 		if !strings.HasSuffix(serverEndpoint, api.BaseURL) {
-			Fmt("Suspicious URI format for server.endpoint_url: %v, doesn't have `%v` suffix.\n", serverEndpoint, api.BaseURL)
+			Fmt("Suspicious URI format for server.endpoint_url: %v, doesn't end with: `%v`.\n", serverEndpoint, api.BaseURL)


Can we also handle this as a write to the template? Otherwise we again have multiple places to control what the details look like.

cmd/lakectl/cmd/doctor.go

arielshaqed · 2022-02-20T08:19:34Z

cmd/lakectl/cmd/doctor.go

+func (e *ErrCredential) Error() string {
+	return (e.Message + "\n" + e.Details + "\n")
+}
+


For future reference (none of this is required in this PR, but it would be more Go-ish if you did do these): these errors work, but are a bit dirty. They break the errors wrapping chain. You could have instead

type WrappingError struct { Message string DetailedErr error } func (e *WrappingError) Error() string { return e.Message } func (e *WrappingError) Unwrap() error { return e.DetailedErr } func (e *WrappingError) Details() string { return c.DetailedErr.Error() }

The Unwrap call means the errors.Is and errors.As will continue to work on this error.

Now copy this type into every desired type, e.g.

type WrongEndpointURIError WrappingError var ErrWrongEndpointURI = WrongEndpointERIError{}

and you get all the parts in one place.

arielshaqed · 2022-02-20T08:22:33Z

cmd/lakectl/cmd/doctor.go

+}
+
+var configErrorTemplate = `{{ .Message |red }}
+{{  .Details | red }}


Suggested change

{{ .Details | red }}

{{ .Details }}

I think the details are less important, and templates let us say that!

arielshaqed

Thanks!

This covers everything required. I still have 2 requests, but neither is blocking.

docs/reference/commands.md

arielshaqed · 2022-02-21T12:36:24Z

cmd/lakectl/cmd/common_helpers.go

+}
+
+func IsValidSecretAccessKey(secretAccessKey string) bool {
+	return IsBase64(secretAccessKey) && len(secretAccessKey) == 40


We might still want to do this :-)

Idan Novogroder added 3 commits February 14, 2022 15:29

New lakectl command 'doctor' to run a basic diagnosis of lakeFS confi…

29522b5

…guration.

Fixed command outputs

2528ff0

Fixed command outputs

3e69a10

idanovo added area/lakectl Issues related to lakeFS' command line interface (lakectl) include-changelog PR description should be included in next release changelog labels Feb 14, 2022

Idan Novogroder added 5 commits February 14, 2022 16:21

Fixed linter comments

531371c

Fixed linter comments

9c4e5de

Fixed a bug of ignoring the '--config' flag

91b1b89

Added tests for bad config files

c2fb59b

Fixed tests to pass on runners

7e77e9e

arielshaqed requested changes Feb 14, 2022

View reviewed changes

itaidavid reviewed Feb 14, 2022

View reviewed changes

cmd/lakectl/cmd/common_helpers.go Outdated Show resolved Hide resolved

changed IsValidAccessKeyID to be more readable

23d3eed

Co-authored-by: itai-david <90712874+itai-david@users.noreply.github.com>

itaidavid reviewed Feb 14, 2022

View reviewed changes

cmd/lakectl/cmd/common_helpers.go Outdated Show resolved Hide resolved

Fixed PR review

8cf0aa6

itaidavid reviewed Feb 14, 2022

View reviewed changes

cmd/lakectl/cmd/common_helpers.go Outdated Show resolved Hide resolved

Changed IsContainOnlyUpperLettersOrNumber func to work with regex

77f172b

idanovo changed the title ~~Basic lakectl doctor command to run a basic diagnose on lakeFS configuration~~ lakectl doctor command to run a basic diagnose on lakeFS configuration Feb 14, 2022

Fixed a test to pass on runners

6c80ac2