Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Context threading: more wiring #14797

Merged
merged 2 commits into from
Dec 23, 2022

Conversation

justinsb
Copy link
Member

We're aiming to use this for testing immediately and better
logging/tracing in future, but to make the changes manageable breaking
them into a smaller series that don't directly achieve much.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 17, 2022
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 17, 2022
@justinsb
Copy link
Member Author

/retest

@@ -268,7 +268,9 @@ func createKeypair(out io.Writer, options *CreateKeypairOptions, name string, ke
}

func completeKeyset(cluster *kopsapi.Cluster, clientSet simple.Clientset, args []string, filter func(name string, keyset *fi.Keyset) bool) (keyset *fi.Keyset, keyStore fi.CAStore, completions []string, directive cobra.ShellCompDirective) {
keyStore, err := clientSet.KeyStore(cluster)
ctx := context.TODO()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this qualifies for TODO. This is a new interactive context.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I've been using TODO at the "frontier" to limit how far we have to require context.Context in the callchains.

However, in most of these we do have the cobra Command available, which includes the context.Context, so I added another commit to add those to this PR.

@@ -59,7 +60,7 @@ func NewVFSCAStore(cluster *kops.Cluster, basedir vfs.Path) *VFSCAStore {
}

// NewVFSSSHCredentialStore creates a SSHCredentialStore backed by VFS
func NewVFSSSHCredentialStore(cluster *kops.Cluster, basedir vfs.Path) SSHCredentialStore {
func NewVFSSSHCredentialStore(ctx context.Context, cluster *kops.Cluster, basedir vfs.Path) SSHCredentialStore {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this changing to take a Context? We're adding a Context to all methods, so why one when creating the store?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right - we don't need it - removed!

@@ -53,10 +53,10 @@ type Clientset interface {
SecretStore(cluster *kops.Cluster) (fi.SecretStore, error)

// KeyStore builds the key store for the specified cluster
KeyStore(cluster *kops.Cluster) (fi.CAStore, error)
KeyStore(ctx context.Context, cluster *kops.Cluster) (fi.CAStore, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't take a Context

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow. I think all our methods are going to end up taking a context. And this one directly needs a context for VFSClientset::KeyStore -> pkiPath -> VFSContext::BuildVfsPath

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The methods on a CAStore might take a context, but it doesn't make much sense for the CAStore itself to hold a context.

I'm thinking a lot of the problem has to do with vfs.Path implementing io.WriterTo. Perhaps one should call a method taking a context in order to get a io.WriterTo.

Copy link
Member Author

@justinsb justinsb Dec 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a valid concern, and I've been wondering if it's possible to change where we need the context. The reason it's baked into the VFSContext right now is that we are caching a GCS client, which includes the http.Client, which is what we're going to intercept/override in our tests. Arguably the GCS Client and the http.Client should be built on every method call, but I think the GCS client is not just using the standard http.Transport. So maybe we should be caching the GCS client in the context ... etc etc

But: I don't think we should assume that a method that takes a context.Context is going to embed the context.Context. If a context arg means anything at all, it means that the method is non-trivial. Anything that logs, anything that emits an otel span, or any method that calls a method that logs is going to take a context parameter.

So, while I think we can absolutely look at where we store the GCS client and friends, overall we're going to have a context on basically every exported function, and probably most internal functions also.

I suspect we'll do this for a few years until we decide that maybe goroutine-local-storage is a good idea, and then we'll hopefully just have a bulk refactoring tool to undo all the context args ;-)

Copy link
Member

@johngmyers johngmyers Dec 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The design pattern I'm applying is that contexts don't belong in clientsets. One should not mock out the client through the context; one should mock out the client through the clientset.

The VFSContext (which is a clientset) should not be a global variable; it should be passed in explicitly. And vfs.Path is not just a path but something which currently inherently needs to embed a context.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that makes sense and I will look at doing that. I think we need to pass it into every callsite to BuildVfsPath (which can then pass a context to buildGCSPath). I think the GCS vfs.Path will end up embedding a GCS client, which I think is equivalent.

But, in order to get there, we need to wire up the context into a lot of methods. I can either do that piecemeal in "waves" - one idea or set of functions at a time - with the hope that these smaller PRs can be more easily reviewed, or I can do a "big bang" where we change every callsite. I feel that changing every callsite at once will be almost impossible to review though...

This particular function calls VFSClientset::KeyStore -> pkiPath -> VFSContext::BuildVfsPath anyway, so we'd need the context argument even if I did the "BuildVfsPath" wave immediately. (And that "wave" looks like it'll be very big)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(The PR where I tried doing some of this is #14777 and it rapidly becomes all-encompassing...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should hang a VFSContext off of a context.Context and I don't think we should be adding context.Context parameters to enable hanging a VFSContext off of a context.Context.

It would be reasonable for a VFSClientset to have a VFSContext.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I changed this PR to only wire up context.Context into more functions.

Once we have context.Context available in enough callsites, we can try it the more logically coherent way.


// SSHCredentialStore builds the SSHCredential store for the specified cluster
SSHCredentialStore(cluster *kops.Cluster) (fi.SSHCredentialStore, error)
SSHCredentialStore(ctx context.Context, cluster *kops.Cluster) (fi.SSHCredentialStore, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't take a context

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 18, 2022
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 19, 2022
@@ -76,7 +76,7 @@ func NewCmdCreate(f *util.Factory, out io.Writer) *cobra.Command {
Example: createExample,
Args: cobra.NoArgs,
RunE: func(cmd *cobra.Command, args []string) error {
return RunCreate(context.TODO(), f, out, options)
return RunCreate(cmd.Context(), f, out, options)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this return nil as we don't run with cobraCommand.ExecuteContext?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh I hope it returns context.TODO() or context.Background() but I'll check - thanks for pointing it out!

Copy link
Member Author

@justinsb justinsb Dec 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it actually defaults to context.Background, but I switched us to call ExecuteContext so it's clearer (which might also allow us to reply to control-C more promptly/correctly in future :-) )

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 21, 2022
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 21, 2022
@justinsb
Copy link
Member Author

On the more contentious issue, @johngmyers has defined a nice distinction between builder methods (broadly: no side effects, no network calls, not expected to fail for good input) vs request methods (make network calls, do things, could fail unexpectedly if e.g. the network flakes). The argument is that builder methods should not take context.Context.

It's potentially requires some deeper refactoring, but I think we/I can do it. However, I want to be sure that we don't end up revisiting these methods, so I think we need some clear design principles - ideally that we could enforce with a linter.

I'm thinking something along these lines for the rules:

  • "builder" methods would not take a context, and would be named NewFoo. (NewVFSPath, NewSSHCredentialStore etc). The prefix could also be "Build" or something else.
  • "request" methods would always take a context.

An alternative rule is that "every method takes a context", but builders should be builders (no network requests etc). We can't enforce that with a linter, but we don't enforce it today.

The problem with the stricter rules is that logging and tracing are also tied into the context (or will be in the next year or two). I think tracing is OK, but if a builder wants to log that is awkward... We could suggest passing in the logger there (effectively, a "limited context")

@olemarkus
Copy link
Member

I definitely think we should use klog from context. But I don't think logging from deterministic functions makes a whole lot of sense.

It's not uncommon to create loggers that are passed into builders though, so that may be an option. Not a huge fan of this pattern myself.

@johngmyers
Copy link
Member

To quibble with the rules, in k8sClient.CoreV1().Nodes() both CoreV1() and .Nodes() are builder methods. I don't think a prefix of "New" or "Build" is desirable in that case of a receiver on a client that configures a different client.

@justinsb
Copy link
Member Author

Well this is the place to iterate on the "rules" (and we don't even need to have rules, and they could be guidelines).

Another candidate I thought of: Builder's don't get a prefix. So Builders should start with Nouns, Requests should start with Verbs.

I don't think we can lint that reliably in general, but we might be able to catch the obvious ones. It also gives us an answer to whether methods should include the Get prefix as they often do in Java; they do if they are a Request (GetInstance), they don't if they are a Builder (i.e. don't do GetSSHCredentialStore, do SSHCredentialStore)

We're aiming to use this for testing immediately and better
logging/tracing in future, but to make the changes manageable breaking
them into a smaller series that don't directly achieve much.
Removes a lot of context.TODO() calls.
@justinsb
Copy link
Member Author

So I tried the "no prefix" option and it seems pretty reasonable (and has the advantage of being the smallest change ... )

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 23, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: johngmyers

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 23, 2022
@k8s-ci-robot k8s-ci-robot merged commit c01fdbb into kubernetes:master Dec 23, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.26 milestone Dec 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/api area/nodeup area/rolling-update cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants