Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

provider: added k3d provider and node lifecycle handlers #441

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

harshanarayana
Copy link
Contributor

@harshanarayana harshanarayana commented Jul 13, 2024

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR enables a new E2EClusterProvider for k3d based infra.

This change includes the following changes and features.
  1. Added a new Interface type E2EClusterProviderWithImageLoaderAndNodeLifecycle which can be used to setup providers that extend the cluster lifecycle function around the nodes.
  2. Enabled a k3d based provider with support for Node lifecycle management.
  3. Existing Image loader related function and interfaces were augmented with args ...string to be able to provide additional arguments in case if the image load handlers need some of the additional config.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

I have also extended the E2EClusterProviderWithImageLoader 's Image load related methods to take an additional arg ...string value to account for additional options that one can pass while performing image load operation alone.

This comes in handy with k3d where it provides a few different mode of loading images and cleanup workflows.

This change also doesn't break any existing API contract extended by the envfuncs.

Does this PR introduce a user-facing change?

enabled a new `E2eClusterProvider` for `k3d` that can be used as a new cluster provider for running e2e tests.

Additional documentation e.g., Usage docs, etc.:

NA

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Jul 13, 2024
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 13, 2024
@harshanarayana harshanarayana force-pushed the feature/k3d-provider-support branch 2 times, most recently from 0c3a6e3 to d25b6ab Compare July 13, 2024 14:38
@harshanarayana
Copy link
Contributor Author

/cc @cpanato

@k8s-ci-robot k8s-ci-robot requested a review from cpanato July 13, 2024 14:41
@harshanarayana
Copy link
Contributor Author

go install of k3d is failing due to an issue similar to k3d-io/k3d#1324

@harshanarayana
Copy link
Contributor Author

/hold

Holding this under the k3d install issue is resolved.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 14, 2024
@harshanarayana harshanarayana force-pushed the feature/k3d-provider-support branch 2 times, most recently from 3b76b81 to a70ae18 Compare July 16, 2024 05:50
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 16, 2024
Comment on lines +46 to +51
p := commandRunner.NewProc(installCommand)
p.SetStdout(&stdout)
p.SetStderr(&stderr)
result := p.Run()
if result.Err() != nil {
return "", fmt.Errorf("failed to install %s: %s: \n %s", pPath, result.Result(), stderr.String())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding these to be able to capture and log the install failures better during the provider install.

@harshanarayana
Copy link
Contributor Author

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 16, 2024
@harshanarayana harshanarayana force-pushed the feature/k3d-provider-support branch from a70ae18 to afe220a Compare July 16, 2024 11:21
@harshanarayana harshanarayana changed the title provider: enable k3d based cluster provider provider: added k3d provider and node lifecycle handlers Jul 16, 2024
@harshanarayana harshanarayana force-pushed the feature/k3d-provider-support branch 2 times, most recently from 46696b8 to 6a755b5 Compare July 16, 2024 11:24
Copy link
Member

@cpanato cpanato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
thanks

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 16, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cpanato, harshanarayana

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [cpanato,harshanarayana]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@harshanarayana
Copy link
Contributor Author

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 16, 2024
@harshanarayana
Copy link
Contributor Author

Holding this until @cpanato @vladimirvivien @ShwethaKumbla also takes look

@harshanarayana
Copy link
Contributor Author

/retest

@harshanarayana harshanarayana force-pushed the feature/k3d-provider-support branch from 6a755b5 to 4c0f410 Compare July 16, 2024 15:51
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 16, 2024
Copy link
Member

@cpanato cpanato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 16, 2024
Copy link
Contributor

@vladimirvivien vladimirvivien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@harshanarayana this a great (massive) PR.
Thanks for doing this and I am sure the community will appreciate the k3d integration.
I left some comments and we can work on getting this to the finish line.

examples/k3d/main_test.go Show resolved Hide resolved
pkg/stepfuncs/nodelifecycle_funcs.go Outdated Show resolved Hide resolved
support/k3d/k3d.go Outdated Show resolved Hide resolved
support/kind/kind.go Show resolved Hide resolved
support/types.go Outdated
// test out how the Remove operation impacts your workflow.
// Or you want to simulate a case where one or more node of your cluster is down and you want to see how
// your application reacts to such failure events.
type E2EClusterProviderWithImageLoaderAndNodeLifecycle interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof, that is a long name.
Let's reconsider making it simpler or break into multiple (embedded) interfaces.

  • E2EClusterProvier
  • E2EClusterProviderWithImgLoader
    *E2EClusterProviderWithLifeCycle

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack. using E2EClusterProviderWithLifeCycle instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vladimirvivien Do you want to check the image loader related interface name too ?

@harshanarayana harshanarayana force-pushed the feature/k3d-provider-support branch from 4c0f410 to 3a6934c Compare July 22, 2024 05:19
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 22, 2024
@k8s-ci-robot
Copy link
Contributor

New changes are detected. LGTM label has been removed.

@harshanarayana
Copy link
Contributor Author

@vladimirvivien PTAL. I has pushed the requested changes.

@harshanarayana harshanarayana force-pushed the feature/k3d-provider-support branch from 3a6934c to 1597cf5 Compare July 22, 2024 05:25
Copy link
Member

@ShwethaKumbla ShwethaKumbla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@vladimirvivien vladimirvivien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making progress @harshanarayana
Added some more comments.

}
log.V(4).InfoS("Stopping node in k3d cluster", "command", cmd)
p, stdout, stderr := utils.FetchSeperatedCommandOutput(cmd)
if p.Err() != nil || (p.Exited() && p.ExitCode() != 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh boy, you are pushing that package (gexe) to it's limit. Love it 😉


// PerformNodeLifecycleOperation performs a node operation on a cluster. These operations can range from Add/Remove/Start/Stop.
// This helper is re-used in both node lifecycle handler used as types.StepFunc or env.Func
func PerformNodeLifecycleOperation(ctx context.Context, action support.NodeOperation, node *support.Node, args ...string) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this function only applies when using K3d or do you think orther providers will use that.
If only k3d, maybe it belongs in third_party package path if that is the case.

Copy link
Contributor Author

@harshanarayana harshanarayana Aug 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other providers like minikube can support this.. And kwok technically can be simulated to enable node add and remove workflows. So it can work with other providers as well. We can do that even for kind. But we will have to use adhoc docker command to simulate power on and off kind of workflows

support/k3d/k3d.go Outdated Show resolved Hide resolved
This change includes the following changes and features.

1. Added a new Interface type `E2EClusterProviderWithLifeCycle`
which can be used to setup providers that extend the cluster lifecycle function around the nodes.
2. Enabled a `k3d` based provider with support for Node lifecycle management.
3. Existing Image loader related function and interfaces were augmented with `args ...string`
to be able to provide additional arguments in case if the image load
handlers need some of the additional config.
@harshanarayana harshanarayana force-pushed the feature/k3d-provider-support branch from 1597cf5 to 3d1be3f Compare August 15, 2024 13:38
@harshanarayana
Copy link
Contributor Author

@vladimirvivien PTAL when you can. Addressed the required changes in the message for wait for control plane.

Copy link
Contributor

@vladimirvivien vladimirvivien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More comments.

@@ -0,0 +1,399 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be in third-party.
I see that kwok in that package, but it should have also been in third-party.
package support is a place for API definitions for third-party support.
kind was supposed to be an example implementation for others (maybe it should have been in third-party as well)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.. I can move them all. But we have to retain some sort of aliasing to aid the import backward compatibility.. At least for kwok and the kind that we support today in an available release.. Right?

If you are ok with that, I will take care of the required package refactoring

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @harshanarayana if type aliasing will help maintain backward compat, then do it.
At the very least, we should just move kwok. Thanks for doing it.

p = commandRunner.RunProc("ls $GOPATH/bin")
if p.Err() != nil {
return "", fmt.Errorf("failed to install %s: %s", pPath, p.Err())
p = commandRunner.NewProc("ls $GOPATH/bin")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think gexe could be improved to make output caputure easier ? If so, please open an issue.

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 29, 2024
@harshanarayana
Copy link
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 1, 2024
@vladimirvivien
Copy link
Contributor

@harshanarayana is this still on your radar for 2025?

@harshanarayana
Copy link
Contributor Author

I will get this done before we cut the christmas release..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/feature Categorizes issue or PR as related to a new feature. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants