Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Istio-sidecar handling #1638

Open
anencore94 opened this issue Aug 24, 2021 · 18 comments
Open

Support Istio-sidecar handling #1638

anencore94 opened this issue Aug 24, 2021 · 18 comments

Comments

@anencore94
Copy link
Member

anencore94 commented Aug 24, 2021

/kind feature

Describe the solution you'd like
[A clear and concise description of what you want to happen.]

I'd like to run katib-experiment with istio-sidecar injection.
I guess the reason why katib induce user to run katib-experiment with istio-sidecar injection "false" was something like this:

Thus, the following problem is maybe the main issue.
A) katib's metrics-logger-and-collector sidecar container should know when the istio-proxy sidecar container started
B) metrics-logger-and-collector sidecar container should kill when the main container finished well or not.


By the way I think changing initial-logic and end-logic in metrics-collector image, katib could handle this.

Maybe something like this;

  1. Handling issue B above from metrics-collector ( )
func killIstioProxySidecar() {
	klog.Infof("Let's kill istio-proxy sidecar")

	// curl -fsI -X POST http://localhost:15020/quitquitquit
	resp, err := http.Post("http://localhost:15020/quitquitquit", "text/plain", bytes.NewBufferString(""))
	if err != nil {
		// TODO 
		klog.Infof("there is no istio-proxy sidecar")
	}

	defer resp.Body.Close()
	klog.Infof("istio-proxy sidecar killed")
}

func main() {
	defer killIstioProxySidecar()

	flag.Var(&stopRules, "stop-rule", "The list of early stopping stop rules")
	flag.Parse()
	klog.Infof("Trial Name: %s", *trialName)
.....
  1. Also, pass the process from istio-proxy sidecar nearby here
		// Ignore istio-sidecar processes
                // TODO this must be more sophisticated to catch istio-proxy's process
		if strings.Contains(cmdline, "envoy") || strings.Contains(cmdline, "sidecar") {
			fmt.Printf("Ignore istio-sidecar container's processes")
			continue
		}
  1. Handling issue A above.
    I have no idea how to implement this now, but the primary container's command should be changed automatically by katib experiment/trial's controller as follows:
  • Before
- command:
  - <YOUR_COMMAND>;
  • After
- command:
  - /bin/sh
  - -c
  - |
    until curl -fsI http://localhost:15021/healthz/ready; do echo \"Waiting for Sidecar...\"; sleep 3; done;
    echo \"Sidecar available. Running the command...\";
    <YOUR_COMMAND>
  • I guess there is the way since katib-controller already automatically adds echo completed > /var/log/katib/$$$$.pid at the end of primary-container's command.

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

  • I'm afraid this feature could be generalized except "Job" kind.
    • But for "Job", this might be useful who wants to work with istio-sidecar
@anencore94
Copy link
Member Author

The reason for this codes was this.

		if strings.Contains(cmdline, "envoy") || strings.Contains(cmdline, "sidecar") {
			fmt.Printf("Ignore istio-sidecar container's processes")
			continue
		}

When I created my katib-experiment with sidecar-injection == true, There was following process in view of metrics-logger-and-collector sidecar container.

image

@andreyvelich
Copy link
Member

andreyvelich commented Aug 24, 2021

Thank you for creating this @anencore94.
Did you check this: #1383 ?
I guess waitAllProcesses flag might help us with running istio sidecar?

@anencore94
Copy link
Member Author

anencore94 commented Oct 17, 2021

Or I think this might be another solution:

Inject metrics-logger-and-collector container with container.lifecycle.PreStop to request quitquitquit API to istio-proxy container endpoint.

From here:

@tenzen-y
Copy link
Member

@anencore94 I think you are referring to a private repository.

@longpi1
Copy link

longpi1 commented Oct 18, 2021

或者我认为这可能是另一种解决方案:

将带有container.lifecycle.PreStop 的metrics-logger-and-collector 容器注入到 istio-proxy 容器端点以请求quitquitquit API。

从这里:

unsee :404

@anencore94
Copy link
Member Author

Thanks @tenzen-y I changed the links

@stale
Copy link

stale bot commented Mar 2, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Mar 2, 2022
@stale
Copy link

stale bot commented Apr 16, 2022

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

@stale stale bot closed this as completed Apr 16, 2022
@google-oss-prow
Copy link

@spots107: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@spots107
Copy link

@anencore94 what do you think about re-opening this feature?

@anencore94
Copy link
Member Author

Thanks @spots107
I guess this issue is not resolved yet.
/reopen

@google-oss-prow
Copy link

@anencore94: Reopened this issue.

In response to this:

Thanks @spots107
I guess this issue is not resolved yet.
/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@google-oss-prow google-oss-prow bot reopened this Jul 11, 2022
@anencore94
Copy link
Member Author

WDYT of this issue ? @kubeflow/wg-automl-leads @tenzen-y
If you think this feature is necessary, I'd like to implement

@stale stale bot removed the lifecycle/stale label Sep 1, 2022
@johnugeorge
Copy link
Member

This is definitely nice to have.

@anencore94 Thanks for following this issue

@tenzen-y
Copy link
Member

tenzen-y commented Sep 1, 2022

SGTM. Thanks! @anencore94

@longpi1
Copy link

longpi1 commented Sep 22, 2022

I remember I solved this problem before, if you need, you can chat with me privately, or I will propose a PR。

apo-ger added a commit to apo-ger/katib that referenced this issue Dec 21, 2022
katib-controller:

- Allow traffic to the webhook port in order to let the
  K8s api server send traffic to this endpoint.

katib-db-manager:

- Multiple components talk to db-manager for fetching and/or
  storing metrics. We need to allow ALL trafic as Katib jobs
  are currently not supporting Istio sidecar injection.

  Relative Issue: kubeflow#1638

katib-mysql:

- The db manager's persistence layer. Allow traffic only from
  katib-db-manager.

Signed-off-by: Apostolos Gerakaris <apoger@arrikto.com>
apo-ger added a commit to apo-ger/katib that referenced this issue Dec 21, 2022
katib-controller:

- Allow traffic to the webhook port in order to let the
  K8s api server send traffic to this endpoint.

katib-db-manager:

- Multiple components talk to db-manager for fetching and/or
  storing metrics. We need to allow ALL trafic as Katib jobs
  are currently not supporting Istio sidecar injection.

  Relative Issue: kubeflow#1638

katib-mysql:

- The db manager's persistence layer. Allow traffic only from
  katib-db-manager.

Signed-off-by: Apostolos Gerakaris <apoger@arrikto.com>
@github-actions
Copy link

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@tenzen-y
Copy link
Member

/lifecycle frozen

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants