
e2e test for the case that the chief is not master #235

Closed
lluunn opened this issue Dec 20, 2017 · 1 comment

lluunn (Contributor) commented Dec 20, 2017

Follow-up of this issue.

jlewi (Contributor) commented Mar 7, 2018

/unassign @lluunn

Our test infrastructure should make it fairly easy to add an example which does this. The steps would be something like:

  1. Create a variant of tf_smoke suitable for this test case.
  2. Create a ksonnet config to run this job, similar to
    https://github.com/kubeflow/tf-operator/blob/master/test/workflows/components/simple_tfjob.jsonnet
    (a rough sketch of such a config follows this list).
  3. Add it to our E2E workflow, as is done here:
    https://github.com/kubeflow/tf-operator/blob/master/test/workflows/components/workflows.libsonnet#L294
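For illustration, here is a minimal sketch (in Python, mirroring the YAML such a jsonnet component would render) of what step 2's config might amount to: a TFJob with no MASTER replica whose terminationPolicy designates worker 0 as the chief. The field names and the image are assumptions to check against simple_tfjob.jsonnet and the v1alpha1 API, not the actual test config.

```python
# Hypothetical sketch, not the actual test config: a v1alpha1 TFJob with
# no MASTER replica, where terminationPolicy treats worker 0 as the chief.
# Field names and the image are assumptions to verify against the real API.
import yaml

tfjob = {
    "apiVersion": "kubeflow.org/v1alpha1",
    "kind": "TFJob",
    "metadata": {"name": "chief-is-worker-0"},
    "spec": {
        # No MASTER replica: the termination policy instead points at
        # worker 0, which is exactly the case this issue wants covered.
        "terminationPolicy": {
            "chief": {"replicaName": "WORKER", "replicaIndex": 0},
        },
        "replicaSpecs": [
            {
                "tfReplicaType": "WORKER",
                "replicas": 2,
                "template": {"spec": {
                    "containers": [{
                        "name": "tensorflow",
                        # Placeholder image; the real test would use the
                        # tf_smoke variant from step 1.
                        "image": "gcr.io/example/tf-smoke-variant:latest",
                    }],
                    "restartPolicy": "OnFailure",
                }},
            },
            {
                "tfReplicaType": "PS",
                "replicas": 1,
                "template": {"spec": {
                    "containers": [{
                        "name": "tensorflow",
                        "image": "gcr.io/example/tf-smoke-variant:latest",
                    }],
                    "restartPolicy": "OnFailure",
                }},
            },
        ],
    },
}

print(yaml.safe_dump(tfjob, sort_keys=False))
```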

jlewi added a commit to jlewi/k8s that referenced this issue Jun 14, 2018
* Only the tests for v1alpha1 are enabled. A follow-on PR will check
whether v1alpha2 is working and enable the tests for v1alpha2.

* Fix versionTag logic; we need to allow for the case where versionTag is an

* To facilitate these E2E tests, we create a test server to be run
  inside the replicas. This server lets us control what the process
  does via RPC, so the test runner can control when a replica exits
  (a sketch of such a server follows this commit message).

* The test harness needs to route requests through the API server proxy.

* Events no longer appear to show up for all services/pods, even though
  all services and pods are being created, so we turn this from a test
  failure into a warning.

* Print out the TFJob spec and events to aid debugging test failures.

Fix kubeflow#653 test server

Fixes: kubeflow#235 E2E test case for when chief is worker 0

Related: kubeflow#589 CI for v1alpha2
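For context on the test-server bullet above, here is a minimal sketch of what a controllable replica process could look like. It assumes a plain HTTP interface; the /exit endpoint, the exitCode parameter, and port 8080 are all illustrative, not the actual server's API.

```python
# Minimal sketch of a replica "test server": the test runner tells the
# process when (and with what code) to exit via an HTTP request. The
# /exit endpoint and the port are hypothetical, not the real server's API.
import http.server
import os
import threading
from urllib.parse import parse_qs, urlparse

class ControlHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        if url.path == "/exit":
            code = int(parse_qs(url.query).get("exitCode", ["0"])[0])
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"exiting\n")
            # Exit shortly after responding so the runner sees the reply.
            threading.Timer(0.1, os._exit, args=(code,)).start()
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    http.server.HTTPServer(("", 8080), ControlHandler).serve_forever()
```

Since the test runner sits outside the cluster, it would reach such a server through the API server's service proxy, which is presumably what "route requests through the API server proxy" refers to; the proxy exposes a service at a path of the form /api/v1/namespaces/&lt;namespace&gt;/services/&lt;name&gt;:&lt;port&gt;/proxy/.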
k8s-ci-robot pushed a commit that referenced this issue Jun 14, 2018
* Add E2E tests that verify termination policy is handled correctly.

* Fix a bug in waiting for pods; we were exiting prematurely (see the polling sketch below).
* Fix a bug in getting the message from an event.
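The wait-for-pods bullet hints at a classic polling pitfall: returning as soon as the pods seen so far look healthy, before all of them have even been created. A minimal sketch of the corrected loop, assuming the official kubernetes Python client; the label selector and readiness condition are illustrative, not the harness's actual code.

```python
# Minimal sketch of waiting for all of a TFJob's pods to be running.
# Assumes the official `kubernetes` Python client; the label selector
# and expected-count logic are illustrative, not the harness's code.
import time
from kubernetes import client, config

def wait_for_pods(namespace, label_selector, expected, timeout=300):
    config.load_kube_config()
    core = client.CoreV1Api()
    deadline = time.time() + timeout
    while time.time() < deadline:
        pods = core.list_namespaced_pod(
            namespace, label_selector=label_selector).items
        running = [p for p in pods if p.status.phase == "Running"]
        # The premature-exit bug: returning when *some* pods are running.
        # We must also check that all expected pods have been created.
        if len(pods) >= expected and len(running) == expected:
            return pods
        time.sleep(5)
    raise TimeoutError(
        "pods matching %s not all running after %ds" % (label_selector, timeout))
```

For the sketched job above this might be called as wait_for_pods("default", "tf_job_name=chief-is-worker-0", expected=3) for two workers and one PS, with the label name again being an assumption.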
yph152 pushed a commit to yph152/tf-operator that referenced this issue Jun 18, 2018
jetmuffin pushed a commit to jetmuffin/tf-operator that referenced this issue Jul 9, 2018