Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding initial v1alpha2 API controller #457

Merged
merged 5 commits into from
Apr 23, 2019

Conversation

johnugeorge
Copy link
Member

@johnugeorge johnugeorge commented Apr 15, 2019

This is an initial version of level based implementation of controller for early review of code flow. I am waiting for #435 and #456 to get merged first and make necessary changes after that.

  1. Invalid conditions are not validated yet.
  2. Metric collector code is not added yet

Fixes: #441

/cc @richardsliu @gaocegege @YujiOshima


This change is Reviewable

@richardsliu richardsliu mentioned this pull request Apr 16, 2019
15 tasks
pkg/controller/v1alpha2/experiment/util/api_util.go Outdated Show resolved Hide resolved
pkg/controller/v1alpha2/experiment/util/api_util.go Outdated Show resolved Hide resolved
pkg/controller/v1alpha2/trial/util/api_util.go Outdated Show resolved Hide resolved
pkg/controller/v1alpha2/experiment/util/status_util.go Outdated Show resolved Hide resolved
pkg/controller/v1alpha2/experiment/util/status_util.go Outdated Show resolved Hide resolved
log.Info("Error in converting unstructured to status: %v ", err)
return err
}
if jobStatus.Active == 0 && jobStatus.Succeeded > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

func (exp *Experiment) MarkExperimentStatusSucceeded(reason, message string) {
currentCond := getCondition(exp, ExperimentRunning)
if currentCond != nil {
exp.setCondition(ExperimentRunning, v1.ConditionFalse, currentCond.Reason, currentCond.Message)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why mark ExperimentRunning as v1.ConditionFalse?
and why not mark ExperimentCreated as v1.ConditionFalse when MarkExperimentStatusRunning?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to follow the same conventions from the job operators. But, it is true that we have to rethink on making more meaningful conditions. eg: we may not need Successful condition and Running condition. We could make Running as false stating reason as Successful. This is followed in k8s default objects

pkg/controller/v1alpha2/experiment/util/status_util.go Outdated Show resolved Hide resolved
pkg/controller/v1alpha2/experiment/util/status_util.go Outdated Show resolved Hide resolved
@johnugeorge
Copy link
Member Author

/retest

@johnugeorge johnugeorge changed the title WIP: Adding initial v1alpha2 API controller Adding initial v1alpha2 API controller Apr 22, 2019
func (r *ReconcileExperiment) createTrials(instance *experimentsv1alpha2.Experiment, addCount int) error {

logger := log.WithValues("Experiment", types.NamespacedName{Name: instance.GetName(), Namespace: instance.GetNamespace()})
trials, err := util.GetSuggestions(instance, addCount)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnugeorge
It can run more than one Suggestion at once, since we name it GetSuggestions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please explain more?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean you have named this function GetSuggestions. Is it make sense? I thought we should name it GetSuggestion.

completedCount := instance.Status.TrialsSucceeded + instance.Status.TrialsFailed + instance.Status.TrialsKilled

if activeCount > parallelCount {
deleteCount := activeCount - parallelCount
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think line 215 and 223 can change to if activeCount >= parallelCount {} else {}

if deleteCount > 0 {
//delete 'deleteCount' number of trails. Sort them?
logger.Info("DeleteTrials", "deleteCount", deleteCount)
err = r.deleteTrials(instance, deleteCount)
Copy link
Member

@hougangliu hougangliu Apr 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r.deleteTrials will make Status.TrialsKilled? will it make completedCount value dirty in line 213?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that TrialsKilled has a different meaning. We should just delete the trial here.

@hougangliu
Copy link
Member

@richardsliu @YujiOshima @johnugeorge @gaocegege how about merging this PR first? Then we can code further based on it.

@gaocegege
Copy link
Member

SGTM.

/lgtm

@andreyvelich
Copy link
Member

@hougangliu @gaocegege @richardsliu @YujiOshima @johnugeorge @gaocegege
Should we split the work on the Controller in to small issues, after we merge this PR?

instance.Status.StartTime = &now
}
msg := "Trial is created"
instance.MarkTrialStatusCreated(util.TrialCreatedReason, msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MarkTrialStatusCreated is collect? It looks you need to mark running for going next step.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trial is marked running when jobs are in running state

instance.Status.StartTime = &now
}
msg := "Experiment is created"
instance.MarkExperimentStatusCreated(util.ExperimentCreatedReason, msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MarkTrialStatusCreated is collect? It looks you need to mark running for going next step.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the Experiment is created, it is marked Created. It is marked running when trials are created.

@YujiOshima
Copy link
Contributor

@hougangliu @andreyvelich Thanks, I agree with we proceed step by step since this change will be big and complicated.
@johnugeorge Does it OK?

@k8s-ci-robot k8s-ci-robot removed the lgtm label Apr 23, 2019
@johnugeorge
Copy link
Member Author

I have verified creation of experiment and trial objects with status updates. We can merge this PR if current review is completed.

@hougangliu
Copy link
Member

/lgtm
/approve

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hougangliu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 5a7a144 into kubeflow:master Apr 23, 2019
caicloud-bot pushed a commit to caicloud/katib that referenced this pull request Apr 28, 2019
* Adding initial v1alpha2 API controller (kubeflow#457)

* Adding initial v1alpha2 controller

* Adding logs

* Adding comments

* Adding template functions for experiment

* Adding error checks

* chore: Skip test when code is not changed (kubeflow#467)

Signed-off-by: Ce Gao <gaoce@caicloud.io>

* Add serviceAccountName in UI deployment (kubeflow#469)

* feat: Introduce suggestion

Signed-off-by: Ce Gao <gaoce@caicloud.io>

* owners: Add xinyi yu

Signed-off-by: Ce Gao <gaoce@caicloud.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[v1alpha2] Controller implementation
7 participants