Adding initial v1alpha2 API controller #457

Merged: 5 commits, Apr 23, 2019
13 changes: 6 additions & 7 deletions cmd/katib-controller/v1alpha2/main.go
@@ -15,10 +15,7 @@ limitations under the License.
*/

/*
-StudyJobController is a controller (operator) for StudyJob
-StudyJobController create and watch workers and metricscollectors.
-The workers and metricscollectors are generated from template defined ConfigMap.
-The workers and metricscollectors are kubernetes object. The default object is a Job and CronJob.
+Katib-controller is a controller (operator) for Experiments and Trials
*/
package main

@@ -30,18 +27,20 @@ import (
_ "k8s.io/client-go/plugin/pkg/client/auth/gcp"
"sigs.k8s.io/controller-runtime/pkg/client/config"
"sigs.k8s.io/controller-runtime/pkg/manager"
+logf "sigs.k8s.io/controller-runtime/pkg/runtime/log"
"sigs.k8s.io/controller-runtime/pkg/runtime/signals"
)

func main() {
+logf.SetLogger(logf.ZapLogger(false))
// Get a config to talk to the apiserver
cfg, err := config.GetConfig()
if err != nil {
log.Printf("config.GetConfig()")
log.Fatal(err)
}

-// Create a new StudyJobController to provide shared dependencies and start components
+// Create a new katib controller to provide shared dependencies and start components
mgr, err := manager.New(cfg, manager.Options{})
if err != nil {
log.Printf("manager.New")
@@ -56,14 +55,14 @@ func main() {
log.Fatal(err)
}

-// Setup StudyJobController
+// Setup katib controller
if err := controller.AddToManager(mgr); err != nil {
log.Printf("controller.AddToManager(mgr)")
log.Fatal(err)
}

log.Printf("Starting the Cmd.")

-// Starting the StudyJobController
+// Starting the katib controller
log.Fatal(mgr.Start(signals.SetupSignalHandler()))
}
26 changes: 26 additions & 0 deletions pkg/api/operators/apis/addtoscheme_katib_v1alpha2.go
@@ -0,0 +1,26 @@
/*

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package apis

import (
experiments "github.com/kubeflow/katib/pkg/api/operators/apis/experiment/v1alpha2"
trials "github.com/kubeflow/katib/pkg/api/operators/apis/trial/v1alpha2"
)

func init() {
// Register the types with the Scheme so the components can map objects to GroupVersionKinds and back
AddToSchemes = append(AddToSchemes, experiments.SchemeBuilder.AddToScheme, trials.SchemeBuilder.AddToScheme)
}
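The `init` function above appends one registration function per API package to a shared `AddToSchemes` slice, which is later applied to a scheme. A minimal stdlib-only sketch of that pattern follows. `Scheme`, `addExperiments`, and `addTrials` are hypothetical stand-ins for `runtime.Scheme` and the two `SchemeBuilder.AddToScheme` functions.

```go
package main

import "fmt"

// Scheme is a hypothetical stand-in for runtime.Scheme: it only records
// which kinds have been registered.
type Scheme struct {
	kinds map[string]bool
}

// addToSchemeFunc mirrors the shape of SchemeBuilder.AddToScheme.
type addToSchemeFunc func(*Scheme) error

// AddToSchemes collects one registration function per API package.
var AddToSchemes []addToSchemeFunc

func addExperiments(s *Scheme) error { s.kinds["Experiment"] = true; return nil }
func addTrials(s *Scheme) error      { s.kinds["Trial"] = true; return nil }

func init() {
	// Same shape as the init in addtoscheme_katib_v1alpha2.go.
	AddToSchemes = append(AddToSchemes, addExperiments, addTrials)
}

// AddToScheme applies every collected function, stopping at the first error.
func AddToScheme(s *Scheme) error {
	for _, add := range AddToSchemes {
		if err := add(s); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	s := &Scheme{kinds: map[string]bool{}}
	if err := AddToScheme(s); err != nil {
		fmt.Println("registration failed:", err)
		return
	}
	fmt.Println(len(s.kinds), "kinds registered")
}
```

Because each package contributes from its own `init`, adding a new API version only requires a new `addtoscheme_*.go` file rather than edits to a central list.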
30 changes: 30 additions & 0 deletions pkg/api/operators/apis/experiment/v1alpha2/constants.go
@@ -0,0 +1,30 @@
/*
Copyright 2019 The Kubernetes Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v1alpha2

const (
// Default value of Spec.ParallelTrialCount
DefaultTrialParallelCount = 3

// Default value of Spec.ConfigMapName
DefaultTrialConfigMapName = "trial-template"

// Default env name of katib namespace
DefaultKatibNamespaceEnvName = "KATIB_CORE_NAMESPACE"

// Default value of Spec.TemplatePath
DefaultTrialTemplatePath = "defaultTrialTemplate.yaml"
)
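Constants like these are usually consumed by a defaulting step and by environment lookups. The sketch below shows one way that could look. The `spec` type, `setDefaults`, and `katibNamespace` are hypothetical; the PR excerpt shown here does not include the code that actually applies these defaults.

```go
package main

import (
	"fmt"
	"os"
)

const (
	DefaultTrialParallelCount    = 3
	DefaultKatibNamespaceEnvName = "KATIB_CORE_NAMESPACE"
)

// spec is a hypothetical, trimmed-down stand-in for ExperimentSpec.
type spec struct {
	ParallelTrialCount *int
}

// setDefaults fills ParallelTrialCount only when the user left it unset.
func setDefaults(s *spec) {
	if s.ParallelTrialCount == nil {
		n := DefaultTrialParallelCount
		s.ParallelTrialCount = &n
	}
}

// katibNamespace reads the namespace from the environment; the boolean
// reports whether the variable was set at all.
func katibNamespace() (string, bool) {
	return os.LookupEnv(DefaultKatibNamespaceEnvName)
}

func main() {
	s := &spec{}
	setDefaults(s)
	fmt.Println("parallelTrialCount:", *s.ParallelTrialCount)

	if ns, ok := katibNamespace(); ok {
		fmt.Println("katib namespace:", ns)
	} else {
		fmt.Println(DefaultKatibNamespaceEnvName, "is not set")
	}
}
```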
1 change: 1 addition & 0 deletions pkg/api/operators/apis/experiment/v1alpha2/doc.go
@@ -18,5 +18,6 @@ limitations under the License.
// +k8s:deepcopy-gen=package,register
// +k8s:conversion-gen=github.com/kubeflow/katib/pkg/api/operators/apis/experiment/v1alpha2
// +k8s:defaulter-gen=TypeMeta
+// +kubebuilder:subresource:status
// +groupName=experiment.kubeflow.org
package v1alpha2
29 changes: 18 additions & 11 deletions pkg/api/operators/apis/experiment/v1alpha2/experiment_types.go
@@ -35,10 +35,14 @@ type ExperimentSpec struct {
TrialTemplate *TrialTemplate `json:"trialTemplate,omitempty"`

// How many trials can be processed in parallel.
-ParallelTrialCount int `json:"parallelTrialCount,omitempty"`
+// Defaults to 3
+ParallelTrialCount *int `json:"parallelTrialCount,omitempty"`

-// Total number of trials to run.
-MaxTrialCount int `json:"maxTrialCount,omitempty"`
+// Max completed trials to mark experiment as succeeded
+MaxTrialCount *int `json:"maxTrialCount,omitempty"`
+
+// Max failed trials to mark experiment as failed.
+MaxFailedTrialCount *int `json:"maxFailedTrialCount,omitempty"`

// Whether to retain historical data in DB after deletion.
RetainHistoricalData bool `json:"retainHistoricalData,omitempty"`
@@ -75,8 +79,8 @@ type ExperimentStatus struct {
// Current optimal trial parameters and observations.
CurrentOptimalTrial OptimalTrial `json:"currentOptimalTrial,omitempty"`

-// How many trials have successfully completed.
-TrialsCompleted int `json:"trialsCompleted,omitempty"`
+// How many trials have succeeded.
+TrialsSucceeded int `json:"trialsSucceeded,omitempty"`

// How many trials have failed.
TrialsFailed int `json:"trialsFailed,omitempty"`
@@ -86,6 +90,9 @@ type ExperimentStatus struct {

// How many trials are currently pending.
TrialsPending int `json:"trialsPending,omitempty"`
+
+// How many trials are currently running.
+TrialsRunning int `json:"trialsRunning,omitempty"`
}

type OptimalTrial struct {
@@ -154,7 +161,7 @@ type FeasibleSpace struct {

type ObjectiveSpec struct {
Type ObjectiveType `json:"type,omitempty"`
-Goal float64 `json:"goal,omitempty"`
+Goal *float64 `json:"goal,omitempty"`
ObjectiveMetricName string `json:"objectiveMetricName,omitempty"`
// This can be empty if we only care about the objective metric.
// Note: If we adopt a push instead of pull mechanism, this can be omitted completely.
@@ -206,6 +213,7 @@ type GoTemplate struct {

// Structure of the Experiment custom resource.
// +k8s:openapi-gen=true
+// +kubebuilder:subresource:status
type Experiment struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
@@ -231,7 +239,7 @@ type NasConfig struct {

// GraphConfig contains a config of DAG
type GraphConfig struct {
-NumLayers int32 `json:"numLayers,omitempty"`
+NumLayers *int32 `json:"numLayers,omitempty"`
InputSizes []int32 `json:"inputSizes,omitempty"`
OutputSizes []int32 `json:"outputSizes,omitempty"`
}
@@ -242,7 +250,6 @@ type Operation struct {
Parameters []ParameterSpec `json:"parameterconfigs,omitempty"`
}

-// TODO - enable this during API implementation.
-//func init() {
-// SchemeBuilder.Register(&Experiment{}, &ExperimentList{})
-//}
+func init() {
+SchemeBuilder.Register(&Experiment{}, &ExperimentList{})
+}
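Several fields in this file change from value types to pointers (`ParallelTrialCount`, `MaxTrialCount`, `Goal`, `NumLayers`). The diff does not state the motivation. One common reason in Kubernetes-style APIs is that a pointer combined with `omitempty` lets an explicit zero be told apart from "not set". The sketch below demonstrates that behaviour with `encoding/json`; `valueGoal`, `pointerGoal`, and `encode` are hypothetical names for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// valueGoal mirrors the old field shape; pointerGoal mirrors the new one.
type valueGoal struct {
	Goal float64 `json:"goal,omitempty"`
}

type pointerGoal struct {
	Goal *float64 `json:"goal,omitempty"`
}

// encode marshals v and returns the JSON text.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	zero := 0.0
	fmt.Println(encode(valueGoal{Goal: 0}))       // {}          explicit zero is dropped
	fmt.Println(encode(pointerGoal{Goal: &zero})) // {"goal":0}  explicit zero survives
	fmt.Println(encode(pointerGoal{}))            // {}          unset is dropped
}
```

With the value type, a goal of `0` cannot be distinguished from a missing goal after a round trip through JSON. With the pointer, a `nil` check is enough.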
42 changes: 42 additions & 0 deletions pkg/api/operators/apis/experiment/v1alpha2/register.go
@@ -0,0 +1,42 @@
/*

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

// Package v1alpha2 contains API Schema definitions for the experiment v1alpha2 API group
// +k8s:openapi-gen=true
// +k8s:deepcopy-gen=package,register
// +k8s:conversion-gen=github.com/kubeflow/katib/pkg/api/operators/apis/experiment/v1alpha2
// +k8s:defaulter-gen=TypeMeta
// +kubebuilder:subresource:status
// +groupName=experiments.kubeflow.org
package v1alpha2

import (
"k8s.io/apimachinery/pkg/runtime/schema"
"sigs.k8s.io/controller-runtime/pkg/runtime/scheme"
)

const (
Group = "kubeflow.org"
Version = "v1alpha2"
)

var (
// SchemeGroupVersion is group version used to register these objects
SchemeGroupVersion = schema.GroupVersion{Group: Group, Version: Version}

// SchemeBuilder is used to add go types to the GroupVersionKind scheme
SchemeBuilder = &scheme.Builder{GroupVersion: SchemeGroupVersion}
AddToScheme = SchemeBuilder.AddToScheme
)
122 changes: 122 additions & 0 deletions pkg/api/operators/apis/experiment/v1alpha2/util.go
@@ -0,0 +1,122 @@
/*

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package v1alpha2

import (
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func getCondition(exp *Experiment, condType ExperimentConditionType) *ExperimentCondition {
for _, condition := range exp.Status.Conditions {
if condition.Type == condType {
return &condition
}
}
return nil
}

func hasCondition(exp *Experiment, condType ExperimentConditionType) bool {
cond := getCondition(exp, condType)
if cond != nil && cond.Status == v1.ConditionTrue {
return true
}
return false
}

func (exp *Experiment) removeCondition(condType ExperimentConditionType) {
var newConditions []ExperimentCondition
for _, c := range exp.Status.Conditions {

if c.Type == condType {
continue
}

newConditions = append(newConditions, c)
}
exp.Status.Conditions = newConditions
}

func newCondition(conditionType ExperimentConditionType, status v1.ConditionStatus, reason, message string) ExperimentCondition {
return ExperimentCondition{
Type: conditionType,
Status: status,
LastUpdateTime: metav1.Now(),
LastTransitionTime: metav1.Now(),
Reason: reason,
Message: message,
}
}

func (exp *Experiment) IsCreated() bool {
return hasCondition(exp, ExperimentCreated)
}

func (exp *Experiment) IsSucceeded() bool {
return hasCondition(exp, ExperimentSucceeded)
}

func (exp *Experiment) IsFailed() bool {
return hasCondition(exp, ExperimentFailed)
}

func (exp *Experiment) IsCompleted() bool {
return exp.IsSucceeded() || exp.IsFailed()
}

func (exp *Experiment) setCondition(conditionType ExperimentConditionType, status v1.ConditionStatus, reason, message string) {

newCond := newCondition(conditionType, status, reason, message)
currentCond := getCondition(exp, conditionType)
// Do nothing if condition doesn't change
if currentCond != nil && currentCond.Status == newCond.Status && currentCond.Reason == newCond.Reason {
return
}

// Do not update lastTransitionTime if the status of the condition doesn't change.
if currentCond != nil && currentCond.Status == newCond.Status {
newCond.LastTransitionTime = currentCond.LastTransitionTime
}

exp.removeCondition(conditionType)
exp.Status.Conditions = append(exp.Status.Conditions, newCond)
}

func (exp *Experiment) MarkExperimentStatusCreated(reason, message string) {
exp.setCondition(ExperimentCreated, v1.ConditionTrue, reason, message)
}

func (exp *Experiment) MarkExperimentStatusRunning(reason, message string) {
//exp.removeCondition(ExperimentRestarting)
exp.setCondition(ExperimentRunning, v1.ConditionTrue, reason, message)
}

func (exp *Experiment) MarkExperimentStatusSucceeded(reason, message string) {
currentCond := getCondition(exp, ExperimentRunning)
if currentCond != nil {
exp.setCondition(ExperimentRunning, v1.ConditionFalse, currentCond.Reason, currentCond.Message)
Review comment (Member):

Why mark ExperimentRunning as v1.ConditionFalse here? And why not mark ExperimentCreated as v1.ConditionFalse in MarkExperimentStatusRunning?

Reply (Member Author):

I tried to follow the same conventions as the job operators. It is true, though, that we need to rethink the conditions to make them more meaningful. For example, we may not need both a Successful condition and a Running condition; we could set Running to false and give Successful as the reason. This is the approach followed in the default k8s objects.

}
exp.setCondition(ExperimentSucceeded, v1.ConditionTrue, reason, message)

}

func (exp *Experiment) MarkExperimentStatusFailed(reason, message string) {
currentCond := getCondition(exp, ExperimentRunning)
if currentCond != nil {
exp.setCondition(ExperimentRunning, v1.ConditionFalse, currentCond.Reason, currentCond.Message)
}
exp.setCondition(ExperimentFailed, v1.ConditionTrue, reason, message)
}