Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add validation for NAS job in Katib controller #398

Merged
Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 6 additions & 5 deletions pkg/controller/studyjob/const.go
Original file line number Diff line number Diff line change
Expand Up @@ -12,11 +12,12 @@ limitations under the License.
package studyjob

const (
DefaultJobWorker = "Job"
TFJobWorker = "TFJob"
PyTorchJobWorker = "PyTorchJob"
jobTypeNAS = "NAS"
jobTypeHP = "HP"
DefaultJobWorker = "Job"
TFJobWorker = "TFJob"
PyTorchJobWorker = "PyTorchJob"
jobTypeNAS = "NAS"
jobTypeHP = "HP"
suggestionAlgorithmRL = "nasrl"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/suggestionAlgorithmRL/suggestionNASRL

)

var ValidWorkerKindList = [...]string{DefaultJobWorker, TFJobWorker, PyTorchJobWorker}
19 changes: 19 additions & 0 deletions pkg/controller/studyjob/katib_api_util.go
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,25 @@ import (
)

func initializeStudy(instance *katibv1alpha1.StudyJob) error {
if validErr := validateStudy(instance); validErr != nil {
instance.Status.Condition = katibv1alpha1.ConditionFailed
return validErr
}

if getJobType(instance) != jobTypeNAS {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use suggestionSpec.suggestionAlgorithm to return jobTypeNAS or jobTypeHP in getJobType to handle the invalid studyjob case: .Spec.NasConfig is not nil and suggestionSpec.suggestionAlgorithm is not nas-like

//Validate HP job
if validJobErr := validateHPJob(instance); validJobErr != nil {
instance.Status.Condition = katibv1alpha1.ConditionFailed
return validJobErr
}
} else {
//Validate NAS job
if validJobErr := validateNASJob(instance); validJobErr != nil {
instance.Status.Condition = katibv1alpha1.ConditionFailed
return validJobErr
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move validateHPJob and validateNASJob into validateStudy, and move validateStudy away from initializeStudy

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You want to put validateStudy function before initializeStudy?
Right here: https://github.com/kubeflow/katib/blob/master/pkg/controller/studyjob/studyjob_controller.go#L250 ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if instance.Spec.SuggestionSpec.SuggestionAlgorithm == "" {
instance.Spec.SuggestionSpec.SuggestionAlgorithm = "random"
}
Expand Down
86 changes: 86 additions & 0 deletions pkg/controller/studyjob/utils.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ import (
"fmt"
"io/ioutil"
"log"
"reflect"
"strings"

katibapi "github.com/kubeflow/katib/pkg/api"
Expand Down Expand Up @@ -206,6 +207,91 @@ func contains(l []string, s string) bool {
return false
}

func validateHPJob(instance *katibv1alpha1.StudyJob) error {
log.Printf("Validation for the HP job")
if instance.Spec.ParameterConfigs == nil {
return fmt.Errorf("No Spec.ParameterConfigs specified")
}
err := validateParameterConfigs(instance.Spec.ParameterConfigs, jobTypeHP, instance.Spec.SuggestionSpec.SuggestionAlgorithm)
if err != nil {
return err
}
return nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return validateParameterConfigs to replace line 215-219


func validateNASJob(instance *katibv1alpha1.StudyJob) error {
log.Printf("Validation for the NAS job")
if instance.Spec.NasConfig == nil {
return fmt.Errorf("No Spec.NasConfig specified")
}

if reflect.DeepEqual(instance.Spec.NasConfig.GraphConfig, katibv1alpha1.GraphConfig{}) {
return fmt.Errorf("Missing GraphConfig in NasConfig: %v", instance.Spec.NasConfig)
}

if instance.Spec.NasConfig.GraphConfig.InputSize == nil {
return fmt.Errorf("Missing InputSize in NasConfig.GraphConfig: %v", instance.Spec.NasConfig.GraphConfig)
}

if instance.Spec.NasConfig.GraphConfig.OutputSize == nil {
return fmt.Errorf("Missing OutputSize in NasConfig.GraphConfig: %v", instance.Spec.NasConfig.GraphConfig)
}

if instance.Spec.NasConfig.GraphConfig.NumLayers == 0 && instance.Spec.SuggestionSpec.SuggestionAlgorithm == suggestionAlgorithmRL {
return fmt.Errorf("Missing NumLayers in NasConfig.GraphConfig: %v", instance.Spec.NasConfig.GraphConfig)
}

if instance.Spec.NasConfig.Operations == nil {
return fmt.Errorf("Missing Operations in NasConfig: %v", instance.Spec.NasConfig)
}

for _, op := range instance.Spec.NasConfig.Operations {
if op.OperationType == "" {
return fmt.Errorf("Missing OperationType in Operation: %v", op)
}
if op.ParameterConfigs == nil {
return fmt.Errorf("Missing ParameterConfig in Operation: %v", op)
}
err := validateParameterConfigs(op.ParameterConfigs, jobTypeNAS, instance.Spec.SuggestionSpec.SuggestionAlgorithm)
if err != nil {
return err
}
}

return nil
}

func validateParameterConfigs(parameterConfigs []katibv1alpha1.ParameterConfig, jobType string, suggestionAlgorithm string) error {
for _, pc := range parameterConfigs {
if pc.Name == "" {
return fmt.Errorf("Missing Name in ParameterConfig: %v", pc)
}
if pc.ParameterType == "" {
return fmt.Errorf("Missing ParameterType in ParameterConfig: %v", pc)
}
if reflect.DeepEqual(pc.Feasible, katibv1alpha1.FeasibleSpace{}) {
return fmt.Errorf("Missing Feasible in ParameterConfig: %v", pc)
}
if pc.ParameterType == katibv1alpha1.ParameterTypeCategorical {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not only categorical but discrete.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In discrete we also use only List structure or something else?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently discrete only need List. But we may need more type for discretes, int or double.
Please treated in the same way as Categorical for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if pc.Feasible.List == nil {
return fmt.Errorf("Missing List in ParameterConfig.Feasible: %v", pc.Feasible)
}
}
if pc.ParameterType == katibv1alpha1.ParameterTypeDouble || pc.ParameterType == katibv1alpha1.ParameterTypeInt {
if pc.Feasible.Min == "" {
return fmt.Errorf("Missing Min in ParameterConfig.Feasible: %v", pc.Feasible)
}
if pc.Feasible.Max == "" {
return fmt.Errorf("Missing Max in ParameterConfig.Feasible: %v", pc.Feasible)
}
if jobType == jobTypeNAS && pc.Feasible.Step == "" && pc.ParameterType == katibv1alpha1.ParameterTypeDouble && suggestionAlgorithm == suggestionAlgorithmRL {
return fmt.Errorf("Missing Step in ParameterConfig.Feasible for NAS job: %v", pc.Feasible)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

define a suggestionNASList (now only nasrl is in)
s/suggestionAlgorithm == suggestionAlgorithmRL/contains(suggestionNASList, suggestionAlgorithm)
so that when we add other nas suggestion, we only need update suggestionNASList

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Better way to handle this?

}
}
}
return nil
}

func getMyNamespace() string {
data, _ := ioutil.ReadFile("/var/run/secrets/kubernetes.io/serviceaccount/namespace")
return strings.TrimSpace(string(data))
Expand Down