API Reference

Packages

kubeflow.org/v1

Package v1 is the v1 version of the API. It contains the API schema definitions for the kubeflow.org v1 API group.

Definitions

ElasticPolicy

Appears In:
Field Description

minReplicas integer

minReplicas is the lower limit for the number of replicas to which the training job can scale down. It defaults to null.

maxReplicas integer

maxReplicas is the upper limit for the number of pods that the autoscaler can set. It cannot be smaller than minReplicas and defaults to null.

rdzvBackend RDZVBackend

rdzvPort integer

rdzvHost string

rdzvId string

rdzvConf RDZVConf array

RDZVConf contains additional rendezvous configuration (<key1>=<value1>,<key2>=<value2>,…​).

standalone boolean

Start a local standalone rendezvous backend that is represented by a C10d TCP store on port 29400. Useful when launching a single-node, multi-worker job. If specified, --rdzv_backend, --rdzv_endpoint, and --rdzv_id are auto-assigned; any explicitly set values are ignored.

nProcPerNode integer

Number of workers per node; supported values: [auto, cpu, gpu, int].

maxRestarts integer

metrics MetricSpec array

Metrics contains the specifications that are used to calculate the desired replica count (the maximum replica count across all metrics is used). The desired replica count is calculated by multiplying the ratio between the target value and the current value by the current number of pods. Ergo, metrics used must decrease as the pod count is increased, and vice versa. See the individual metric source types for more information about how each type of metric must respond. If not set, the HPA is not created.
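The replica calculation described for metrics can be sketched as follows. This is a minimal illustration that assumes the standard Horizontal Pod Autoscaler rule, desired = ceil(currentReplicas × currentValue / targetValue), followed by clamping to minReplicas and maxReplicas. The function name and the (current, target) pair representation are hypothetical, and the tolerance and stabilization behavior of a real HPA is omitted.

```python
import math


def desired_replicas(current_replicas, metrics, min_replicas=None, max_replicas=None):
    """Return the replica count proposed by a set of metrics.

    metrics is a list of (current_value, target_value) pairs. This is a
    hypothetical simplification of MetricSpec, used only for illustration.
    """
    if not metrics:
        # The reference states that no HPA is created when metrics is unset.
        raise ValueError("at least one metric is required")

    # One proposal per metric: scale the pod count by current/target.
    proposals = [
        math.ceil(current_replicas * current / target)
        for current, target in metrics
    ]

    # The maximum proposal across all metrics is used.
    desired = max(proposals)

    # Clamp to the optional bounds (both default to null in the API).
    if min_replicas is not None:
        desired = max(desired, min_replicas)
    if max_replicas is not None:
        desired = min(desired, max_replicas)
    return desired


print(desired_replicas(4, [(90.0, 60.0), (30.0, 60.0)], min_replicas=2, max_replicas=8))
```

With four pods, the first metric is above its target and proposes more replicas, the second is below its target and proposes fewer; the larger proposal is chosen before clamping.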

JobModeType (string)

JobModeType is the type for JobMode.

Appears In: MXJobSpec

MPIJob

Appears In:
Field Description

apiVersion string

kubeflow.org/v1

kind string

MPIJob

TypeMeta TypeMeta

metadata ObjectMeta

Refer to Kubernetes API documentation for fields of metadata.

spec MPIJobSpec

status JobStatus

MPIJobList

Field Description

apiVersion string

kubeflow.org/v1

kind string

MPIJobList

TypeMeta TypeMeta

metadata ListMeta

Refer to Kubernetes API documentation for fields of metadata.

items MPIJob array

MPIJobSpec

Appears In:
Field Description

slotsPerWorker integer

Specifies the number of slots per worker used in hostfile. Defaults to 1.

cleanPodPolicy CleanPodPolicy

CleanPodPolicy defines whether to kill pods after the job completes. Defaults to None.

mpiReplicaSpecs object (keys:ReplicaType, values:ReplicaSpec)

MPIReplicaSpecs is a map from MPIReplicaType to ReplicaSpec that specifies the MPI replicas to run.

mainContainer string

MainContainer specifies the name of the main container that executes the MPI code.

runPolicy RunPolicy

RunPolicy encapsulates various runtime policies of the distributed training job, for example how to clean up resources and how long the job can stay active.
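To illustrate how slotsPerWorker relates to a hostfile, the sketch below renders one line per worker in the Open MPI hostfile style (`hostname slots=N`). The helper name and the worker host names are hypothetical; the hostfile that the operator actually generates may differ in naming and format.

```python
def build_hostfile(worker_hosts, slots_per_worker=1):
    """Render an Open MPI style hostfile with one line per worker.

    slots_per_worker mirrors the slotsPerWorker field, which defaults to 1.
    """
    return "\n".join(f"{host} slots={slots_per_worker}" for host in worker_hosts)


# Hypothetical worker host names, for illustration only.
workers = [f"demo-worker-{i}" for i in range(2)]
print(build_hostfile(workers, slots_per_worker=2))
```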

MXJob

MXJob is the Schema for the mxjobs API

Appears In:
Field Description

apiVersion string

kubeflow.org/v1

kind string

MXJob

TypeMeta TypeMeta

metadata ObjectMeta

Refer to Kubernetes API documentation for fields of metadata.

spec MXJobSpec

status JobStatus

MXJobList

MXJobList contains a list of MXJob

Field Description

apiVersion string

kubeflow.org/v1

kind string

MXJobList

TypeMeta TypeMeta

metadata ListMeta

Refer to Kubernetes API documentation for fields of metadata.

items MXJob array

MXJobSpec

MXJobSpec defines the desired state of MXJob

Appears In:
Field Description

runPolicy RunPolicy

RunPolicy encapsulates various runtime policies of the distributed training job, for example how to clean up resources and how long the job can stay active.

jobMode JobModeType

JobMode specifies the kind of MXJob to run. Different modes may require different MXReplicaSpecs.

mxReplicaSpecs object (keys:ReplicaType, values:ReplicaSpec)

MXReplicaSpecs is a map from commonv1.ReplicaType to commonv1.ReplicaSpec that specifies the MX replicas to run. For example, { "Scheduler": commonv1.ReplicaSpec, "Server": commonv1.ReplicaSpec, "Worker": commonv1.ReplicaSpec, }

PaddleElasticPolicy

Appears In:
Field Description

minReplicas integer

minReplicas is the lower limit for the number of replicas to which the training job can scale down. It defaults to null.

maxReplicas integer

maxReplicas is the upper limit for the number of pods that the autoscaler can set. It cannot be smaller than minReplicas and defaults to null.

maxRestarts integer

MaxRestarts is the limit for restart times of pods in elastic mode.

metrics MetricSpec array

Metrics contains the specifications that are used to calculate the desired replica count (the maximum replica count across all metrics is used). The desired replica count is calculated by multiplying the ratio between the target value and the current value by the current number of pods. Ergo, metrics used must decrease as the pod count is increased, and vice versa. See the individual metric source types for more information about how each type of metric must respond. If not set, the HPA is not created.

PaddleJob

PaddleJob represents a PaddleJob resource.

Appears In:
Field Description

apiVersion string

kubeflow.org/v1

kind string

PaddleJob

TypeMeta TypeMeta

Standard Kubernetes type metadata.

metadata ObjectMeta

Refer to Kubernetes API documentation for fields of metadata.

spec PaddleJobSpec

Specification of the desired state of the PaddleJob.

status JobStatus

Most recently observed status of the PaddleJob. Read-only (modified by the system).

PaddleJobList

PaddleJobList is a list of PaddleJobs.

Field Description

apiVersion string

kubeflow.org/v1

kind string

PaddleJobList

TypeMeta TypeMeta

Standard type metadata.

metadata ListMeta

Refer to Kubernetes API documentation for fields of metadata.

items PaddleJob array

List of PaddleJobs.

PaddleJobSpec

PaddleJobSpec is a desired state description of the PaddleJob.

Appears In:
Field Description

runPolicy RunPolicy

RunPolicy encapsulates various runtime policies of the distributed training job, for example how to clean up resources and how long the job can stay active.

elasticPolicy PaddleElasticPolicy

ElasticPolicy holds the elastic policy for paddle job.

paddleReplicaSpecs object (keys:ReplicaType, values:ReplicaSpec)

A map of PaddleReplicaType (type) to ReplicaSpec (value). Specifies the Paddle cluster configuration. For example, { "Master": PaddleReplicaSpec, "Worker": PaddleReplicaSpec, }

PyTorchJob

PyTorchJob represents a PyTorchJob resource.

Appears In:
Field Description

apiVersion string

kubeflow.org/v1

kind string

PyTorchJob

TypeMeta TypeMeta

Standard Kubernetes type metadata.

metadata ObjectMeta

Refer to Kubernetes API documentation for fields of metadata.

spec PyTorchJobSpec

Specification of the desired state of the PyTorchJob.

status JobStatus

Most recently observed status of the PyTorchJob. Read-only (modified by the system).

PyTorchJobList

PyTorchJobList is a list of PyTorchJobs.

Field Description

apiVersion string

kubeflow.org/v1

kind string

PyTorchJobList

TypeMeta TypeMeta

Standard type metadata.

metadata ListMeta

Refer to Kubernetes API documentation for fields of metadata.

items PyTorchJob array

List of PyTorchJobs.

PyTorchJobSpec

PyTorchJobSpec is a desired state description of the PyTorchJob.

Appears In:
Field Description

runPolicy RunPolicy

RunPolicy encapsulates various runtime policies of the distributed training job, for example how to clean up resources and how long the job can stay active.

elasticPolicy ElasticPolicy

pytorchReplicaSpecs object (keys:ReplicaType, values:ReplicaSpec)

A map of PyTorchReplicaType (type) to ReplicaSpec (value). Specifies the PyTorch cluster configuration. For example, { "Master": PyTorchReplicaSpec, "Worker": PyTorchReplicaSpec, }
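As an illustration of how the PyTorchJobSpec fields fit together, the sketch below assembles a PyTorchJob manifest as a plain Python dictionary and prints it as JSON. The helper name, the container name `pytorch`, the `c10d` backend value, the `OnFailure` restart policy, and the ReplicaSpec fields (`replicas`, `restartPolicy`, `template`) are assumptions that are not defined in this reference; check them against the ReplicaSpec and RunPolicy definitions before use.

```python
import json


def make_pytorchjob(name, image, workers, min_replicas, max_replicas):
    """Build a PyTorchJob manifest dict (illustrative sketch, not validated by the API)."""
    if max_replicas < min_replicas:
        # maxReplicas cannot be smaller than minReplicas.
        raise ValueError("max_replicas cannot be smaller than min_replicas")

    def replica_spec(count):
        # ReplicaSpec field names here are assumptions (see lead-in).
        return {
            "replicas": count,
            "restartPolicy": "OnFailure",
            "template": {
                "spec": {"containers": [{"name": "pytorch", "image": image}]}
            },
        }

    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "PyTorchJob",
        "metadata": {"name": name},
        "spec": {
            "elasticPolicy": {
                "minReplicas": min_replicas,
                "maxReplicas": max_replicas,
                "rdzvBackend": "c10d",  # assumed backend value
                "maxRestarts": 3,
            },
            "pytorchReplicaSpecs": {
                "Master": replica_spec(1),
                "Worker": replica_spec(workers),
            },
        },
    }


job = make_pytorchjob("demo", "example.invalid/train:latest", 2, 1, 4)
print(json.dumps(job, indent=2))
```

The printed JSON could be saved to a file and submitted with a Kubernetes client, since Kubernetes accepts JSON manifests as well as YAML.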

RDZVBackend (string)

Appears In: ElasticPolicy

RDZVConf

Appears In:
Field Description

key string

value string
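The rdzvConf field of ElasticPolicy is described as a list of these key/value entries rendered as `<key1>=<value1>,<key2>=<value2>`. A minimal sketch of that rendering follows; the function name and the sample keys are hypothetical, and this is not the operator's own implementation.

```python
def format_rdzv_conf(entries):
    """Join RDZVConf entries into a 'key1=value1,key2=value2' string."""
    return ",".join(f"{entry['key']}={entry['value']}" for entry in entries)


# Sample entries with illustrative keys only.
conf = [
    {"key": "timeout", "value": "600"},
    {"key": "read_timeout", "value": "60"},
]
print(format_rdzv_conf(conf))
```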

SuccessPolicy (string)

SuccessPolicy is the success policy.

Appears In: TFJobSpec

TFJob

TFJob represents a TFJob resource.

Appears In:
Field Description

apiVersion string

kubeflow.org/v1

kind string

TFJob

TypeMeta TypeMeta

Standard Kubernetes type metadata.

metadata ObjectMeta

Refer to Kubernetes API documentation for fields of metadata.

spec TFJobSpec

Specification of the desired state of the TFJob.

status JobStatus

Most recently observed status of the TFJob. Populated by the system. Read-only.

TFJobList

TFJobList is a list of TFJobs.

Field Description

apiVersion string

kubeflow.org/v1

kind string

TFJobList

TypeMeta TypeMeta

Standard type metadata.

metadata ListMeta

Refer to Kubernetes API documentation for fields of metadata.

items TFJob array

List of TFJobs.

TFJobSpec

TFJobSpec is a desired state description of the TFJob.

Appears In:
Field Description

runPolicy RunPolicy

RunPolicy encapsulates various runtime policies of the distributed training job, for example how to clean up resources and how long the job can stay active.

successPolicy SuccessPolicy

SuccessPolicy defines the policy to mark the TFJob as succeeded. Defaults to "", which uses the default rules.

tfReplicaSpecs object (keys:ReplicaType, values:ReplicaSpec)

A map of TFReplicaType (type) to ReplicaSpec (value). Specifies the TF cluster configuration. For example, { "PS": ReplicaSpec, "Worker": ReplicaSpec, }

enableDynamicWorker boolean

A switch that enables dynamic workers.
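The TFJobSpec fields above can be combined as in the following sketch, which builds a TFJob manifest as a Python dictionary. The helper name, the container name `tensorflow`, and the ReplicaSpec fields (`replicas`, `template`) are assumptions not defined in this reference. successPolicy is left at "" so that the default rules apply, and enableDynamicWorker is exposed as a plain boolean whose default of False is also an assumption.

```python
def make_tfjob(name, image, ps_count, worker_count, enable_dynamic_worker=False):
    """Build a TFJob manifest dict (illustrative sketch under stated assumptions)."""

    def replica_spec(count):
        # ReplicaSpec field names here are assumptions (see lead-in).
        return {
            "replicas": count,
            "template": {
                "spec": {"containers": [{"name": "tensorflow", "image": image}]}
            },
        }

    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "TFJob",
        "metadata": {"name": name},
        "spec": {
            "successPolicy": "",  # empty string selects the default rules
            "enableDynamicWorker": enable_dynamic_worker,
            "tfReplicaSpecs": {
                "PS": replica_spec(ps_count),
                "Worker": replica_spec(worker_count),
            },
        },
    }


tfjob = make_tfjob("demo", "example.invalid/tf:latest", ps_count=1, worker_count=3)
print(sorted(tfjob["spec"]["tfReplicaSpecs"]))
```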

XGBoostJob

XGBoostJob is the Schema for the xgboostjobs API

Appears In:
Field Description

apiVersion string

kubeflow.org/v1

kind string

XGBoostJob

TypeMeta TypeMeta

metadata ObjectMeta

Refer to Kubernetes API documentation for fields of metadata.

spec XGBoostJobSpec

status JobStatus

XGBoostJobList

XGBoostJobList contains a list of XGBoostJob

Field Description

apiVersion string

kubeflow.org/v1

kind string

XGBoostJobList

TypeMeta TypeMeta

metadata ListMeta

Refer to Kubernetes API documentation for fields of metadata.

items XGBoostJob array

XGBoostJobSpec

XGBoostJobSpec defines the desired state of XGBoostJob

Appears In:
Field Description

runPolicy RunPolicy

RunPolicy encapsulates various runtime policies of the distributed training job, for example how to clean up resources and how long the job can stay active.

xgbReplicaSpecs object (keys:ReplicaType, values:ReplicaSpec)