diff --git a/docs/source/user_guide/best-practices-file-based-nodes.md b/docs/source/user_guide/best-practices-file-based-nodes.md index 0b653a119..eeb9e9e61 100644 --- a/docs/source/user_guide/best-practices-file-based-nodes.md +++ b/docs/source/user_guide/best-practices-file-based-nodes.md @@ -85,6 +85,8 @@ The Visual Pipeline Editor can detect which environment variables notebooks/scri Refer to the next section for a list of proprietary environment variables that cannot be modified using the node properties settings. +If environment variables contain sensitive information, refer to the [Handling sensitive information](#handling-sensitive-information) topic for details. + ### Proprietary environment variables Elyra makes a set of proprietary environment variables available to notebooks and scripts during execution. Unless indicated otherwise, these variables are defined in all runtime environments. @@ -104,3 +106,8 @@ notebook or script is executed in: - `local` - JupyterLab - `kfp` - Kubeflow Pipelines - `airflow` - Apache Airflow + +### Handling sensitive information +The Kubernetes Secrets property can be used to associate environment variable names with secrets, preventing sensitive information from being exposed in the pipeline file, the pipeline editor, and the runtime environment. As with static environment variables, secret-based environment variable values can be set on an individual node, defined as pipeline default values that are shared across nodes in the same pipeline, or both. A default value can also be overridden for a particular node by redefining the secret for a given variable name in the node properties. + +Secrets are ignored when the pipeline is executed locally. For remote execution, if an environment variable was assigned both a static value (via the 'Environment Variables' property) and a Kubernetes secret value, the secret's value is used. diff --git a/docs/source/user_guide/pipelines.md b/docs/source/user_guide/pipelines.md index dc46b980a..21ba0b626 100644 --- a/docs/source/user_guide/pipelines.md +++ b/docs/source/user_guide/pipelines.md @@ -87,10 +87,15 @@ The [tutorials](/getting_started/tutorials.md) provide comprehensive step-by-ste - The value is ignored when the pipeline is executed locally. - **Environment variables** - A list of environment variables to be set in the container that executes the Jupyter notebook or script. Format: `ENV_VAR_NAME=value`. Entries that are empty (`ENV_VAR_NAME=`) or malformed are ignored. - - **Data volume** - - A list of [Persistent Volume Claims](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) (PVC) to be mounted into the container that executes the Jupyter notebook or script. Format: `/mnt/path=existing-pvc-name`. Entries that are empty (`/mnt/path=`) or malformed are ignored. + - **Data volumes** + - A list of [Persistent Volume Claims](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) (PVC) to be mounted into the container that executes the Jupyter notebook or script. Format: `/mnt/path=existing-pvc-name`. Entries that are empty (`/mnt/path=`) or malformed are ignored. Entries with a PVC name that is not a [valid Kubernetes resource name](https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names) will raise a validation error after pipeline submission or export. - The referenced PVCs must exist in the Kubernetes namespace where the generic pipeline nodes are executed. - Data volumes are not mounted when the pipeline is executed locally.
+ - **Kubernetes secrets** + - A list of [Kubernetes Secrets](https://kubernetes.io/docs/concepts/configuration/secret/) to be accessed as environment variables during Jupyter notebook or script execution. Format: `ENV_VAR=secret-name:secret-key`. Entries that are empty (`ENV_VAR=`) or malformed are ignored. Entries with a secret name that is not a [valid Kubernetes resource name](https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names) or with a key that is not a [valid secret key](https://kubernetes.io/docs/concepts/configuration/secret/#restriction-names-data) will raise a validation error after pipeline submission or export. + - The referenced secrets must exist in the Kubernetes namespace where the generic pipeline nodes are executed. + - Secrets are ignored when the pipeline is executed locally. For remote execution, if an environment variable was assigned both a static value (via the 'Environment Variables' property) and a Kubernetes secret value, the secret's value is used. + #### Adding nodes @@ -102,11 +107,11 @@ The [tutorials](/getting_started/tutorials.md) provide comprehensive step-by-ste ![Add generic components from file browser](../images/user_guide/pipelines/add-components-from-file-browser.gif) -1. Define the dependencies between nodes by connecting them, essentially creating an execution graph. +2. Define the dependencies between nodes by connecting them, essentially creating an execution graph. ![Connect nodes](../images/user_guide/pipelines/connect-nodes.gif) -1. Define the runtime properties for each node. Highlight a node, right click, and select `Open Properties`. Runtime properties configure a component and govern its execution behavior. +3. Define the runtime properties for each node. Highlight a node, right-click, and select `Open Properties`. Runtime properties configure a component and govern its execution behavior. ![Configure node](../images/user_guide/pipelines/configure-node.gif) @@ -136,15 +141,20 @@ The [tutorials](/getting_started/tutorials.md) provide comprehensive step-by-ste - Example: `data/*.csv` **Data Volumes** - - Optional. A list of [Persistent Volume Claims](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) (PVC) to be mounted into the container that executes the Jupyter notebook or script. Format: `/mnt/path=existing-pvc-name`. Entries that are empty (`/mnt/path=`) or malformed are ignored. The referenced PVCs must exist in the Kubernetes namespace where the node is executed. + - Optional. A list of [Persistent Volume Claims](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) (PVC) to be mounted into the container that executes the Jupyter notebook or script. Format: `/mnt/path=existing-pvc-name`. Entries that are empty (`/mnt/path=`) or malformed are ignored. Entries with a PVC name that is not a [valid Kubernetes resource name](https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names) will raise a validation error after pipeline submission or export. The referenced PVCs must exist in the Kubernetes namespace where the node is executed. - Data volumes are not mounted when the pipeline is executed locally. - Example: `/mnt/vol1=data-pvc` -1. Associate each node with a comment to document its purpose. + **Kubernetes Secrets** + - Optional. A list of [Kubernetes Secrets](https://kubernetes.io/docs/concepts/configuration/secret/) to be accessed as environment variables during Jupyter notebook or script execution. Format: `ENV_VAR=secret-name:secret-key`.
Entries that are empty (`ENV_VAR=`) or malformed are ignored. Entries with a secret name that is not a [valid Kubernetes resource name](https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names) or with a key that is not a [valid secret key](https://kubernetes.io/docs/concepts/configuration/secret/#restriction-names-data) will raise a validation error after pipeline submission or export. The referenced secrets must exist in the Kubernetes namespace where the node is executed. + - Secrets are ignored when the pipeline is executed locally. For remote execution, if an environment variable was assigned both a static value (via the 'Environment Variables' property) and a Kubernetes secret value, the secret's value is used. + - Example: `ENV_VAR=secret-name:secret-key` + +5. Associate each node with a comment to document its purpose. ![Add comment to node](../images/user_guide/pipelines/add-comment-to-node.gif) -1. Save the pipeline file. +6. Save the pipeline file. Note: You can rename the pipeline file in the JupyterLab _File Browser_. diff --git a/elyra/kfp/operator.py b/elyra/kfp/operator.py index 296026bcb..1ed4a0571 100644 --- a/elyra/kfp/operator.py +++ b/elyra/kfp/operator.py @@ -27,10 +27,13 @@ from kubernetes.client.models import V1EnvVarSource from kubernetes.client.models import V1ObjectFieldSelector from kubernetes.client.models import V1PersistentVolumeClaimVolumeSource +from kubernetes.client.models import V1SecretKeySelector from kubernetes.client.models import V1Volume from kubernetes.client.models import V1VolumeMount from elyra._version import __version__ +from elyra.pipeline.pipeline import KubernetesSecret +from elyra.pipeline.pipeline import VolumeMount """ The ExecuteFileOp uses a python script to bootstrap the user supplied image with the required dependencies. @@ -81,7 +84,8 @@ def __init__( mem_request: Optional[str] = None, gpu_limit: Optional[str] = None, workflow_engine: Optional[str] = "argo", - volume_mounts: Optional[Dict[str, str]] = None, + volume_mounts: Optional[List[VolumeMount]] = None, + kubernetes_secrets: Optional[List[KubernetesSecret]] = None, **kwargs, ): """Create a new instance of ContainerOp. @@ -106,6 +110,7 @@ gpu_limit: maximum number of GPUs allowed for the operation workflow_engine: Kubeflow workflow engine, defaults to 'argo' volume_mounts: data volumes to be mounted + kubernetes_secrets: secrets to be made available as environment variables kwargs: additional key value pairs to pass e.g. name, image, sidecars & is_exit_handler.
See Kubeflow pipelines ContainerOp definition for more parameters or how to use https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html#kfp.dsl.ContainerOp @@ -133,6 +138,7 @@ def __init__( self.mem_request = mem_request self.gpu_limit = gpu_limit self.volume_mounts = volume_mounts # optional data volumes to be mounted to the pod + self.kubernetes_secrets = kubernetes_secrets # optional secrets to be made available as env vars argument_list = [] @@ -227,16 +233,18 @@ def __init__( # or this generic operation will fail if self.volume_mounts: unique_pvcs = [] - for mount_path, pvc_name in self.volume_mounts.items(): - if pvc_name not in unique_pvcs: + for volume_mount in self.volume_mounts: + if volume_mount.pvc_name not in unique_pvcs: self.add_volume( V1Volume( - name=pvc_name, - persistent_volume_claim=V1PersistentVolumeClaimVolumeSource(claim_name=pvc_name), + name=volume_mount.pvc_name, + persistent_volume_claim=V1PersistentVolumeClaimVolumeSource( + claim_name=volume_mount.pvc_name + ), ) ) - unique_pvcs.append(pvc_name) - self.container.add_volume_mount(V1VolumeMount(mount_path=mount_path, name=pvc_name)) + unique_pvcs.append(volume_mount.pvc_name) + self.container.add_volume_mount(V1VolumeMount(mount_path=volume_mount.path, name=volume_mount.pvc_name)) # We must deal with the envs after the superclass initialization since these amend the # container attribute that isn't available until now. @@ -244,6 +252,15 @@ def __init__( for key, value in self.pipeline_envs.items(): # Convert dict entries to format kfp needs self.container.add_env_variable(V1EnvVar(name=key, value=value)) + if self.kubernetes_secrets: + for secret in self.kubernetes_secrets: # Convert tuple entries to format kfp needs + self.container.add_env_variable( + V1EnvVar( + name=secret.env_var, + value_from=V1EnvVarSource(secret_key_ref=V1SecretKeySelector(name=secret.name, key=secret.key)), + ) + ) + # If crio volume size is found then assume kubeflow pipelines environment is using CRI-o as # its container runtime if self.emptydir_volume_size: diff --git a/elyra/pipeline/airflow/processor_airflow.py b/elyra/pipeline/airflow/processor_airflow.py index b9065de38..55661d5be 100644 --- a/elyra/pipeline/airflow/processor_airflow.py +++ b/elyra/pipeline/airflow/processor_airflow.py @@ -41,6 +41,8 @@ from elyra.pipeline.pipeline import Operation from elyra.pipeline.pipeline import Pipeline from elyra.pipeline.pipeline_constants import COS_OBJECT_PREFIX +from elyra.pipeline.pipeline_constants import KUBERNETES_SECRETS +from elyra.pipeline.pipeline_constants import MOUNTED_VOLUMES from elyra.pipeline.processor import PipelineProcessor from elyra.pipeline.processor import PipelineProcessorResponse from elyra.pipeline.processor import RuntimePipelineProcessor @@ -317,8 +319,6 @@ def _cc_pipeline(self, pipeline: Pipeline, pipeline_name: str, pipeline_instance outputs=operation.outputs, ) - volume_mounts = self._get_volume_mounts(operation=operation) - target_op = { "notebook": operation.name, "id": operation.id, @@ -333,7 +333,8 @@ def _cc_pipeline(self, pipeline: Pipeline, pipeline_name: str, pipeline_instance "operator_source": operation.component_params["filename"], "is_generic_operator": True, "doc": operation.doc, - "volume_mounts": volume_mounts, + "volume_mounts": operation.component_params.get(MOUNTED_VOLUMES, []), + "kubernetes_secrets": operation.component_params.get(KUBERNETES_SECRETS, []), } if runtime_image_pull_secret is not None: @@ -507,7 +508,7 @@ def create_pipeline_file( python_output = 
template.render( operations_list=target_ops, pipeline_name=pipeline_instance_id, - namespace=user_namespace, + user_namespace=user_namespace, cos_secret=cos_secret, kube_config_path=None, is_paused_upon_creation="False", diff --git a/elyra/pipeline/kfp/processor_kfp.py b/elyra/pipeline/kfp/processor_kfp.py index aa41539a5..ef0464422 100644 --- a/elyra/pipeline/kfp/processor_kfp.py +++ b/elyra/pipeline/kfp/processor_kfp.py @@ -47,6 +47,8 @@ from elyra.pipeline.pipeline import Operation from elyra.pipeline.pipeline import Pipeline from elyra.pipeline.pipeline_constants import COS_OBJECT_PREFIX +from elyra.pipeline.pipeline_constants import KUBERNETES_SECRETS +from elyra.pipeline.pipeline_constants import MOUNTED_VOLUMES from elyra.pipeline.processor import PipelineProcessor from elyra.pipeline.processor import PipelineProcessorResponse from elyra.pipeline.processor import RuntimePipelineProcessor @@ -520,8 +522,6 @@ def _cc_pipeline( f"Creating pipeline component archive '{operation_artifact_archive}' for operation '{operation}'" ) - volume_mounts = self._get_volume_mounts(operation=operation) - target_ops[operation.id] = ExecuteFileOp( name=sanitized_operation_name, pipeline_name=pipeline_name, @@ -546,7 +546,8 @@ def _cc_pipeline( "mlpipeline-metrics": f"{pipeline_envs['ELYRA_WRITABLE_CONTAINER_DIR']}/mlpipeline-metrics.json", # noqa "mlpipeline-ui-metadata": f"{pipeline_envs['ELYRA_WRITABLE_CONTAINER_DIR']}/mlpipeline-ui-metadata.json", # noqa }, - volume_mounts=volume_mounts, + volume_mounts=operation.component_params.get(MOUNTED_VOLUMES, []), + kubernetes_secrets=operation.component_params.get(KUBERNETES_SECRETS, []), ) if operation.doc: diff --git a/elyra/pipeline/pipeline.py b/elyra/pipeline/pipeline.py index 4519c1197..69cf44f48 100644 --- a/elyra/pipeline/pipeline.py +++ b/elyra/pipeline/pipeline.py @@ -13,6 +13,10 @@ # See the License for the specific language governing permissions and # limitations under the License. # +from dataclasses import asdict as dataclass_asdict +from dataclasses import dataclass +from dataclasses import is_dataclass +import json import logging from logging import Logger import os @@ -194,6 +198,16 @@ def _scrub_list(dirty: Optional[List[Optional[str]]]) -> List[str]: def is_generic_operation(operation_classifier) -> bool: return operation_classifier in Operation.generic_node_types + @staticmethod + def log_message(msg: str, logger: Optional[Logger] = None, level: Optional[int] = logging.DEBUG): + """ + Log a message with the given logger at the given level or simply print. 
+ """ + if logger: + logger.log(level, msg) + else: + print(msg) + class GenericOperation(Operation): """ @@ -454,12 +468,12 @@ def to_dict(self, logger: Optional[Logger] = None) -> Dict[str, str]: key = key.strip() if not key: - KeyValueList.log_message(f"Skipping inclusion of property '{kv}': no key found", logger, logging.WARN) + Operation.log_message(f"Skipping inclusion of property '{kv}': no key found", logger, logging.WARN) continue if isinstance(value, str): value = value.strip() if not value: - KeyValueList.log_message( + Operation.log_message( f"Skipping inclusion of property '{key}': no value specified", logger, logging.DEBUG ) continue @@ -467,13 +481,17 @@ def to_dict(self, logger: Optional[Logger] = None) -> Dict[str, str]: kv_dict[key] = value return kv_dict + @classmethod + def to_str(cls, key: str, value: str) -> str: + return f"{key}{cls._key_value_separator}{value}" + @classmethod def from_dict(cls, kv_dict: Dict) -> "KeyValueList": """ Convert a set of key-value pairs stored in a dictionary to a KeyValueList of strings with the defined separator. """ - str_list = [f"{key}{cls._key_value_separator}{value}" for key, value in kv_dict.items()] + str_list = [KeyValueList.to_str(key, value) for key, value in kv_dict.items()] return KeyValueList(str_list) @classmethod @@ -487,12 +505,47 @@ def merge(cls, primary: "KeyValueList", secondary: "KeyValueList") -> "KeyValueL return KeyValueList.from_dict({**secondary_dict, **primary_dict}) - @staticmethod - def log_message(msg: str, logger: Optional[Logger] = None, level: Optional[int] = logging.DEBUG): + @classmethod + def difference(cls, minuend: "KeyValueList", subtrahend: "KeyValueList") -> "KeyValueList": """ - Log a message with the given logger at the given level or simply print. + Given KeyValueLists, convert to dictionaries and remove any keys found in the + second (subtrahend) from the first (minuend), if present. + + :param minuend: list to be subtracted from + :param subtrahend: list whose keys will be removed from the minuend + + :returns: the difference of the two lists """ - if logger: - logger.log(level, msg) - else: - print(msg) + subtract_dict = minuend.to_dict() + for key in subtrahend.to_dict().keys(): + if key in subtract_dict: + subtract_dict.pop(key) + + return KeyValueList.from_dict(subtract_dict) + + +@dataclass +class VolumeMount: + path: str + pvc_name: str + + +@dataclass +class KubernetesSecret: + env_var: str + name: str + key: str + + +class DataClassJSONEncoder(json.JSONEncoder): + """ + A JSON Encoder class to prevent errors during serialization of dataclasses. 
+ """ + + def default(self, o): + """ + Render dataclass content as dict + """ + if is_dataclass(o): + return dataclass_asdict(o) + return super().default(o) diff --git a/elyra/pipeline/pipeline_constants.py b/elyra/pipeline/pipeline_constants.py index 42d75eafb..0c2be96cb 100644 --- a/elyra/pipeline/pipeline_constants.py +++ b/elyra/pipeline/pipeline_constants.py @@ -18,6 +18,7 @@ RUNTIME_IMAGE = "runtime_image" ENV_VARIABLES = "env_vars" MOUNTED_VOLUMES = "mounted_volumes" +KUBERNETES_SECRETS = "kubernetes_secrets" PIPELINE_META_PROPERTIES = ["name", "description", "runtime"] # optional static prefix to be used when generating an object name for object storage COS_OBJECT_PREFIX = "cos_object_prefix" diff --git a/elyra/pipeline/pipeline_definition.py b/elyra/pipeline/pipeline_definition.py index 386fe24a0..92e71e586 100644 --- a/elyra/pipeline/pipeline_definition.py +++ b/elyra/pipeline/pipeline_definition.py @@ -23,9 +23,15 @@ from jinja2 import Environment, Undefined from jinja2 import PackageLoader -from elyra.pipeline import pipeline_constants from elyra.pipeline.pipeline import KeyValueList +from elyra.pipeline.pipeline import KubernetesSecret from elyra.pipeline.pipeline import Operation +from elyra.pipeline.pipeline import VolumeMount +from elyra.pipeline.pipeline_constants import ENV_VARIABLES +from elyra.pipeline.pipeline_constants import KUBERNETES_SECRETS +from elyra.pipeline.pipeline_constants import MOUNTED_VOLUMES +from elyra.pipeline.pipeline_constants import PIPELINE_DEFAULTS +from elyra.pipeline.pipeline_constants import PIPELINE_META_PROPERTIES class AppDataBase(object): # ABC @@ -174,7 +180,7 @@ def pipeline_parameters(self) -> Dict[str, Any]: and runtime) or the pipeline defaults dictionary """ all_properties = self._node["app_data"].get("properties", {}) - excluded_properties = pipeline_constants.PIPELINE_META_PROPERTIES + [pipeline_constants.PIPELINE_DEFAULTS] + excluded_properties = PIPELINE_META_PROPERTIES + [PIPELINE_DEFAULTS] pipeline_parameters = {} for property_name, value in all_properties.items(): @@ -215,7 +221,7 @@ def convert_kv_properties(self, kv_properties: List[str]): Convert pipeline defaults-level list properties that have been identified as sets of key-value pairs from a plain list type to the KeyValueList type. """ - pipeline_defaults = self.get_property(pipeline_constants.PIPELINE_DEFAULTS, {}) + pipeline_defaults = self.get_property(PIPELINE_DEFAULTS, {}) for property_name, value in pipeline_defaults.items(): if property_name not in kv_properties: continue @@ -224,7 +230,7 @@ def convert_kv_properties(self, kv_properties: List[str]): pipeline_defaults[property_name] = KeyValueList(value) if pipeline_defaults: - self.set_property(pipeline_constants.PIPELINE_DEFAULTS, pipeline_defaults) + self.set_property(PIPELINE_DEFAULTS, pipeline_defaults) class Node(AppDataBase): @@ -312,7 +318,7 @@ def set_component_parameter(self, key: str, value: Any): if not key: raise ValueError("Key is required") - if not value: + if value is None: raise ValueError("Value is required") self._node["app_data"]["component_parameters"][key] = value @@ -325,18 +331,64 @@ def get_all_component_parameters(self) -> Dict[str, Any]: def convert_kv_properties(self, kv_properties: List[str]): """ - Convert node-level list properties that have been identified as sets of - key-value pairs from a plain list type to the KeyValueList type. + Convert node-level list properties that have been identified as sets of + key-value pairs from a plain list type to the KeyValueList type. 
If any + k-v property has already been converted to a KeyValueList, all k-v + properties are assumed to have already been converted. """ - for property_name, value in self.get_all_component_parameters().items(): - if property_name not in kv_properties: - continue - + for kv_property in kv_properties: + value = self.get_component_parameter(kv_property) if not value: continue + if isinstance(value, KeyValueList): + # Any KeyValueList instance implies all relevant properties have already been converted + return + # Convert plain list to KeyValueList - self.set_component_parameter(property_name, KeyValueList(value)) + self.set_component_parameter(kv_property, KeyValueList(value)) + + def remove_env_vars_with_matching_secrets(self): + """ + In the case of a matching key between env vars and kubernetes secrets, + prefer the Kubernetes Secret and remove the matching env var. + """ + env_vars = self.get_component_parameter(ENV_VARIABLES) + secrets = self.get_component_parameter(KUBERNETES_SECRETS) + if isinstance(env_vars, KeyValueList) and isinstance(secrets, KeyValueList): + new_list = KeyValueList.difference(minuend=env_vars, subtrahend=secrets) + self.set_component_parameter(ENV_VARIABLES, new_list) + + def convert_data_class_properties(self): + """ + Convert select node-level list properties to their corresponding dataclass + object type. No validation is performed. + """ + volume_mounts = self.get_component_parameter(MOUNTED_VOLUMES) + if volume_mounts and isinstance(volume_mounts, KeyValueList): + volume_objects = [] + for mount_path, pvc_name in volume_mounts.to_dict().items(): + formatted_mount_path = f"/{mount_path.strip('/')}" + + # Create a VolumeMount class instance and add to list + volume_objects.append(VolumeMount(formatted_mount_path, pvc_name)) + + self.set_component_parameter(MOUNTED_VOLUMES, volume_objects) + + secrets = self.get_component_parameter(KUBERNETES_SECRETS) + if secrets and isinstance(secrets, KeyValueList): + secret_objects = [] + for env_var_name, secret in secrets.to_dict().items(): + secret_name, *optional_key = secret.split(":", 1) + + secret_key = "" + if optional_key: + secret_key = optional_key[0].strip() + + # Create a KubernetesSecret class instance and add to list + secret_objects.append(KubernetesSecret(env_var_name, secret_name.strip(), secret_key)) + + self.set_component_parameter(KUBERNETES_SECRETS, secret_objects) class PipelineDefinition(object): @@ -460,7 +512,7 @@ def validate(self) -> list: return self._validation_issues # Has not been validated before - validation_issues = list() + validation_issues = [] # Validate pipeline schema version if "version" not in self._pipeline_definition: validation_issues.append("Pipeline schema version field is missing.") @@ -515,12 +567,12 @@ def propagate_pipeline_default_properties(self): kv_properties = PipelineDefinition.get_kv_properties() self.primary_pipeline.convert_kv_properties(kv_properties) - pipeline_default_properties = self.primary_pipeline.get_property(pipeline_constants.PIPELINE_DEFAULTS, {}) + pipeline_default_properties = self.primary_pipeline.get_property(PIPELINE_DEFAULTS, {}) for node in self.pipeline_nodes: if not Operation.is_generic_operation(node.op): continue - # Convert any key-value list node properties to the KeyValueList type + # Convert any key-value list node properties to the KeyValueList type if not done already node.convert_kv_properties(kv_properties) for property_name, pipeline_default_value in pipeline_default_properties.items(): @@ -532,12 +584,15 @@ def 
propagate_pipeline_default_properties(self): node.set_component_parameter(property_name, pipeline_default_value) continue - if isinstance(pipeline_default_value, KeyValueList): - if not isinstance(node_value, KeyValueList): - raise TypeError(f"The value of node property '{property_name}' is not of type KeyValueList") - merged_list: KeyValueList = KeyValueList.merge(node_value, pipeline_default_value) + if isinstance(pipeline_default_value, KeyValueList) and isinstance(node_value, KeyValueList): + merged_list = KeyValueList.merge(node_value, pipeline_default_value) node.set_component_parameter(property_name, merged_list) + if self.primary_pipeline.runtime_config != "local": + node.remove_env_vars_with_matching_secrets() + + node.convert_data_class_properties() + def is_valid(self) -> bool: """ Represents whether or not the pipeline structure is valid diff --git a/elyra/pipeline/processor.py b/elyra/pipeline/processor.py index 8179e8bc1..b3da9a454 100644 --- a/elyra/pipeline/processor.py +++ b/elyra/pipeline/processor.py @@ -41,12 +41,10 @@ from elyra.pipeline.pipeline import GenericOperation from elyra.pipeline.pipeline import Operation from elyra.pipeline.pipeline import Pipeline -from elyra.pipeline.pipeline_constants import MOUNTED_VOLUMES from elyra.pipeline.runtime_type import RuntimeProcessorType from elyra.pipeline.runtime_type import RuntimeTypeResources from elyra.util.archive import create_temp_archive from elyra.util.cos import CosClient -from elyra.util.kubernetes import is_valid_kubernetes_resource_name from elyra.util.path import get_expanded_path elyra_log_pipeline_info = os.getenv("ELYRA_LOG_PIPELINE_INFO", True) @@ -581,27 +579,3 @@ def _process_list_value(self, value: str) -> Union[List, str]: return value return converted_list - - def _get_volume_mounts(self, operation: Operation) -> Optional[Dict[str, str]]: - """ - Loops through an Operation mounted volumes to re-format path and remove - invalid PVC names. - - :param operation: the operation to check for volume mounts - :return: dictionary of mount path to valid PVC names - """ - volume_mounts_valid = {} - if operation.component_params.get(MOUNTED_VOLUMES): - volume_mounts = operation.component_params.get(MOUNTED_VOLUMES).to_dict() - for mount_path, pvc_name in volume_mounts.items(): - # Ensure the PVC name is syntactically a valid Kubernetes resource name - if not is_valid_kubernetes_resource_name(pvc_name): - self.log.warning( - f"Ignoring invalid volume mount entry '{mount_path}': the PVC " - f"name '{pvc_name}' is not a valid Kubernetes resource name." 
- ) - continue - - formatted_mount_path = f"/{mount_path.strip('/')}" - volume_mounts_valid[formatted_mount_path] = pvc_name - return volume_mounts_valid diff --git a/elyra/pipeline/validation.py b/elyra/pipeline/validation.py index 38a134894..da275acd4 100644 --- a/elyra/pipeline/validation.py +++ b/elyra/pipeline/validation.py @@ -30,13 +30,23 @@ from elyra.metadata.schemaspaces import Runtimes from elyra.pipeline.component import Component from elyra.pipeline.component_catalog import ComponentCache +from elyra.pipeline.pipeline import DataClassJSONEncoder +from elyra.pipeline.pipeline import KeyValueList +from elyra.pipeline.pipeline import KubernetesSecret from elyra.pipeline.pipeline import Operation from elyra.pipeline.pipeline import PIPELINE_CURRENT_SCHEMA from elyra.pipeline.pipeline import PIPELINE_CURRENT_VERSION +from elyra.pipeline.pipeline import VolumeMount +from elyra.pipeline.pipeline_constants import ENV_VARIABLES +from elyra.pipeline.pipeline_constants import KUBERNETES_SECRETS +from elyra.pipeline.pipeline_constants import MOUNTED_VOLUMES +from elyra.pipeline.pipeline_constants import RUNTIME_IMAGE from elyra.pipeline.pipeline_definition import Node from elyra.pipeline.pipeline_definition import PipelineDefinition from elyra.pipeline.processor import PipelineProcessorManager from elyra.pipeline.runtime_type import RuntimeProcessorType +from elyra.util.kubernetes import is_valid_kubernetes_key +from elyra.util.kubernetes import is_valid_kubernetes_resource_name from elyra.util.path import get_expanded_path @@ -397,10 +407,12 @@ def _validate_generic_node_properties(self, node: Node, response: ValidationResp :return: """ node_label = node.label - image_name = node.get_component_parameter("runtime_image") + image_name = node.get_component_parameter(RUNTIME_IMAGE) filename = node.get_component_parameter("filename") dependencies = node.get_component_parameter("dependencies") - env_vars = node.get_component_parameter("env_vars") + env_vars = node.get_component_parameter(ENV_VARIABLES) + volumes = node.get_component_parameter(MOUNTED_VOLUMES) + secrets = node.get_component_parameter(KUBERNETES_SECRETS) self._validate_filepath( node_id=node.id, node_label=node_label, property_name="filename", filename=filename, response=response @@ -420,6 +432,11 @@ def _validate_generic_node_properties(self, node: Node, response: ValidationResp response=response, ) + if volumes: + self._validate_mounted_volumes(node.id, node_label, volumes, response=response) + if secrets: + self._validate_kubernetes_secrets(node.id, node_label, secrets, response=response) + self._validate_label(node_id=node.id, node_label=node_label, response=response) if dependencies: notebook_root_relative_path = os.path.dirname(filename) @@ -600,6 +617,86 @@ def _validate_resource_value( }, ) + def _validate_mounted_volumes( + self, node_id: str, node_label: str, volumes: List[VolumeMount], response: ValidationResponse + ) -> None: + """ + Checks the format of mounted volumes to ensure they're in the correct form + e.g. 
/mount/path=pvc-name + :param node_id: the unique ID of the node + :param node_label: the given node name or user customized name/label of the node + :param volumes: a list of VolumeMount objects to check + :param response: ValidationResponse containing the issue list to be updated + """ + for volume in volumes: + # Ensure the PVC name is syntactically a valid Kubernetes resource name + if not is_valid_kubernetes_resource_name(volume.pvc_name): + response.add_message( + severity=ValidationSeverity.Error, + message_type="invalidVolumeMount", + message=f"PVC name '{volume.pvc_name}' is not a valid Kubernetes resource name.", + data={ + "nodeID": node_id, + "nodeName": node_label, + "propertyName": MOUNTED_VOLUMES, + "value": KeyValueList.to_str(volume.path, volume.pvc_name), + }, + ) + + def _validate_kubernetes_secrets( + self, node_id: str, node_label: str, secrets: List[KubernetesSecret], response: ValidationResponse + ) -> None: + """ + Checks the format of Kubernetes secrets to ensure they're in the correct form + e.g. ENV_VAR=secret-name:secret-key + :param node_id: the unique ID of the node + :param node_label: the given node name or user customized name/label of the node + :param secrets: a list of KubernetesSecret objects to check + :param response: ValidationResponse containing the issue list to be updated + """ + for secret in secrets: + if not secret.name or not secret.key: + response.add_message( + severity=ValidationSeverity.Error, + message_type="invalidKubernetesSecret", + message=f"Environment variable '{secret.env_var}' has an improperly formatted representation of " + f"secret name and key.", + data={ + "nodeID": node_id, + "nodeName": node_label, + "propertyName": KUBERNETES_SECRETS, + "value": KeyValueList.to_str(secret.env_var, f"{(secret.name or '')}:{(secret.key or '')}"), + }, + ) + continue + + # Ensure the secret name is syntactically a valid Kubernetes resource name + if not is_valid_kubernetes_resource_name(secret.name): + response.add_message( + severity=ValidationSeverity.Error, + message_type="invalidKubernetesSecret", + message=f"Secret name '{secret.name}' is not a valid Kubernetes resource name.", + data={ + "nodeID": node_id, + "nodeName": node_label, + "propertyName": KUBERNETES_SECRETS, + "value": KeyValueList.to_str(secret.env_var, f"{secret.name}:{secret.key}"), + }, + ) + # Ensure the secret key is a syntactically valid Kubernetes key + if not is_valid_kubernetes_key(secret.key): + response.add_message( + severity=ValidationSeverity.Error, + message_type="invalidKubernetesSecret", + message=f"Key '{secret.key}' is not a valid Kubernetes secret key.", + data={ + "nodeID": node_id, + "nodeName": node_label, + "propertyName": KUBERNETES_SECRETS, + "value": KeyValueList.to_str(secret.env_var, f"{secret.name}:{secret.key}"), + }, + ) + def _validate_filepath( self, node_id: str, @@ -683,7 +780,7 @@ def _validate_environmental_variables( severity=ValidationSeverity.Error, message_type="invalidEnvPair", message="Property has an improperly formatted env variable key value pair.", - data={"nodeID": node_id, "nodeName": node_label, "propertyName": "env_vars", "value": env_var}, + data={"nodeID": node_id, "nodeName": node_label, "propertyName": ENV_VARIABLES, "value": env_var}, ) def _validate_label(self, node_id: str, node_label: str, response: ValidationResponse) -> None: @@ -743,7 +840,7 @@ def _validate_pipeline_graph(self, pipeline: dict, response: ValidationResponse) :param response: ValidationResponse containing the issue list to be updated :param pipeline: A dictionary describing
the pipeline """ - pipeline_json = json.loads(json.dumps(pipeline)) + pipeline_json = json.loads(json.dumps(pipeline, cls=DataClassJSONEncoder)) graph = nx.DiGraph() @@ -794,7 +891,7 @@ def _get_pipeline_id(self, pipeline: dict, node_id: str) -> Optional[str]: :param node_id: the node ID of the node :return: the pipeline ID of where the node is located """ - pipeline_json = json.loads(json.dumps(pipeline)) + pipeline_json = json.loads(json.dumps(pipeline, cls=DataClassJSONEncoder)) for single_pipeline in pipeline_json["pipelines"]: node_list = single_pipeline["nodes"] for node in node_list: @@ -835,7 +932,7 @@ def _get_node_names(self, pipeline: dict, node_id_list: list) -> List: :return: a string representing the name of the node """ node_name_list = [] - pipeline_json = json.loads(json.dumps(pipeline)) + pipeline_json = json.loads(json.dumps(pipeline, cls=DataClassJSONEncoder)) for node_id in node_id_list: found = False for single_pipeline in pipeline_json["pipelines"]: @@ -861,7 +958,7 @@ def _get_node_labels(self, pipeline: dict, link_ids: List[str]) -> Optional[List if link_ids is None: return None - pipeline_json = json.loads(json.dumps(pipeline)) + pipeline_json = json.loads(json.dumps(pipeline, cls=DataClassJSONEncoder)) node_labels = [] for link_id in link_ids: for single_pipeline in pipeline_json["pipelines"]: diff --git a/elyra/templates/airflow/airflow_template.jinja2 b/elyra/templates/airflow/airflow_template.jinja2 index 08db9b09e..246ff6e14 100644 --- a/elyra/templates/airflow/airflow_template.jinja2 +++ b/elyra/templates/airflow/airflow_template.jinja2 @@ -33,8 +33,10 @@ env_var_secret_key = Secret(deploy_type='env', key='AWS_SECRET_ACCESS_KEY', ) {% endif %} +{% set ns = namespace() %} {% for key, operation in operations_list.items() %} +{% set ns.operation_kubernetes_secrets = "" %} {% if not operation.is_generic_operator %} {% for import_statement in operation.imports %} @@ -44,6 +46,21 @@ env_var_secret_key = Secret(deploy_type='env', from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator {% endif %} +{% if operation.kubernetes_secrets %} +from airflow.kubernetes.secret import Secret +# Kubernetes secrets for operation '{{ operation.id|regex_replace }}' + {% for secret in operation.kubernetes_secrets %} +secret_{{ operation.id|regex_replace }}_{{ loop.index }} = Secret(deploy_type='env', + deploy_target='{{ secret.env_var }}', + secret='{{ secret.name }}', + key='{{ secret.key }}', +) + {% set ns.operation_kubernetes_secrets = ns.operation_kubernetes_secrets ~ 'secret_' ~ operation.id|regex_replace ~ '_' ~ loop.index ~ ', ' %} + {% endfor %} +{% endif %} + +{% if cos_secret %}{% set ns.operation_kubernetes_secrets = ns.operation_kubernetes_secrets ~ "env_var_secret_id, env_var_secret_key" %}{% endif %} + {% if operation.operator_source %}# Operator source: {{ operation.operator_source }}{% endif %}{% if not operation.is_generic_operator %} op_{{ operation.id|regex_replace }} = {{ operation.class_name }}( task_id='{{ operation.notebook|regex_replace }}', @@ -56,24 +73,24 @@ from airflow.contrib.kubernetes.volume import Volume from airflow.contrib.kubernetes.volume_mount import VolumeMount volumes_{{ operation.id|regex_replace }} = [] volume_mounts_{{ operation.id|regex_replace }} = [] - {% for mount_path, volume_claim in operation.volume_mounts.items() %} -volume_mount_{{loop.index}} = VolumeMount(name='{{ volume_claim }}', - mount_path='{{ mount_path }}', + {% for volume_mount in operation.volume_mounts %} +volume_mount_{{loop.index}} = 
VolumeMount(name='{{ volume_mount.pvc_name }}', + mount_path='{{ volume_mount.path }}', sub_path=None, read_only=False) volume_config_{{ loop.index }}= { 'persistentVolumeClaim': { - 'claimName': '{{ volume_claim }}' + 'claimName': '{{ volume_mount.pvc_name }}' } } -volume_{{ loop.index }} = Volume(name='{{ volume_claim }}', configs=volume_config_{{ loop.index }}) +volume_{{ loop.index }} = Volume(name='{{ volume_mount.pvc_name }}', configs=volume_config_{{ loop.index }}) volumes_{{ operation.id|regex_replace }}.append(volume_{{ loop.index }}) volume_mounts_{{ operation.id|regex_replace }}.append(volume_mount_{{ loop.index }}) {% endfor %} {% endif %} op_{{ operation.id|regex_replace }} = KubernetesPodOperator(name='{{ operation.notebook|regex_replace }}', - namespace='{{ namespace }}', + namespace='{{ user_namespace }}', image='{{ operation.runtime_image }}', {% if operation.runtime_image_pull_secret %} image_pull_secrets='{{ operation.runtime_image_pull_secret }}', @@ -100,9 +117,9 @@ op_{{ operation.id|regex_replace }} = KubernetesPodOperator(name='{{ operation.n }, {% endif %} - {% if cos_secret %} - secrets=[env_var_secret_id, env_var_secret_key], - {% endif %} + {% if ns.operation_kubernetes_secrets %} + secrets=[{{ ns.operation_kubernetes_secrets }}], + {% endif %} in_cluster={{ in_cluster }}, config_file="{{ kube_config_path }}", {% endif %} diff --git a/elyra/templates/components/generic_properties_template.jinja2 b/elyra/templates/components/generic_properties_template.jinja2 index a693ff057..0809021cd 100644 --- a/elyra/templates/components/generic_properties_template.jinja2 +++ b/elyra/templates/components/generic_properties_template.jinja2 @@ -8,6 +8,7 @@ "elyra_memory": null, "elyra_outputs": [], "elyra_env_vars": [], + "elyra_kubernetes_secrets": [], "elyra_dependencies": [], "elyra_include_subdirectories": false, "elyra_mounted_volumes": [] @@ -40,6 +41,9 @@ { "id": "elyra_env_vars" }, + { + "id": "elyra_kubernetes_secrets" + }, { "id": "elyra_outputs" }, @@ -189,6 +193,22 @@ "keyValueEntries": true } }, + { + "parameter_ref": "elyra_kubernetes_secrets", + "control": "custom", + "custom_control_id": "StringArrayControl", + "label": { + "default": "Kubernetes Secrets" + }, + "description": { + "default": "Kubernetes secrets to make available as environment variables to this node. 
The secret name and key given must be present in the Kubernetes namespace where the node is executed or this node will not run.", "placement": "on_panel" }, "data": { "placeholder": "env_var=secret-name:secret-key", "keyValueEntries": true } }, { "parameter_ref": "elyra_outputs", "control": "custom", @@ -218,7 +238,7 @@ "data": { "placeholder": "/mount/path=pvc-name" } - } + } ], "group_info": [ { @@ -260,6 +280,11 @@ "type": "controls", "parameter_refs": ["elyra_env_vars"] }, + { + "id": "elyra_kubernetes_secrets", + "type": "controls", + "parameter_refs": ["elyra_kubernetes_secrets"] + }, { "id": "elyra_outputs", "type": "controls", diff --git a/elyra/templates/pipeline/pipeline_properties_template.jinja2 b/elyra/templates/pipeline/pipeline_properties_template.jinja2 index 7683df5f1..7afa1c2e0 100644 --- a/elyra/templates/pipeline/pipeline_properties_template.jinja2 +++ b/elyra/templates/pipeline/pipeline_properties_template.jinja2 @@ -6,6 +6,7 @@ "cos_object_prefix": "", "elyra_runtime_image": null, "elyra_env_vars": [], + "elyra_kubernetes_secrets": [], "elyra_mounted_volumes": [] }, "parameters": [ @@ -27,6 +28,9 @@ { "id": "elyra_env_vars" }, + { + "id": "elyra_kubernetes_secrets" + }, { "id": "elyra_mounted_volumes" } @@ -99,6 +103,22 @@ "keyValueEntries": true } }, + { + "parameter_ref": "elyra_kubernetes_secrets", + "control": "custom", + "custom_control_id": "StringArrayControl", + "label": { + "default": "Kubernetes Secrets" + }, + "description": { + "default": "Kubernetes secrets to make available as environment variables to all nodes in this pipeline. The secret name and key given must be present in the Kubernetes namespace where the nodes are executed or the pipeline will not run.", + "placement": "on_panel" + }, + "data": { + "placeholder": "env_var=secret-name:secret-key", + "keyValueEntries": true + } + }, { "parameter_ref": "elyra_mounted_volumes", "control": "custom", @@ -163,6 +183,11 @@ "type": "controls", "parameter_refs": ["elyra_env_vars"] }, + { + "id": "elyra_kubernetes_secrets", + "type": "controls", + "parameter_refs": ["elyra_kubernetes_secrets"] + }, { "id": "elyra_mounted_volumes", "type": "controls", diff --git a/elyra/tests/pipeline/resources/properties.json b/elyra/tests/pipeline/resources/properties.json index e0b61e493..06f5f75f8 100644 --- a/elyra/tests/pipeline/resources/properties.json +++ b/elyra/tests/pipeline/resources/properties.json @@ -9,6 +9,7 @@ "elyra_mounted_volumes": [], "elyra_outputs": [], "elyra_env_vars": [], + "elyra_kubernetes_secrets": [], "elyra_dependencies": [], "elyra_include_subdirectories": false }, @@ -40,6 +41,9 @@ { "id": "elyra_env_vars" }, + { + "id": "elyra_kubernetes_secrets" + }, { "id": "elyra_outputs" }, @@ -189,6 +193,22 @@ "keyValueEntries": true } }, + { + "parameter_ref": "elyra_kubernetes_secrets", + "control": "custom", + "custom_control_id": "StringArrayControl", + "label": { + "default": "Kubernetes Secrets" + }, + "description": { + "default": "Kubernetes secrets to make available as environment variables to this node.
The secret name and key given must be present in the Kubernetes namespace where the node is executed or this node will not run.", + "placement": "on_panel" + }, + "data": { + "placeholder": "env_var=secret-name:secret-key", + "keyValueEntries": true + } + }, { "parameter_ref": "elyra_outputs", "control": "custom", @@ -260,6 +280,11 @@ "type": "controls", "parameter_refs": ["elyra_env_vars"] }, + { + "id": "elyra_kubernetes_secrets", + "type": "controls", + "parameter_refs": ["elyra_kubernetes_secrets"] + }, { "id": "elyra_outputs", "type": "controls", diff --git a/elyra/tests/pipeline/test_handlers.py b/elyra/tests/pipeline/test_handlers.py index 8b4f49145..434c86bfd 100644 --- a/elyra/tests/pipeline/test_handlers.py +++ b/elyra/tests/pipeline/test_handlers.py @@ -201,5 +201,6 @@ async def test_get_pipeline_properties_definition(jp_fetch): {"id": "cos_object_prefix"}, {"id": "elyra_runtime_image"}, {"id": "elyra_env_vars"}, + {"id": "elyra_kubernetes_secrets"}, {"id": "elyra_mounted_volumes"}, ] diff --git a/elyra/tests/pipeline/test_pipeline_definition.py b/elyra/tests/pipeline/test_pipeline_definition.py index 976b0c049..4dabd548a 100644 --- a/elyra/tests/pipeline/test_pipeline_definition.py +++ b/elyra/tests/pipeline/test_pipeline_definition.py @@ -19,6 +19,8 @@ from elyra.pipeline import pipeline_constants from elyra.pipeline.pipeline import KeyValueList +from elyra.pipeline.pipeline_constants import ENV_VARIABLES +from elyra.pipeline.pipeline_constants import KUBERNETES_SECRETS from elyra.pipeline.pipeline_definition import PipelineDefinition from elyra.tests.pipeline.util import _read_pipeline_resource @@ -153,6 +155,19 @@ def test_propagate_pipeline_default_properties(monkeypatch): assert node.get_component_parameter(kv_test_property_name) == kv_list_correct +def test_remove_env_vars_with_matching_secrets(): + pipeline_json = _read_pipeline_resource("resources/sample_pipelines/pipeline_valid_with_pipeline_default.json") + pipeline_definition = PipelineDefinition(pipeline_definition=pipeline_json) + node = pipeline_definition.primary_pipeline.nodes.pop() + + # Set kubernetes_secret property to have all the same keys as those in the env_vars property + kubernetes_secrets = KeyValueList(["var1=name1:key1", "var2=name2:key2", "var3=name3:key3"]) + node.set_component_parameter(KUBERNETES_SECRETS, kubernetes_secrets) + + node.remove_env_vars_with_matching_secrets() + assert node.get_component_parameter(ENV_VARIABLES) == [] + + def _check_pipeline_correct_pipeline_name(): pipeline_json = _read_pipeline_resource("resources/sample_pipelines/pipeline_valid.json") pipeline_definition = PipelineDefinition(pipeline_definition=pipeline_json) diff --git a/elyra/tests/pipeline/test_processor_base.py b/elyra/tests/pipeline/test_processor_base.py deleted file mode 100644 index aa94a6a28..000000000 --- a/elyra/tests/pipeline/test_processor_base.py +++ /dev/null @@ -1,71 +0,0 @@ -# -# Copyright 2018-2022 Elyra Authors -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# -import os - -import pytest - -from elyra.pipeline.parser import PipelineParser -from elyra.pipeline.pipeline import GenericOperation -from elyra.pipeline.pipeline import KeyValueList -from elyra.pipeline.processor import RuntimePipelineProcessor -from elyra.tests.pipeline.test_pipeline_parser import _read_pipeline_resource - - -@pytest.fixture -def runtime_processor(setup_factory_data): - processor = RuntimePipelineProcessor(os.getcwd()) - return processor - - -@pytest.fixture -def parsed_pipeline(request): - pipeline_resource = _read_pipeline_resource(request.param) - return PipelineParser().parse(pipeline_json=pipeline_resource) - - -@pytest.fixture -def sample_metadata(): - return { - "api_endpoint": "http://examples.com:31737", - "cos_endpoint": "http://examples.com:31671", - "cos_username": "example", - "cos_password": "example123", - "cos_bucket": "test", - "engine": "Argo", - "tags": [], - } - - -def test_get_volume_mounts(runtime_processor): - mounted_volumes = KeyValueList(["/mount/test=rwx-test-claim", "/mount/test_two=second-claim"]) - component_parameters = { - "filename": "pipelines_test_file", - "env_vars": [], - "runtime_image": "tensorflow/tensorflow:latest", - "mounted_volumes": mounted_volumes, - } - test_operation = GenericOperation( - id="this-is-a-test-id", - type="execution-node", - classifier="execute-notebook-node", - name="test", - component_params=component_parameters, - ) - parsed_volumes_dict = runtime_processor._get_volume_mounts(operation=test_operation) - assert parsed_volumes_dict == { - "/mount/test": "rwx-test-claim", - "/mount/test_two": "second-claim", - } diff --git a/elyra/tests/pipeline/test_validation.py b/elyra/tests/pipeline/test_validation.py index 1dadee195..453d962e1 100644 --- a/elyra/tests/pipeline/test_validation.py +++ b/elyra/tests/pipeline/test_validation.py @@ -20,7 +20,11 @@ from conftest import KFP_COMPONENT_CACHE_INSTANCE import pytest +from elyra.pipeline.pipeline import KubernetesSecret from elyra.pipeline.pipeline import PIPELINE_CURRENT_VERSION +from elyra.pipeline.pipeline import VolumeMount +from elyra.pipeline.pipeline_constants import KUBERNETES_SECRETS +from elyra.pipeline.pipeline_constants import MOUNTED_VOLUMES from elyra.pipeline.pipeline_definition import PipelineDefinition from elyra.pipeline.validation import PipelineValidationManager from elyra.pipeline.validation import ValidationResponse @@ -412,6 +416,48 @@ def test_invalid_node_property_env_var(validation_manager): assert issues[0]["data"]["nodeID"] == "test-id" +def test_invalid_node_property_volumes(validation_manager): + response = ValidationResponse() + node = {"id": "test-id", "app_data": {"label": "test"}} + volumes = [ + VolumeMount("/mount/test", "rwx-test-claim"), # valid + VolumeMount("/mount/test_two", "second-claim"), # valid + VolumeMount("/mount/test_four", "second#claim"), # invalid pvc name + ] + validation_manager._validate_mounted_volumes( + node_id=node["id"], node_label=node["app_data"]["label"], volumes=volumes, response=response + ) + issues = response.to_json().get("issues") + assert issues[0]["severity"] == 1 + assert issues[0]["type"] == "invalidVolumeMount" + assert issues[0]["data"]["propertyName"] == MOUNTED_VOLUMES + assert issues[0]["data"]["nodeID"] == "test-id" + assert "not a valid Kubernetes resource name" in issues[0]["message"] + + +def test_invalid_node_property_secrets(validation_manager): + response = ValidationResponse() + node = {"id": "test-id", "app_data": {"label": "test"}} + secrets = [ + KubernetesSecret("ENV_VAR1", 
"test-secret", "test-key1"), # valid + KubernetesSecret("ENV_VAR2", "test-secret", "test-key2"), # valid + KubernetesSecret("ENV_VAR3", "test-secret", ""), # invalid: improper format of secret name/key + KubernetesSecret("ENV_VAR5", "test%secret", "test-key"), # invalid: not a valid Kubernetes resource name + KubernetesSecret("ENV_VAR6", "test-secret", "test$key2"), # invalid: not a valid Kubernetes secret key + ] + validation_manager._validate_kubernetes_secrets( + node_id=node["id"], node_label=node["app_data"]["label"], secrets=secrets, response=response + ) + issues = response.to_json().get("issues") + assert issues[0]["severity"] == 1 + assert issues[0]["type"] == "invalidKubernetesSecret" + assert issues[0]["data"]["propertyName"] == KUBERNETES_SECRETS + assert issues[0]["data"]["nodeID"] == "test-id" + assert "improperly formatted representation of secret name and key" in issues[0]["message"] + assert "not a valid Kubernetes resource name" in issues[1]["message"] + assert "not a valid Kubernetes secret key" in issues[2]["message"] + + def test_valid_node_property_label(validation_manager): response = ValidationResponse() node = {"id": "test-id"} diff --git a/elyra/util/kubernetes.py b/elyra/util/kubernetes.py index b0a81e145..887dbee90 100644 --- a/elyra/util/kubernetes.py +++ b/elyra/util/kubernetes.py @@ -13,6 +13,7 @@ # See the License for the specific language governing permissions and # limitations under the License. # +import re def is_valid_kubernetes_resource_name(name: str) -> bool: @@ -37,3 +38,16 @@ def is_valid_kubernetes_resource_name(name: str) -> bool: elif char not in ["-", "."]: return False return True + + +def is_valid_kubernetes_key(name: str) -> bool: + """ + Returns a truthy value indicating whether name meets the kubernetes + naming constraints, as outlined in the link below. 
+ + https://kubernetes.io/docs/concepts/configuration/secret/#restriction-names-data + """ + if name is None: + return False + + return re.match(r"^[a-zA-Z0-9._-]+$", name) is not None diff --git a/tests/snapshots/pipeline-editor-tests/matches-complex-pipeline-snapshot.1.snap b/tests/snapshots/pipeline-editor-tests/matches-complex-pipeline-snapshot.1.snap index c37998ea4..4df26a317 100644 --- a/tests/snapshots/pipeline-editor-tests/matches-complex-pipeline-snapshot.1.snap +++ b/tests/snapshots/pipeline-editor-tests/matches-complex-pipeline-snapshot.1.snap @@ -26,6 +26,7 @@ Object { ], "filename": "producer.ipynb", "include_subdirectories": false, + "kubernetes_secrets": Array [], "mounted_volumes": Array [], "outputs": Array [ "output-1.csv", @@ -81,6 +82,7 @@ Object { "env_vars": Array [], "filename": "consumer.ipynb", "include_subdirectories": false, + "kubernetes_secrets": Array [], "mounted_volumes": Array [], "outputs": Array [], "runtime_image": "continuumio/anaconda3:2021.11", @@ -135,6 +137,7 @@ Object { "env_vars": Array [], "filename": "../scripts/setup.py", "include_subdirectories": false, + "kubernetes_secrets": Array [], "mounted_volumes": Array [], "outputs": Array [], "runtime_image": "continuumio/anaconda3:2021.11", @@ -187,6 +190,7 @@ Object { "env_vars": Array [], "filename": "create-source-files.py", "include_subdirectories": false, + "kubernetes_secrets": Array [], "mounted_volumes": Array [], "outputs": Array [ "input-1.csv", @@ -242,6 +246,7 @@ Object { "env_vars": Array [], "filename": "producer-script.py", "include_subdirectories": false, + "kubernetes_secrets": Array [], "mounted_volumes": Array [], "outputs": Array [ "output-3.csv", diff --git a/tests/snapshots/pipeline-editor-tests/matches-simple-pipeline-snapshot.1.snap b/tests/snapshots/pipeline-editor-tests/matches-simple-pipeline-snapshot.1.snap index 5e639f0d5..96bb64bfb 100644 --- a/tests/snapshots/pipeline-editor-tests/matches-simple-pipeline-snapshot.1.snap +++ b/tests/snapshots/pipeline-editor-tests/matches-simple-pipeline-snapshot.1.snap @@ -27,6 +27,7 @@ Object { ], "filename": "helloworld.ipynb", "include_subdirectories": false, + "kubernetes_secrets": Array [], "mounted_volumes": Array [], "outputs": Array [], },
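For orientation, here is a minimal sketch (not part of the diff) of the data path this change introduces, using only names that appear in the diff above: a node property entry of the form `ENV_VAR=secret-name:secret-key` is parsed into the new `KubernetesSecret` dataclass, mirroring `Node.convert_data_class_properties`, and for Kubeflow Pipelines is rendered as a `V1EnvVar` whose value is resolved from the secret at run time, mirroring `ExecuteFileOp`. The `parse_secret_entry` helper name is illustrative, and the sketch assumes the `kubernetes` Python client is installed.

```python
from dataclasses import dataclass

from kubernetes.client.models import V1EnvVar
from kubernetes.client.models import V1EnvVarSource
from kubernetes.client.models import V1SecretKeySelector


@dataclass
class KubernetesSecret:
    env_var: str  # name of the environment variable exposed to the notebook/script
    name: str  # name of the Kubernetes secret resource
    key: str  # key within the secret's data


def parse_secret_entry(entry: str) -> KubernetesSecret:
    """Parse 'ENV_VAR=secret-name:secret-key', as Node.convert_data_class_properties does."""
    env_var, secret = entry.split("=", 1)
    secret_name, *optional_key = secret.split(":", 1)
    secret_key = optional_key[0].strip() if optional_key else ""
    return KubernetesSecret(env_var, secret_name.strip(), secret_key)


def to_env_var(secret: KubernetesSecret) -> V1EnvVar:
    """Render the secret as a container env var with a secretKeyRef, as ExecuteFileOp does.

    The sensitive value is looked up by the cluster at run time, so it never
    appears in the pipeline file or in the generated pipeline specification.
    """
    return V1EnvVar(
        name=secret.env_var,
        value_from=V1EnvVarSource(secret_key_ref=V1SecretKeySelector(name=secret.name, key=secret.key)),
    )


if __name__ == "__main__":
    print(to_env_var(parse_secret_entry("MY_VAR=my-secret:my-key")))
```

For Apache Airflow, the template changes above achieve the same effect by emitting `Secret(deploy_type='env', deploy_target=..., secret=..., key=...)` objects and passing them to the `KubernetesPodOperator` via its `secrets` argument.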