Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Kubernetes support #206

Closed
wants to merge 6 commits into from

Conversation

kamushadenes
Copy link
Contributor

@kamushadenes kamushadenes commented Feb 7, 2025

Solution

This pull request introduces a new container abstraction that consolidates Docker and Kubernetes container operations under a unified interface. The changes include:

  • Renaming the package from "docker" to "container"

  • Introducing the following core components:

    • ContainerImpl: Provides a unified interface to interact with container runtimes. It smartly selects between the Docker wrapper and a new Kubernetes wrapper based on the runtime environment.
    • ContainerOpts: Encapsulates container configuration options. In contrast to the old DockerOpts (which used maps for volumes and environment variables), the new design uses a slice for volumes and a map for environment variables. This makes it easier to manage and extend configurations.
    • DockerOpts: Now supports additional options such as restart policy and aligns with the new structure.
    • K8sOpts: Contains options specific to running containers in a Kubernetes cluster.
  • Adaptations in application code:

    • In the Airbyte provider storage module, the previous docker.DockerWrapper has been replaced by the unified container implementation (ContainerImpl). This ensures that all container operations (e.g., volume mounting, environment variables, command execution) work consistently whether running in a Docker or Kubernetes environment.
    • In the DBT runner within the transformer registry, similar changes have been applied by replacing references to the old Docker abstraction with the new container abstraction.
  • New Kubernetes wrapper implementation

    • A new wrapper (K8sWrapper) has been introduced. When the environment variable KUBERNETES_SERVICE_HOST is detected, the container implementation will instantiate the Kubernetes wrapper rather than the Docker one.
    • This wrapper provides methods for creating pods, retrieving logs, and deleting pods in a Kubernetes cluster, bridging the gap between local container operations and cloud-native orchestration.
    • Comprehensive testing has been added, including unit tests for the Kubernetes options and integration tests that create a temporary Kind cluster, where an actual pod is spawned to test log retrieval.

The diagrams below explain the flow and interactions of the new container abstraction:

1. Sequence Diagram

Purpose: Illustrates the decision-making process of choosing the proper container runtime interface.

  • The application creates a configuration using ContainerOpts.
  • The helper function NewContainerImpl(l) checks if the environment variable KUBERNETES_SERVICE_HOST is set.
  • If running in a Kubernetes environment, NewK8sWrapper is invoked to create a Kubernetes client.
  • Otherwise, the system falls back to the traditional Docker wrapper via NewDockerWrapper(l).
  • ContainerImpl converts these options to the runtime-specific format and delegates to either the Docker or Kubernetes wrapper.
  • The respective wrapper then pulls the image (if necessary) and runs the container/pod, returning logs and status.
flowchart TD
    A[Create ContainerOpts]
    B[NewContainerImpl]
    C{Is Kubernetes Environment?}
    D[NewK8sWrapper]
    E[NewDockerWrapper]
    F[Run Container/Pod]
    G[Return Logs & Status]

    A --> B
    B --> C
    C -- Yes --> D
    C -- No --> E
    D --> F
    E --> F
    F --> G
Loading

2. Class Diagram

Purpose: Shows the relationships between key components managing container operations.

  • The central ContainerImpl interface defines common methods such as Run() and Pull().
  • The DockerWrapper and K8sWrapper implement the ContainerImpl interface.
  • ContainerOpts encapsulates configuration shared between the two implementations.
  • K8sOpts and DockerOpts serve as the runtime-specific configuration options.
classDiagram
    class ContainerImpl {
      +Run(ctx, opts ContainerOpts) (stdout io.Reader, stderr io.Reader, err error)
      +Pull(ctx, image string, opts ImagePullOptions) error
    }
    
    class DockerWrapper {
      +RunContainer(ctx, opts DockerOpts) (io.Reader, io.Reader, error)
      +Pull(ctx, image string, opts ImagePullOptions) error
    }
    
    class K8sWrapper {
      +Run(ctx, opts ContainerOpts) (io.Reader, io.Reader, error)
      +RunPod(ctx, opts K8sOpts) (*bytes.Buffer, error)
      +Pull(ctx, image string, opts ImagePullOptions) error
    }
    
    class ContainerOpts {
      +Env map[string]string
      +Volumes []Volume
      +LogDriver string
      +LogOptions map[string]string
      +Image string
      +Network string
      +ContainerName string
      +Command []string
      +Args []string
      +Timeout time.Duration
      +AttachStdout bool
      +AttachStderr bool
      +RestartPolicy string
    }
    
    class DockerOpts
    class K8sOpts

    ContainerImpl <|.. DockerWrapper
    ContainerImpl <|.. K8sWrapper
    ContainerOpts --> DockerOpts : convertTo()
    ContainerOpts --> K8sOpts : convertTo()
Loading

Let me know if further refinements are needed!

Signed-off-by: Henrique Goncalves <hgoncalves@altinity.com>
@kamushadenes kamushadenes marked this pull request as draft February 7, 2025 13:31
@kamushadenes
Copy link
Contributor Author

kamushadenes commented Feb 7, 2025

@laskoviymishka can we get engineerd/setup-kind in the approved actions list? This is used to spawn a Kubernetes cluster for the integration test.

(Note this PR is still a draft, no need to review now)

@laskoviymishka
Copy link
Contributor

Can you plz elaborate what exact problem this PR try to solve? i don't get it.

@kamushadenes
Copy link
Contributor Author

@laskoviymishka

When running inside a Kubernetes cluster, running Docker-in-Docker would require elevated privileges that are not always desirable, given how much important stuff is usually running inside a cluster.

This PR adds the ability to talk directly to the Kubernetes API, allowing to use RBAC rules to restrict the "blast radius" of the spawned containers, while also removing the need for DinD (only when running inside a cluster).

@laskoviymishka
Copy link
Contributor

Why not simply change the way how docker works in helm?
Spawn opt-in docker pod and pass it's docker socket to main pod.
Also we shall preserve ability to run transfer outside k8s env, for example inside ec2 instance as pure dockerd process.

@kamushadenes
Copy link
Contributor Author

When not running inside a cluster, it will work as it did before, no K8s involved. This change only applies when it detects it's running inside a cluster.

I don't think spawning a docker pod is a good approach when Kubernetes is already able to run containers.

@laskoviymishka
Copy link
Contributor

laskoviymishka commented Feb 7, 2025

Okay, that sounds reasonable, just one important note - try to avoid any extra dependencies, and minimize changes to go.mod, it's should be as minimal as possible.

.
Signed-off-by: Henrique Goncalves <hgoncalves@altinity.com>
Signed-off-by: Henrique Goncalves <hgoncalves@altinity.com>
@kamushadenes
Copy link
Contributor Author

Only the mandatory k8s api was added, all others are indirect dependencies of it. Hope that's ok.

@kamushadenes kamushadenes marked this pull request as ready for review February 10, 2025 16:00
@laskoviymishka
Copy link
Contributor

Should be fine, just for broader context - any change of go-mod need to be in-sync with internal mono-repo, so if diffs are too big it can be problematic to merge (need a lot of manual work to fix and re-sync stuff back and force).

@kamushadenes
Copy link
Contributor Author

Got it, thanks for the context!

@laskoviymishka
Copy link
Contributor

laskoviymishka commented Feb 10, 2025

@kamushadenes can you plz create a high-level overview of changes with some design highlights? this pr is relatively big, so we can start looking more precisely.

Better to add those info in PR Description

@laskoviymishka
Copy link
Contributor

You can take a look on #147 as example of well structured description for longer PR.
It should't be that detailed, but if you can navigate through your newly added abstractions - it would make reviewer life much easier

@kamushadenes
Copy link
Contributor Author

Doc updated.

@laskoviymishka
Copy link
Contributor

That's a nice written pr description 😊
Thanks!

Signed-off-by: Henrique Goncalves <hgoncalves@altinity.com>
Copy link
Contributor

@laskoviymishka laskoviymishka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc: @boooec

Copy link

robot-magpie bot commented Feb 12, 2025

@laskoviymishka has imported your pull request. If you are a member of Transfer team, you can view this diff.

Copy link

robot-magpie bot commented Feb 18, 2025

✅ This pull request is being closed because it has been successfully merged into our internal monorepository.
Your changes will be pushed to this repository soon. Thank you for your contribution!

@robot-magpie robot-magpie bot closed this Feb 18, 2025
robot-piglet pushed a commit that referenced this pull request Feb 18, 2025
## **Solution**

This pull request introduces a new container abstraction that consolidates Docker and Kubernetes container operations under a unified interface. The changes include:

* Renaming the package from "docker" to "container"
* Introducing the following core components:
  - **ContainerImpl**: Provides a unified interface to interact with container runtimes. It smartly selects between the Docker wrapper and a new Kubernetes wrapper based on the runtime environment.
  - **ContainerOpts**: Encapsulates container configuration options. In contrast to the old DockerOpts (which used maps for volumes and environment variables), the new design uses a slice for volumes and a map for environment variables. This makes it easier to manage and extend configurations.
  - **DockerOpts**: Now supports additional options such as restart policy and aligns with the new structure.
  - **K8sOpts**: Contains options specific to running containers in a Kubernetes cluster.

* Adaptations in application code:
  - In the Airbyte provider storage module, the previous `docker.DockerWrapper` has been replaced by the unified container implementation (`ContainerImpl`). This ensures that all container operations (e.g., volume mounting, environment variables, command execution) work consistently whether running in a Docker or Kubernetes environment.
  - In the DBT runner within the transformer registry, similar changes have been applied by replacing references to the old Docker abstraction with the new container abstraction.

* New Kubernetes wrapper implementation
  - A new wrapper (`K8sWrapper`) has been introduced. When the environment variable `KUBERNETES_SERVICE_HOST` is detected, the container implementation will instantiate the Kubernetes wrapper rather than the Docker one.
  - This wrapper provides methods for creating pods, retrieving logs, and deleting pods in a Kubernetes cluster, bridging the gap between local container operations and cloud-native orchestration.
  - Comprehensive testing has been added, including unit tests for the Kubernetes options and integration tests that create a temporary Kind cluster, where an actual pod is spawned to test log retrieval.

The diagrams below explain the flow and interactions of the new container abstraction:

### **1. Sequence Diagram**

**Purpose:** Illustrates the decision-making process of choosing the proper container runtime interface.

* The application creates a configuration using **ContainerOpts**.
* The helper function `NewContainerImpl(l)` checks if the environment variable `KUBERNETES_SERVICE_HOST` is set.
* If running in a Kubernetes environment, `NewK8sWrapper` is invoked to create a Kubernetes client.
* Otherwise, the system falls back to the traditional Docker wrapper via `NewDockerWrapper(l)`.
* **ContainerImpl** converts these options to the runtime-specific format and delegates to either the Docker or Kubernetes wrapper.
* The respective wrapper then pulls the image (if necessary) and runs the container/pod, returning logs and status.

```mermaid
flowchart TD
    A[Create ContainerOpts]
    B[NewContainerImpl]
    C{Is Kubernetes Environment?}
    D[NewK8sWrapper]
    E[NewDockerWrapper]
    F[Run Container/Pod]
    G[Return Logs & Status]

    A --> B
    B --> C
    C -- Yes --> D
    C -- No --> E
    D --> F
    E --> F
    F --> G
```

### **2. Class Diagram**

**Purpose:** Shows the relationships between key components managing container operations.

* The central **ContainerImpl** interface defines common methods such as Run() and Pull().
* The **DockerWrapper** and **K8sWrapper** implement the ContainerImpl interface.
* **ContainerOpts** encapsulates configuration shared between the two implementations.
* **K8sOpts** and **DockerOpts** serve as the runtime-specific configuration options.

```mermaid
classDiagram
    class ContainerImpl {
      +Run(ctx, opts ContainerOpts) (stdout io.Reader, stderr io.Reader, err error)
      +Pull(ctx, image string, opts ImagePullOptions) error
    }

    class DockerWrapper {
      +RunContainer(ctx, opts DockerOpts) (io.Reader, io.Reader, error)
      +Pull(ctx, image string, opts ImagePullOptions) error
    }

    class K8sWrapper {
      +Run(ctx, opts ContainerOpts) (io.Reader, io.Reader, error)
      +RunPod(ctx, opts K8sOpts) (*bytes.Buffer, error)
      +Pull(ctx, image string, opts ImagePullOptions) error
    }

    class ContainerOpts {
      +Env map[string]string
      +Volumes []Volume
      +LogDriver string
      +LogOptions map[string]string
      +Image string
      +Network string
      +ContainerName string
      +Command []string
      +Args []string
      +Timeout time.Duration
      +AttachStdout bool
      +AttachStderr bool
      +RestartPolicy string
    }

    class DockerOpts
    class K8sOpts

    ContainerImpl <|.. DockerWrapper
    ContainerImpl <|.. K8sWrapper
    ContainerOpts --> DockerOpts : convertTo()
    ContainerOpts --> K8sOpts : convertTo()
```

Let me know if further refinements are needed!

---

Pull Request resolved: #206

Co-authored-by: tserakhau <tserakhau@double.cloud>
Co-authored-by: tserakhau <tserakhau@double.cloud>
Co-authored-by: tserakhau <tserakhau@double.cloud>
Co-authored-by: tserakhau <tserakhau@double.cloud>
Co-authored-by: tserakhau <tserakhau@double.cloud>
Co-authored-by: tserakhau <tserakhau@double.cloud>
commit_hash:19988b16b4a75ec523f1e2476a60c890837066a0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants