Add markdownlint to ensure consistent formatting #14

Merged 5 commits on Mar 28, 2025
14 changes: 14 additions & 0 deletions .editorconfig
@@ -0,0 +1,14 @@
# EditorConfig is awesome: https://EditorConfig.org

# top-most EditorConfig file
root = true

# Unix-style newlines with a newline ending every file
[*]
end_of_line = lf
indent_style = space

[*.md]
indent_size = 2
insert_final_newline = true
trim_trailing_whitespace = true
17 changes: 14 additions & 3 deletions .github/workflows/gh-pages-pr.yaml
@@ -7,13 +7,24 @@ on:
paths:
- .pages/**
- docs/**
- .github/workflows/gh-pages.yaml
- README.md
- .github/workflows/gh-pages-pr.yaml
- '**.md'

env:
PLANTUML_VERSION: '1.2024.8'

jobs:
lint:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4

- name: markdownlint-cli2-action
uses: DavidAnson/markdownlint-cli2-action@v9
with:
globs: '**/*.md'

build:
runs-on: ubuntu-latest
steps:
@@ -25,7 +36,7 @@ jobs:
with:
hugo-version: '0.129.0'
extended: true

- name: Setup PlantUML
run: |
sudo apt-get update --yes
2 changes: 1 addition & 1 deletion .github/workflows/gh-pages.yaml
@@ -8,7 +8,7 @@ on:
- .pages/**
- docs/**
- .github/workflows/gh-pages.yaml
- README.md
- '**.md'

# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:
8 changes: 8 additions & 0 deletions .markdownlint.yaml
@@ -0,0 +1,8 @@
# Default state for all rules
default: true

# Path to configuration file to extend
extends: null

# MD013/line-length : Line length : https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md013.md
MD013: false
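With this configuration in the repository root, contributors can run the same lint locally that the CI job runs; the GitHub Action wraps the `markdownlint-cli2` tool. A minimal sketch, assuming Node.js and `npx` are available:

```shell
# Lint every Markdown file in the repository; markdownlint-cli2 picks up
# .markdownlint.yaml from the project root automatically.
npx markdownlint-cli2 "**/*.md"

# Optionally apply the fixes markdownlint can make automatically
# (e.g. trailing whitespace, missing final newline):
npx markdownlint-cli2 --fix "**/*.md"
```

Running this before pushing keeps the `lint` job from failing on mechanical issues.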
3 changes: 1 addition & 2 deletions .pages/archetypes/adr.md
@@ -8,7 +8,6 @@ date: "{{ time.Now.Format "2006-01-02" }}"
| --- | --- | --- |
| {"proposed \| rejected \| accepted \| deprecated \| … \| superseded by ADR-0123"} | {YYYY-MM-DD when the decision was last updated} | {list everyone involved in the decision} |


## Context and Problem Statement

{Describe the context and problem statement, e.g., in free form using two to three sentences or in the form of an illustrative story. You may want to articulate the problem in form of a question and add links to collaboration boards or issue management systems.}
@@ -29,4 +28,4 @@ Chosen option: "{title of option 1}", because {justification. e.g., only option,

* Good, because {positive consequence, e.g., improvement of one or more desired qualities, …}
* Bad, because {negative consequence, e.g., compromising one or more desired qualities, …}
* … <!-- numbers of consequences can vary -->
* … <!-- numbers of consequences can vary -->
1 change: 1 addition & 0 deletions README.md
@@ -1,4 +1,5 @@
# On-prem_Kubernetes_Guide

An opinionated guide to on-prem Kubernetes

## How to run local
14 changes: 10 additions & 4 deletions contributing.md
@@ -1,3 +1,5 @@
# Contributing

This guide is meant to hold all of the experience of Eficodeans working with Kubernetes, distilled into one easily readable guide.

This means that we welcome all contributions to 'what is the right tech stack'.
@@ -6,11 +8,13 @@ There are fundamentally 2 ways to contribute to this guide: recommend a tool, an

## Recommend a tool

If you want to recommend a tool, the place to start is to write an Architecture Decision Record (ADR). All tools recommended in the guide are reflected in an ADR.

To add an ADR do the following:

`hugo new --kind adr <DesiredFolder>/ADRs/<NameOfADRFile>.md --source .pages`
```shell
hugo new --kind adr <DesiredFolder>/ADRs/<NameOfADRFile>.md --source .pages
```

Fill out the sections in the generated ADR.

@@ -50,8 +54,10 @@ You can use either Devbox or Dev Containers to set up a consistent development e
To preview the website locally while making changes:

1. Run the Hugo development server:

```sh
hugo server --source .pages
```
2. Open your browser and navigate to `http://localhost:1313/On-prem_Kubernetes_Guide/`
3. The website will automatically refresh when you make changes to the source files
5 changes: 5 additions & 0 deletions docs/_index.md
@@ -3,10 +3,13 @@ title: On-premises Kubernetes guide
---

An opinionated guide to building and running your on-prem tech-stack for running Kubernetes.

## Introduction

Deploying and operating Kubernetes on-premises is fundamentally different from doing so in the cloud. Without a managed control plane or provider-operated infrastructure, organizations must take full ownership of networking, security, and operational automation to ensure a stable and secure environment. The complexity of these decisions can quickly lead to fragmentation, inefficiency, and technical debt if not approached with a well-defined strategy.

This guide delivers an opinionated, battle-tested roadmap for building a production-grade on-prem Kubernetes environment, structured around three foundational pillars:

- [Getting your hardware ready to work with Kubernetes](hardware_ready/_index.md)
- [Getting your software ready to work with Kubernetes](software_ready/_index.md)
- [Working with Kubernetes](working_with_k8s/_index.md)
@@ -16,9 +19,11 @@ Instead of presenting endless options, we provide clear, prescriptive recommenda
By following this approach, organizations can confidently design, deploy, and sustain an optimized, resilient, and future-compatible Kubernetes cluster, making informed decisions that balance control, flexibility, and operational efficiency from day one.

## Key differences between On-prem and Cloud Kubernetes

One of the biggest challenges of running Kubernetes on-prem is the absence of elastic cloud-based scaling, where compute and storage resources can be provisioned on demand. Instead, on-prem environments require careful capacity planning to avoid resource contention while minimizing unnecessary infrastructure costs. Additionally, the operational burden extends beyond initial deployment—day-two operations such as upgrades, observability, disaster recovery, and compliance enforcement demand greater automation and proactive management to maintain stability and performance. Without cloud-native integrations, teams must build and maintain their own ecosystem of networking, storage, and security solutions, ensuring that each component is optimized for reliability and maintainability. These factors make on-prem Kubernetes deployments more complex but also provide greater control over cost, security, and regulatory compliance.

## Document Structure

With the introduction and key differences out of the way, we can now get into the important parts of the document. As mentioned in the introduction, the document is structured around three foundational pillars, namely:

- [Getting your hardware ready to work with Kubernetes](hardware_ready/_index.md)
15 changes: 14 additions & 1 deletion docs/guide.md
@@ -1,9 +1,15 @@
---
title: On-premises Kubernetes guide
---

An opinionated guide to building and running your on-prem tech-stack for running Kubernetes.

## Introduction

Deploying and operating Kubernetes on-premises is fundamentally different from doing so in the cloud. Without a managed control plane or provider-operated infrastructure, organizations must take full ownership of networking, security, and operational automation to ensure a stable and secure environment. The complexity of these decisions can quickly lead to fragmentation, inefficiency, and technical debt if not approached with a well-defined strategy.

This guide delivers an opinionated, battle-tested roadmap for building a production-grade on-prem Kubernetes environment, structured around three foundational pillars:

- Getting your hardware ready to work with Kubernetes
- Getting your software ready to work with Kubernetes
- Working with Kubernetes
@@ -13,9 +19,11 @@ Instead of presenting endless options, we provide clear, prescriptive recommenda
By following this approach, organizations can confidently design, deploy, and sustain an optimized, resilient, and future-compatible Kubernetes cluster, making informed decisions that balance control, flexibility, and operational efficiency from day one.

## Key differences between On-prem and Cloud Kubernetes

One of the biggest challenges of running Kubernetes on-prem is the absence of elastic cloud-based scaling, where compute and storage resources can be provisioned on demand. Instead, on-prem environments require careful capacity planning to avoid resource contention while minimizing unnecessary infrastructure costs. Additionally, the operational burden extends beyond initial deployment—day-two operations such as upgrades, observability, disaster recovery, and compliance enforcement demand greater automation and proactive management to maintain stability and performance. Without cloud-native integrations, teams must build and maintain their own ecosystem of networking, storage, and security solutions, ensuring that each component is optimized for reliability and maintainability. These factors make on-prem Kubernetes deployments more complex but also provide greater control over cost, security, and regulatory compliance.

## Document Structure

With the introduction and key differences out of the way, we can now get into the important parts of the document. As mentioned in the introduction, the document is structured around three foundational pillars, namely:

- Getting your hardware ready to work with Kubernetes
@@ -25,7 +33,9 @@ With the introduction and key differences out of the way, we can now get into th
For each of these pillars, we will be providing you with primary and secondary recommendations regarding tech-stack and any accompanying tools. These recommendations will go over the tools themselves and provide you with arguments for choosing them, as well as listing out common pitfalls and important points of consideration.

## Getting your hardware ready to work with Kubernetes

### Virtualisation or bare metal

One important aspect is to determine whether the clusters should run on an OS directly on the machines, or if it makes sense to add a virtualisation layer.

Running directly on the hardware gives you a 1-1 relationship between the machines and the nodes. This is not always advised if the machines are particularly beefy. Running directly on the hardware will of course have lower latency than when adding a virtualisation layer.
@@ -35,16 +45,19 @@ A virtualisation layer can benefit via abstracting the actual hardware, and enab
In case virtualisation is chosen, the below recommendations are what you would run in your VM. For setting up your VMs we recommend Talos with KubeVirt.

### Decision Matrix

| Problem domain | Description | Reason for importance | Primary tool recommendation | Secondary tool recommendation |
|:---:|:---:|:---:|:---:|:---:|
| Kubernetes Node Operating System | The Operating System running on each of the hosts that will be part of your Kubernetes cluster | Choosing the right OS will be the foundation for building a production-grade Kubernetes cluster | Talos Linux | Flatcar Linux |
| Storage solution | The underlying storage capabilities which Kubernetes will leverage to provide persistence for stateful workloads | Choosing the right storage solution for your cluster's needs is important, as there are many tradeoffs to balance, e.g. redundancy vs. complexity | Longhorn (iSCSI) or OpenEBS (iSCSI) | Rook Ceph |
| Container Runtime (CRI) | The software that is responsible for running containers | You need a working container runtime on each node in your cluster, so that the kubelet can launch pods and their containers | containerd (embedded in Talos; the de facto default in most distributions) | |
| Network plugin (CNI) | Plugin used for cluster networking | A CNI plugin is required to implement the Kubernetes network model | Cilium | Calico |


## Getting your software ready to work with Kubernetes

<!-- markdownlint-disable MD024 -->
### Decision Matrix

| Problem domain | Description | Reason for importance | Primary tool recommendation | Secondary tool recommendation |
|:---:|:---:|:---:|:---:|:---:|
| Image Registry | A common place to store and fetch images | High availability, secure access control | Harbor | Sonatype Nexus |
26 changes: 8 additions & 18 deletions docs/hardware_ready/ADRs/Cilium_as_network_plugin.md
@@ -6,12 +6,9 @@ title: Use Cilium as Network Plugin
| --- | --- | --- |
| proposed | 2025-02-18 | Alexandra Aldershaab, Steffen Petersen |


## Context and Problem Statement

A CNI plugin is required to implement the Kubernetes network model by assigning IP addresses from preallocated CIDR ranges
to pods and nodes. The CNI plugin is also responsible for enforcing network policies that control how traffic flows between
namespaces as well as between the cluster and the internet.
A CNI plugin is required to implement the Kubernetes network model by assigning IP addresses from preallocated CIDR ranges to pods and nodes. The CNI plugin is also responsible for enforcing network policies that control how traffic flows between namespaces as well as between the cluster and the internet.

## Considered Options

@@ -21,9 +18,7 @@ namespaces as well as between the cluster and the internet.

## Decision Outcome

Chosen option: **Cilium**, because it is a fully conformant CNI plugin that works in both cloud and on-premises environments
while also providing support for network policies as well as more advanced networking features. Cilium has also gained
rapid adoption in the Kubernetes community and is considered the future standard of CNI plugins.
Chosen option: **Cilium**, because it is a fully conformant CNI plugin that works in both cloud and on-premises environments while also providing support for network policies as well as more advanced networking features. Cilium has also gained rapid adoption in the Kubernetes community and is considered the future standard of CNI plugins.

Flannel was considered, but it does not support network policies, which is a hard requirement.

@@ -32,14 +27,9 @@ Calico, while supporting Network policies, falls short compared to Cilium in ter
### Consequences

* Good, because Cilium provides support for network policies on L7 as well as the usual L3/L4.
* Good, because Cilium provides support for BGP controlplane integration, allowing for seamless integration with existing
networking infrastructure.
* Good, because Cilium provides a feature called Egress Gateway which allows for traffic exiting the cluster to be routed
through specific nodes, facilitating smooth integration with existing security infrastructure such as IP-based firewalls.
* Good, because Cilium comes with a utility called Hubble which provides deep observability into the network traffic, allowing
for easy debugging and troubleshooting of network issues.

* Bad, because Cilium requires you to understand both Kubernetes networking and tradition networking concepts to fully utilize
its advanced features.
* Bad, because Cilium does not come installed by default on any flavor of Kubernetes, requiring additional steps to
install it and provide necessary custom configuration.
* Good, because Cilium provides support for BGP controlplane integration, allowing for seamless integration with existing networking infrastructure.
* Good, because Cilium provides a feature called Egress Gateway which allows for traffic exiting the cluster to be routed through specific nodes, facilitating smooth integration with existing security infrastructure such as IP-based firewalls.
* Good, because Cilium comes with a utility called Hubble which provides deep observability into the network traffic, allowing for easy debugging and troubleshooting of network issues.

* Bad, because Cilium requires you to understand both Kubernetes networking and traditional networking concepts to fully utilize its advanced features.
* Bad, because Cilium does not come installed by default on any flavor of Kubernetes, requiring additional steps to install it and provide necessary custom configuration.
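To make the L7 policy support mentioned above concrete, here is an illustrative CiliumNetworkPolicy that admits only HTTP GET requests to `/healthz` on the selected pods. This is a hypothetical sketch, not part of this repository; the labels and port number are assumptions:

```yaml
# Hypothetical sketch: pods labeled app=api accept only GET /healthz
# on TCP 8080 from pods labeled app=probe; other HTTP traffic is denied.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-healthz-only
spec:
  endpointSelector:
    matchLabels:
      app: api
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: probe
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: "GET"
                path: "/healthz"
```

Rules at this granularity (method and path, not just port) are what plain L3/L4 CNI plugins cannot express.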
4 changes: 2 additions & 2 deletions docs/hardware_ready/ADRs/talos_as_os.md
@@ -8,10 +8,10 @@ date: "2025-02-25"
| --- | --- | --- |
| proposed | 2025-02-25 | Sofus Albertsen |


## Context and Problem Statement

Choosing the right operating system for your Kubernetes cluster is crucial for stability, security, and operational efficiency. The OS should be optimized for container workloads, minimize overhead, and integrate well with Infrastructure as Code (IaC) practices.

## Considered Options

* Talos OS
@@ -37,4 +37,4 @@ While their dashboards can simplify initial setup, they can also encourage "clic

* **Bad:** The learning curve for Talos OS might be steeper initially for teams unfamiliar with its API-driven approach.
* **Bad:** The lack of a graphical user interface might be a drawback for some users accustomed to traditional OS management.
* **Bad:** Talos is a relatively newer project compared to OpenShift or Rancher, therefore community support and available resources might be smaller.
* **Bad:** Talos is a relatively newer project compared to OpenShift or Rancher, therefore community support and available resources might be smaller.
2 changes: 2 additions & 0 deletions docs/hardware_ready/_index.md
@@ -2,6 +2,7 @@
title: Getting your hardware ready
---
## Virtualisation or bare metal

One important aspect is to determine whether the clusters should run on an OS directly on the machines, or if it makes sense to add a virtualisation layer.

Running directly on the hardware gives you a 1-1 relationship between the machines and the nodes. This is not always advised if the machines are particularly beefy. Running directly on the hardware will of course have lower latency than when adding a virtualisation layer.
@@ -11,6 +12,7 @@ A virtualisation layer can benefit via abstracting the actual hardware, and enab
In case virtualisation is chosen, the below recommendations are what you would run in your VM. For setting up your VMs we recommend Talos with KubeVirt.

## Decision Matrix

| Problem domain | Description | Reason for importance | Tool recommendation |
|:---:|:---:|:---:|:---:|
| Kubernetes Node Operating System | The Operating System running on each of the hosts that will be part of your Kubernetes cluster | Choosing the right OS will be the foundation for building a production-grade Kubernetes cluster | [Talos OS](hardware_ready/ADRs/talos_as_os.md) |