New kubernetes backend #21796

Open: grihabor wants to merge 24 commits into main

@grihabor grihabor commented Dec 22, 2024

While the helm backend already exists and works fine for Kubernetes, creating a Helm chart can be overkill for something as small as a single ConfigMap or Secret. The new kubernetes backend can deploy a single object directly from a YAML file.

See docs for more details on usage.

I'm also planning to contribute our integration with the python_format_string target (I will open a PR), which handles simple Python templating in text files.

> fd _category_ docs/docs/ --max-depth 2 --exec rg position '{}' --with-filename --line-number | sort -k 3,3 -n
docs/docs/introduction/_category_.json:3:  "position": 1
docs/docs/getting-started/_category_.json:3:  "position": 2
docs/docs/using-pants/_category_.json:3:  "position": 3
docs/docs/python/_category_.json:3:  "position": 4
docs/docs/go/_category_.json:3:  "position": 5
docs/docs/jvm/_category_.json:3:  "position": 6
docs/docs/shell/_category_.json:3:  "position": 7
docs/docs/docker/_category_.json:3:  "position": 8
docs/docs/kubernetes/_category_.json:3:  "position": 9
docs/docs/helm/_category_.json:3:  "position": 10
docs/docs/terraform/_category_.json:3:  "position": 11
docs/docs/sql/_category_.json:3:  "position": 12
docs/docs/ad-hoc-tools/_category_.json:3:  "position": 13
docs/docs/javascript/_category_.json:3:  "position": 13
docs/docs/writing-plugins/_category_.json:3:  "position": 14
docs/docs/releases/_category_.json:3:  "position": 15
docs/docs/contributions/_category_.json:3:  "position": 16
docs/docs/tutorials/_category_.json:3:  "position": 17
@cburroughs
Contributor

cc @tgolsson, who I know has done some k8s-related work at https://github.com/tgolsson/pants-backends/tree/main/pants-plugins/k8s

A broad topic, not intended to derail this whole PR: something we have struggled with internally is that we would like to be able to generate the final reified k8s yaml for inspection or alternative deployments. (I have a PoC plugin that does this for helm.) This is along the lines of 'package', but not quite what helm means by 'package', and I'm unsure what to call it. It would be nice if all the Pants k8s generators could eventually both "spit out the yaml" and "deploy".

@grihabor
Contributor Author

grihabor commented Jan 8, 2025

> Something we have struggled with internally is that we would like to be able to generate the final reified k8s yaml for inspection or alternative deployments.

Haha, we have exactly this problem. I've called the goal render and implemented it for helm_deployment and k8s_bundle.

Contributor

@lilatomic lilatomic left a comment

Looks good!
My only real sticking point is the semantics of the context requirement.

putative_targets = []

if k8s.tailor_source_targets:
    all_k8s_files_globs = req.path_globs("*.yaml")

Contributor

Include "*.yml" too? Or maybe make this a customisable option? Not necessary for a first pass.

Contributor Author

Personally, I want consistent .yaml or .yml across the whole repo, so yes, it should probably be customisable.
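
A minimal sketch of the customisable glob discussed here, assuming a hypothetical option that lists the suffixes tailor should consider (the `extensions` parameter below). It uses only the standard library rather than the Pants `PathGlobs` API, and the function name `unowned_manifest_candidates` is illustrative, not part of the PR.

```python
from fnmatch import fnmatch
from typing import Iterable, Sequence


def unowned_manifest_candidates(
    all_files: Iterable[str],
    owned_files: Iterable[str],
    extensions: Sequence[str] = (".yaml",),
) -> list[str]:
    """Return files matching any configured extension that no target owns yet."""
    # Hypothetical option: one glob pattern per configured suffix.
    patterns = [f"*{ext}" for ext in extensions]
    owned = set(owned_files)
    return sorted(
        path
        for path in all_files
        if path not in owned and any(fnmatch(path, pattern) for pattern in patterns)
    )
```

A repo that standardises on one suffix would keep the default, while a mixed repo could pass `(".yaml", ".yml")`.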

if k8s.tailor_source_targets:
    all_k8s_files_globs = req.path_globs("*.yaml")
    all_k8s_files = await Get(Paths, PathGlobs, all_k8s_files_globs)
    unowned_k8s_files = set(all_k8s_files.files) - set(all_owned_sources)

Contributor

I think that globbing every yaml file will be too broad, but I don't know that there's a standard way of detecting if a manifest is a k8s manifest (kubectl apply --validate=true --dry-run=client seems to need a server). Worst case people can turn off tailoring.
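
One possible way to narrow the glob without a server is a content heuristic. The sketch below is not a standard detection method and not something the PR implements. It relies only on the fact that Kubernetes objects carry top-level `apiVersion` and `kind` keys, and it checks for them with regular expressions instead of a YAML parser. The function name `looks_like_k8s_manifest` is hypothetical.

```python
import re

# Top-level (unindented) keys that Kubernetes objects carry.
_API_VERSION = re.compile(r"^apiVersion:\s*\S+", re.MULTILINE)
_KIND = re.compile(r"^kind:\s*\S+", re.MULTILINE)


def looks_like_k8s_manifest(text: str) -> bool:
    """Heuristic: True if any YAML document in `text` has top-level apiVersion and kind."""
    # A file may hold several documents separated by lines containing only '---'.
    documents = re.split(r"^---\s*$", text, flags=re.MULTILINE)
    return any(
        bool(_API_VERSION.search(doc)) and bool(_KIND.search(doc)) for doc in documents
    )
```

Being a text-level check, it can misclassify unusual YAML (for example flow-style mappings), so turning tailoring off remains the fallback.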

Contributor Author

I'm thinking of the 2 most common use cases here:

  1. A developer's repo with code will probably use some kind of templating to generate k8s_sources (I'm planning to open a couple more PRs for that), so tailor for yaml won't be that useful. In this case I expect devs to disable tailoring.
  2. A DevOps repo with mostly k8s yaml files. In this case I expect tailor to work well.



@dataclass(frozen=True)
class VersionHash:

Contributor

You could use pants.core.util_rules.external_tool.ExternalToolVersion

Contributor Author

It's not quite the same thing, because the platform here is not one of the predefined platforms that Pants uses, but a platform string that needs to be mapped using url_platform_mapping. But I can make it work like this: 6efda67
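
To illustrate the distinction, here is a small sketch of mapping between Pants platform names and the platform strings that appear in download URLs. The mapping values are assumptions chosen for illustration and may differ from what the PR configures. `to_pants_platform` is a hypothetical helper.

```python
# Assumed mapping from Pants platform names to the platform strings a
# download URL might use; the values are illustrative only.
url_platform_mapping = {
    "linux_arm64": "linux/arm64",
    "linux_x86_64": "linux/amd64",
    "macos_arm64": "darwin/arm64",
    "macos_x86_64": "darwin/amd64",
}

# Inverting the mapping lets a URL platform be translated back to the Pants name.
backward_platform_mapping = {v: k for k, v in url_platform_mapping.items()}


def to_pants_platform(url_platform: str) -> str:
    """Translate a URL platform string back to the Pants platform name."""
    return backward_platform_mapping[url_platform]
```

The inversion is only safe while the mapping is one-to-one. Duplicate values would silently drop entries.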

)

backward_platform_mapping = {v: k for k, v in platform_mapping.items()}
for result in results:

Contributor

You could output these in semver order (instead of lexical order) using from packaging.version import Version: collect the versions, then sort them with something like sorted(versions, key=lambda e: Version(e.version)).

Contributor Author

Sure 9d84c9e
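
For readers comparing the two orderings, here is a standard-library sketch. It is a stand-in for `packaging.version.Version`, not the code from the commit, and it only handles plain dotted release numbers (no pre-release or build suffixes). The `ToolRelease` dataclass and `numeric_key` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolRelease:
    # Hypothetical record; only the `version` attribute is taken from the review comment.
    version: str


def numeric_key(entry: ToolRelease) -> tuple[int, ...]:
    """Sort key comparing dotted release numbers numerically, not as text."""
    return tuple(int(part) for part in entry.version.lstrip("v").split("."))


releases = [ToolRelease("1.9.0"), ToolRelease("1.10.0"), ToolRelease("1.2.3")]
lexical = sorted(r.version for r in releases)
numeric = [r.version for r in sorted(releases, key=numeric_key)]
```

Lexical sorting places 1.10.0 before 1.2.3, while the numeric key yields 1.2.3, 1.9.0, 1.10.0.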

build-support/bin/external_tool_versions.py (outdated, resolved)
kubectl_tool = await Get(
    DownloadedExternalTool, ExternalToolRequest, kubectl.get_request(platform)
)
digest = await Get(Digest, MergeDigests([kubectl_tool.digest, request.input_digest]))

Contributor

You could use Process.immutable_input_digests for the tool. Here's how it's done in the Helm backend:

immutable_input_digests = {
    **request.extra_immutable_input_digests,
    **helm_binary.immutable_input_digests,
}

Contributor Author

Sure 72722dd

) -> DeployProcess:
    context = field_set.context.value
    if context is None:
        raise ValueError(f"Missing `{K8sBundleContextField.alias}` field")

Contributor

We could make this more helpful by including the field set's address with field_set.address.spec.

Contributor Author

Good suggestion, thanks ed0be1f
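
A rough illustration of the suggested message, using stand-in dataclasses instead of the real Pants field set. `FakeAddress`, `FakeFieldSet`, and `require_context` are hypothetical names, and the wording in the referenced commit may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FakeAddress:
    # Stand-in for a Pants address; `spec` is the human-readable target address.
    spec: str


@dataclass(frozen=True)
class FakeFieldSet:
    # Stand-in for the bundle field set; `context` is None when the field is unset.
    address: FakeAddress
    context: Optional[str]


def require_context(field_set: FakeFieldSet, alias: str = "context") -> str:
    """Return the context, or raise an error that names the offending target."""
    if field_set.context is None:
        raise ValueError(
            f"Missing `{alias}` field on target {field_set.address.spec}"
        )
    return field_set.context
```

Naming the target in the error lets a user find the offending BUILD entry without searching the repo.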

    platform: Platform,
) -> DeployProcess:
    context = field_set.context.value
    if context is None:

Contributor

I think I've missed something in the logic here. Why do we force the field_set.context to have a value even when kubectl.pass_context is False?

Contributor Author

The idea is that setting context on a target is required, because we had recurring issues where somebody forgot to set the correct context on the target and it got deployed to a different context by accident. However, our CI agents run in the cluster itself, so they can only deploy k8s objects to the specific cluster specified by KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT. That's why we need to disable the --context argument in pants.ci.toml.

I understand this is kind of a custom setup, so people might need some other behavior. However, in general, I think the context requirement is a good thing that can prevent misdeployments. I can probably move this context validation to a small linter and check the context field there, wdyt?
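
The behaviour described in this reply can be sketched as follows: the context is always required on the target, but the `--context` flag is only passed to kubectl when `pass_context` is enabled. This is an illustration under assumptions, not the PR's rule code. `kubectl_apply_argv` is a hypothetical helper, and the exact argument order used by the backend is not shown in this thread.

```python
from typing import Optional, Sequence


def kubectl_apply_argv(
    manifest_paths: Sequence[str],
    context: Optional[str],
    pass_context: bool = True,
) -> list[str]:
    """Build a kubectl command line, requiring a context even if it is not passed."""
    # The context must be declared on the target regardless of pass_context,
    # so that a forgotten context is caught before anything is deployed.
    if context is None:
        raise ValueError("context is required even when it is not passed to kubectl")
    argv = ["kubectl"]
    if pass_context:
        # Skipped for in-cluster CI agents that can only reach their own cluster.
        argv += ["--context", context]
    argv.append("apply")
    for path in manifest_paths:
        argv += ["-f", path]
    return argv
```

With `pass_context=False` the declared context acts purely as a safety check, which is the case a separate linter could also cover.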

@cburroughs
Contributor

> Haha, we have exactly this problem, I've called the goal render and implemented it for helm_deployment and k8s_bundle

I'd be excited to see that!

Contributor Author

@grihabor grihabor left a comment

@lilatomic Thank you for the review! Ready for another round.

@grihabor
Contributor Author

native_engine.IntrinsicError: Could not identify a process to backtrack to for: Missing digest: Was not present in the local store: Digest { hash: Fingerprint<91284dcbaa6e3a5ff00b04f4e5e1051626fc374e386a437aa899983a6eeaaf5a>, size_bytes: 1351 }, with workunit: Workunit { name: "process", level: Debug, span_id: SpanId(12402687231280246273), parent_ids: [SpanId(7266305093486847294)], state: Started { start_time: SystemTime { tv_sec: 1736631333, tv_nsec: 59791000 }, blocked: false }, metadata: Some(WorkunitMetadata { desc: Some("Scheduling: Resolving plugins: hdrhistogram"), message: None, stdout: None, stderr: None, artifacts: [], user_metadata: [] }) }

Looks like a flaky test.

@grihabor
Contributor Author

Opened a PR for python_format_string: grihabor#3

3 participants