Add rule to check GKE service accounts for container.defaultNodeServiceAccount permissions #105

Draft. Wants to merge 2 commits into base: main
4 changes: 2 additions & 2 deletions bin/gcpdiag
@@ -85,8 +85,8 @@ Usage:
Commands:
help Print this help text.
lint Run diagnostics on GCP projects.
- runbook Run dianostrics tree to deep dive into GCP issue.
- search Find gcpdiag rules related to search terms
+ runbook Run diagnostics tree to deep dive into GCP issue.
+ search Find gcpdiag rules related to search terms.
version Print gcpdiag version.

See: gcpdiag COMMAND --help for command-specific usage.""")
4 changes: 2 additions & 2 deletions gcpdiag/lint/gce/err_2024_003_vm_secure_boot_failures.py
@@ -77,11 +77,11 @@ def run_rule(context: models.Context, report: lint.LintReportRuleInterface):
i.id)):
report.add_failed(
i,
- 'Instance has been restricted to boot due to Sheilded VM policy violations'
+ 'Instance has been restricted to boot due to Shielded VM policy violations'
)
else:
report.add_failed(
- i, 'Instance is Sheilded VM, has Secure boot failures events')
+ i, 'Instance is Shielded VM, has Secure boot failures events')
else:
report.add_ok(i)
else:
54 changes: 54 additions & 0 deletions gcpdiag/lint/gke/err_2024_003_default_node_serviceaccount_perm.py
@@ -0,0 +1,54 @@
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""GKE nodes service account permissions fit container.defaultNodeServiceAccount role

The service account used by GKE nodes should possess the permissions of the
container.defaultNodeServiceAccount role, otherwise ingestion of logs or metrics
won't work.
"""

from gcpdiag import lint, models
from gcpdiag.queries import gke, iam

ROLE = 'roles/container.defaultNodeServiceAccount'


def prefetch_rule(context: models.Context):
  # Make sure that we have the IAM policy in cache.
  project_ids = {c.project_id for c in gke.get_clusters(context).values()}
  for pid in project_ids:
    iam.get_project_policy(pid)


def run_rule(context: models.Context, report: lint.LintReportRuleInterface):
  # Find all clusters with logging and metrics enabled.
  clusters = gke.get_clusters(context)
  iam_policy = iam.get_project_policy(context.project_id)
  if not clusters:
    report.add_skipped(None, 'no clusters found')
  for _, c in sorted(clusters.items()):
    if not c.has_logging_enabled():
      report.add_skipped(c, 'logging disabled')
    elif not c.has_monitoring_enabled():
      report.add_skipped(c, 'monitoring disabled')
    else:
      # Verify service-account permissions for every nodepool.
      for np in c.nodepools:
        sa = np.service_account
        if not iam.is_service_account_enabled(sa, context.project_id):
          report.add_failed(np, f'service account disabled or deleted: {sa}')
        elif not iam_policy.has_role_permissions(f'serviceAccount:{sa}', ROLE):
          report.add_failed(np, f'service account: {sa}\nmissing role: {ROLE}')
        else:
          report.add_ok(np)
30 changes: 30 additions & 0 deletions website/content/en/rules/gke/ERR/2024_003.md
@@ -0,0 +1,30 @@
---
title: "gke/ERR/2024_003"
linkTitle: "ERR/2024_003"
weight: 1
type: docs
description: >
GKE nodes service account permissions fit container.defaultNodeServiceAccount role
---

**Product**: [Google Kubernetes Engine](https://cloud.google.com/kubernetes-engine)\
**Rule class**: Something that is very likely to be wrong

### Description

The service account used by GKE nodes should possess the permissions of the container.defaultNodeServiceAccount role,
otherwise ingestion of logs or metrics won't work.
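
The requirement is about effective permissions, not a literal binding to one role, so a service account can also pass by holding other roles that together cover the same permissions. A minimal sketch of that idea follows. It assumes a hypothetical, partial role-to-permission table (`ROLE_PERMISSIONS` is illustrative and does not list the full contents of the real roles), and `missing_permissions` is a made-up helper, not part of gcpdiag.

```python
from typing import Dict, List, Set

# Illustrative data only: a hypothetical, partial mapping of roles to
# permissions. The real roles contain more permissions than shown here.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    'roles/container.defaultNodeServiceAccount': {
        'logging.logEntries.create',
        'monitoring.timeSeries.create',
    },
    'roles/logging.logWriter': {'logging.logEntries.create'},
    'roles/monitoring.metricWriter': {'monitoring.timeSeries.create'},
}


def missing_permissions(granted_roles: List[str],
                        required_role: str) -> Set[str]:
  """Return the permissions of required_role not covered by granted_roles."""
  granted: Set[str] = set()
  for role in granted_roles:
    # Unknown roles contribute no permissions in this sketch.
    granted |= ROLE_PERMISSIONS.get(role, set())
  return ROLE_PERMISSIONS[required_role] - granted
```

An empty result means the granted roles cover everything the required role contains. Any returned permission names are the gap to close.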

### Remediation

Make sure your GKE node pool service accounts have the following role binding in the IAM policy:

- Principal: GKE node pool service account
- Role: `roles/container.defaultNodeServiceAccount`

or use a custom role that contains
[those permissions](https://cloud.google.com/iam/docs/understanding-roles#container.defaultNodeServiceAccount).
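
As a rough illustration of the binding described above, the sketch below edits an IAM policy held as a plain dictionary in the `bindings` / `role` / `members` shape that IAM policies use. The helper name `with_node_sa_binding` and the example service account email are assumptions made for illustration. Fetching the policy from the project and writing it back are deliberately left out.

```python
import copy
from typing import Any, Dict

ROLE = 'roles/container.defaultNodeServiceAccount'


def with_node_sa_binding(policy: Dict[str, Any],
                         sa_email: str) -> Dict[str, Any]:
  """Return a copy of policy in which sa_email is bound to ROLE."""
  member = f'serviceAccount:{sa_email}'
  updated = copy.deepcopy(policy)  # leave the caller's policy untouched
  bindings = updated.setdefault('bindings', [])
  for binding in bindings:
    # Only extend an unconditional binding for the role.
    if binding.get('role') == ROLE and 'condition' not in binding:
      if member not in binding.setdefault('members', []):
        binding['members'].append(member)
      return updated
  # No suitable binding exists yet, so add a new one.
  bindings.append({'role': ROLE, 'members': [member]})
  return updated


# Hypothetical usage with a made-up service account email:
example = with_node_sa_binding(
    {'bindings': []}, 'gke-nodes@my-project.iam.gserviceaccount.com')
```

The function is idempotent: applying it twice with the same email yields the same policy, which keeps a read-modify-write update safe to retry.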

### Further information

- [Hardening your cluster - Use least privilege IAM service accounts](https://cloud.google.com/kubernetes-engine/docs/how-to/hardening-your-cluster#use_least_privilege_sa)