
Falco 0.33.1 "node name does not correspond to a node in the cluster" during startup due to jq filter failure on NotReady node with status.addresses missing #2358

Closed
wuub opened this issue Jan 16, 2023 · 1 comment · Fixed by falcosecurity/libs#833

wuub commented Jan 16, 2023

Describe the bug

When starting Falco on EKS with:

        - '--k8s-node'
        - $(FALCO_K8S_NODE_NAME)

we've experienced whole-DaemonSet failures (all new pods failing to start or restart), reporting errors like: Error fetching K8s data: Failing to enrich events with Kubernetes metadata: node name does not correspond to a node in the cluster: ip-xxx-yy-zzz-www.us-west-1.compute.internal. After some digging, and after enabling libs_logger.enabled: true, we were able to narrow it down to https://github.com/falcosecurity/libs/blob/01c07df720708f19b6ba3e2f6857bddb8c2c4779/userspace/libsinsp/socket_handler.h#L792, which causes this error line:

[libs]: Socket handler (k8s_node_handler_state), [https://172.20.0.1] filter processing error "json_query filtering result invalid."; JSON: <{"kind":"NodeList","apiVersion":"v1","metadata":{[HUMONGOUS-API-RESPONSE]}}>, jq filter: <{ type: "ADDED", apiVersion: .apiVersion, kind: "Node",  items: [  .items[] |   {   name: .metadata.name,   uid: .metadata.uid,   timestamp: .metadata.creationTimestamp,   labels: .metadata.labels,   addresses: [.status.addresses[].address] | unique   } ]}>

Digging further, we found that the failure is caused by a NotReady node in the API response that presents no .addresses in its .status field: the jq path .status.addresses then evaluates to null, and iterating null with [] makes the whole filter fail.

Example (abridged):

{
    "metadata": {
        "name": "ip-10-5-13-255.us-west-1.compute.internal",
    },
    // ....
    "status": {
        "conditions": [
          // ...
        ],
        "daemonEndpoints": {
            "kubeletEndpoint": {
                "Port": 0
            }
        },
        "nodeInfo": {
            "machineID": "",
            "systemUUID": "",
            "bootID": "",
            "kernelVersion": "",
            "osImage": "",
            "containerRuntimeVersion": "",
            "kubeletVersion": "",
            "kubeProxyVersion": "",
            "operatingSystem": "",
            "architecture": ""
        }
    }
}

How to reproduce it

Remove the status.addresses field from a single Kubernetes node returned by https://172.20.0.1/api/v1/nodes?pretty=false.
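
The same failure can also be reproduced locally with the command-line jq (a minimal sketch: the NodeList payload below is a hypothetical stripped-down version of the API response, and the filter is the one from the log above, abbreviated):

    # One node whose .status carries no .addresses field, as on the NotReady node above
    echo '{"apiVersion":"v1","kind":"NodeList","items":[{"metadata":{"name":"n1"},"status":{}}]}' |
      jq '{ type: "ADDED", apiVersion: .apiVersion, kind: "Node",
            items: [ .items[] | { name: .metadata.name,
                                  addresses: [.status.addresses[].address] | unique } ] }'
    # jq: error (at <stdin>:1): Cannot iterate over null (null)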

Expected behaviour

Such a node should not prevent all the other Falco pods from starting.
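
A tolerant variant of the filter (a sketch only; the actual fix merged in falcosecurity/libs#833 may differ) could use jq's error-suppressing iterator []?, which yields an empty stream instead of an error when the path is missing:

    echo '{"metadata":{"name":"n1"},"status":{}}' |
      jq -c '{ name: .metadata.name,
               # []? yields nothing instead of failing when .status.addresses is null
               addresses: [.status.addresses[]?.address] | unique }'
    # output: {"name":"n1","addresses":[]}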

Environment

  • Falco version: 0.33.1
  • OS: Bottlerocket
  • Kernel: n/a
  • Installation method: Kubernetes+Helm
@jasondellaluce
Contributor

/milestone 0.34.0

@poiana added this to the 0.34.0 milestone on Jan 16, 2023