Node selection for fully qualified node-names fails (--node=ip-xx-xx-xx-xx.myzone.com) #2374

Open
diranged opened this issue Apr 17, 2024 · 3 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@diranged

What happened:

I’m trying to run kube-state-metrics in DaemonSet mode with --resources=pods and --node=$(NODE_NAME). In local testing on a Kind environment, it worked fine. However, when I run it in a real EKS cluster, I get odd behavior: the fieldSelector is created with the node name, but it’s missing the dots:

eg:

  containers:
  - args:
    - -v=7
    - --resources=pods
    - --node="$(NODE_NAME)"
    - --port=8080
    env:
    - name: NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName

and then we see this:

I0417 20:56:30.604141       1 server.go:339] "Started kube-state-metrics self metrics server" telemetryAddress=":8081"
I0417 20:56:30.604284       1 builder.go:520] "FieldSelector is used" fieldSelector="spec.nodeName=ip-100-80-189-206us-west-2computeinternal"
I0417 20:56:30.604321       1 builder.go:282] "Active resources" activeStoreNames="pods"
I0417 20:56:30.604332       1 reflector.go:289] Starting reflector *v1.Pod (0s) from pkg/mod/k8s.io/client-go@v0.29.3/tools/cache/reflector.go:229
I0417 20:56:30.604342       1 reflector.go:325] Listing and watching *v1.Pod from pkg/mod/k8s.io/client-go@v0.29.3/tools/cache/reflector.go:229
I0417 20:56:30.604381       1 server.go:73] levelinfomsgListening onaddress:8080
I0417 20:56:30.604414       1 server.go:73] levelinfomsgTLS is disabled.http2falseaddress:8080
I0417 20:56:30.604419       1 server.go:73] levelinfomsgListening onaddress:8081
I0417 20:56:30.604429       1 server.go:73] levelinfomsgTLS is disabled.http2falseaddress:8081
I0417 20:56:30.604442       1 round_trippers.go:463] GET https://172.20.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dip-100-80-189-206us-west-2computeinternal&limit=500&resourceVersion=0
I0417 20:56:30.604450       1 round_trippers.go:469] Request Headers:
I0417 20:56:30.604456       1 round_trippers.go:473]     Accept: application/vnd.kubernetes.protobuf,application/json

We can verify that we are passing ip-100-80-189-206.us-west-2.compute.internal into the CLI arg properly:

[root@admin]# ps -ef  | grep kube-state
65534    1343367 1343293  0 20:56 ?        00:00:00 /kube-state-metrics --port=8080 --telemetry-port=8081 -v=7 --resources=pods --node="ip-100-80-189-206.us-west-2.compute.internal" --port=8080

We looked into this because the pod comes up but does not report any metrics:

% curl -v localhost:8080/metrics
*   Trying [::1]:8080...
* Connected to localhost (::1) port 8080
> GET /metrics HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.4.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: text/plain; version=0.0.4; charset=utf-8
< Date: Wed, 17 Apr 2024 21:00:28 GMT
< Content-Length: 0
< 
* Connection #0 to host localhost left intact

After digging, I found #2217, which introduced a regex pattern that strips every character outside [a-zA-Z0-9_,-]. That leaves short hostnames intact but removes the dots from FQDNs:

// GetNodeFieldSelector returns a nodename field selector.
func (n *NodeType) GetNodeFieldSelector() string {
	if nil == n || len(*n) == 0 {
		klog.InfoS("Using node type is nil")
		return EmptyFieldSelector()
	}
	pattern := "[^a-zA-Z0-9_,-]+"
	re := regexp.MustCompile(pattern)
	result := re.ReplaceAllString(n.String(), "")
	klog.InfoS("Using node type", "node", result)
	return fields.OneTermEqualSelector("spec.nodeName", result).String()
}
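To make the effect of that pattern concrete, here is a minimal standalone sketch that applies the same regex and the same ReplaceAllString call to two node names. The function name sanitizeNodeName and the Kind-style node name "kind-control-plane" are assumptions for illustration; only the pattern and the EKS node name come from this report.

```go
package main

import (
	"fmt"
	"regexp"
)

// sanitizePattern is the pattern quoted in the issue: it matches every run of
// characters that is NOT a letter, digit, underscore, comma, or hyphen.
var sanitizePattern = regexp.MustCompile("[^a-zA-Z0-9_,-]+")

// sanitizeNodeName (hypothetical name) mirrors the ReplaceAllString call in
// GetNodeFieldSelector: every matched run is replaced with the empty string.
func sanitizeNodeName(name string) string {
	return sanitizePattern.ReplaceAllString(name, "")
}

func main() {
	names := []string{
		"kind-control-plane",                           // assumed Kind-style short hostname
		"ip-100-80-189-206.us-west-2.compute.internal", // FQDN from this report
	}
	for _, name := range names {
		fmt.Printf("%q -> %q\n", name, sanitizeNodeName(name))
	}
}
```

The short hostname passes through unchanged, while the FQDN loses its dots and becomes ip-100-80-189-206us-west-2computeinternal, which is the value that shows up in the fieldSelector log line.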

What you expected to happen:

I expect that the input we pass in will be the input that is used - whether it is correct or not. I was completely thrown to see the code mutating my input, and effectively making the fieldSelector invalid.
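As a rough sketch of that expectation (not necessarily what the fix in #2373 does), the selector could be built from the input unchanged, with a separate check that only reports whether the name looks like a valid node name. Kubernetes requires node names to be DNS subdomain names, so the regex below approximates that format. The function nodeFieldSelector and the plain string concatenation are assumptions for illustration, standing in for fields.OneTermEqualSelector so the example needs only the standard library.

```go
package main

import (
	"fmt"
	"regexp"
)

// dns1123Subdomain approximates the RFC 1123 subdomain format: lowercase
// alphanumeric labels that may contain hyphens, joined by dots.
var dns1123Subdomain = regexp.MustCompile(
	`^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`)

// nodeFieldSelector (hypothetical) returns a selector built from the name
// exactly as given, plus a flag saying whether the name looks like a valid
// DNS subdomain. The input is never mutated.
func nodeFieldSelector(name string) (selector string, looksValid bool) {
	valid := len(name) <= 253 && dns1123Subdomain.MatchString(name)
	return "spec.nodeName=" + name, valid
}

func main() {
	sel, ok := nodeFieldSelector("ip-100-80-189-206.us-west-2.compute.internal")
	fmt.Println(sel, ok)
}
```

With this shape, a caller can log a warning when looksValid is false, while the selector still carries exactly what was passed on the command line.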

Anything else we need to know?:

Environment:

  • kube-state-metrics version: 2.12.20
  • Kubernetes version (use kubectl version): 1.28.4
  • Cloud provider or hardware configuration: EKS
  • Other info:
@diranged diranged added the kind/bug Categorizes issue or PR as related to a bug. label Apr 17, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Apr 17, 2024
@diranged
Author

@CatherineF-dev put up a fix at #2373 ... 🚤

@logicalhan
Member

/triage accepted
/assign @CatherineF-dev

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 18, 2024
@LaikaN57
Contributor

LaikaN57 commented Jul 25, 2024

@diranged even though we have not tested v2.13.0 with a DS for this, I think we can tell from static analysis of the code that it should be fixed now. We are also unlikely to test this, since the fixes in v2.13.0 remove our need for a DS.

... so, tl;dr, I think we can close this issue and re-open later if we see it again.

P.S. We are still keeping #2372 open until we confirm that a KSM upgrade no longer causes the stale metrics issue that the DS was going to be a workaround for.
