Port forwarding failing to start (using Argo Rollouts) #9656

Open
mkonecny-atlassian opened this issue Jan 13, 2025 · 5 comments

Comments

@mkonecny-atlassian

Expected behavior

When using skaffold dev --port-forward, the ports I have set up in my YAML should be forwarded automatically.

Actual behavior

We are using Argo Rollouts in our Helm chart (not sure if this matters). When running the command above, I see the following:

Saving 1 charts
Deleting outdated charts
NAME: dx-ref-java-quarkus-im
LAST DEPLOYED: Mon Jan 13 16:57:25 2025
NAMESPACE: dx-ref-java-quarkus-im
STATUS: deployed
REVISION: 1
TEST SUITE: None
Waiting for deployments to stabilize...
Deployments stabilized in 12.8455ms
port forwarding pod-dx-ref-java-quarkus-im-webserver-dx-ref-java-quarkus-im-7070 got terminated: output: Error from server (NotFound): pods "dx-ref-java-quarkus-im-webserver" not found

port forwarding service-dx-ref-java-quarkus-im-service-canary-dx-ref-java-quarkus-im-80 got terminated: output: error: unable to forward port because pod is not running. Current status=Pending

port forwarding service-dx-ref-java-quarkus-im-service-dx-ref-java-quarkus-im-80 got terminated: output: error: unable to forward port because pod is not running. Current status=Pending

Listing files to watch...
 - docker.atl-paas.net/atlassian/dx-ref-java-quarkus-im
Press Ctrl+C to exit
Watching for changes...
[microservice] INFO exec -a "java" java -XX:MaxRAMPercentage=80.0 -XX:+UseParallelGC -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -XX:+ExitOnOutOfMemoryError -Dquarkus.http.host=0.0.0.0 -Djava.util.logging.manager=org.jboss.logmanager.LogManager -cp "." -jar /deployments/quarkus-run.jar
[microservice] INFO running in /deployments
[microservice] __  ____  __  _____   ___  __ ____  ______
[microservice]  --/ __ \/ / / / _ | / _ \/ //_/ / / / __/
[microservice]  -/ /_/ / /_/ / __ |/ , _/ ,< / /_/ /\ \
[microservice] --\___\_\____/_/ |_/_/|_/_/|_|\____/___/
[microservice] 2025-01-13 15:57:27,776 INFO  [io.und.websockets] (main) UT026003: Adding annotated server endpoint class com.atlassian.nebulae.micros.service.websocket.WebsocketResource for path /ws/{id}
[microservice] 2025-01-13 15:57:28,501 INFO  [io.quarkus] (main) dx-ref-java-quarkus-im 0.0.238-SNAPSHOT on JVM (powered by Quarkus 3.17.5) started in 1.056s. Listening on: http://0.0.0.0:7070
[microservice] 2025-01-13 15:57:28,501 INFO  [io.quarkus] (main) Profile local activated.
[microservice] 2025-01-13 15:57:28,501 INFO  [io.quarkus] (main) Installed features: [agroal, amazon-dynamodb, amazon-s3, amazon-sns, amazon-sqs, cdi, elasticsearch-rest-client, jdbc-postgresql, narayana-jta, redis-client, resteasy, resteasy-client, resteasy-jackson, smallrye-context-propagation, vertx, websockets, websockets-client]

Information

  • Skaffold version: v2.13.2
  • Operating system: Darwin 24.1.0
  • Installed via: Homebrew
  • Contents of skaffold.yaml:
apiVersion: skaffold/v4beta11
kind: Config
metadata:
  name: dx-ref-java-quarkus-im
build:
  artifacts:
    - image: <redacted>
      custom:
        buildCommand: |
          ./mvnw package -Dmaven.test.skip -T 1C -Pnebulae
          docker tag <redacted> $IMAGE
        dependencies:
          paths:
            - pom.xml
            - src/main
            - src/test
deploy:
  helm:
    releases:
      - name: dx-ref-java-quarkus-im
        chartPath: helm
        valuesFiles:
          - helm/values.yaml
        namespace: dx-ref-java-quarkus-im

## User defined port forwards, currently not working
portForward:
  - resourceType: pod
    resourceName:  dx-ref-java-quarkus-im-webserver
    namespace: dx-ref-java-quarkus-im
    port: 7070

Additional logs:

DEBU[0004] found open port: 7070                         subtask=-1 task=DevLoop
DEBU[0004] Running command: [kubectl --context kind-nebulae port-forward --pod-running-timeout 1s --namespace dx-ref-java-quarkus-im pod/dx-ref-java-quarkus-im-webserver 7070:7070]  subtask=pod/dx-ref-java-quarkus-im-webserver task=PortForward
DEBU[0004] port forwarding pod-dx-ref-java-quarkus-im-webserver-dx-ref-java-quarkus-im-7070 got terminated: exit status 1, output: Error from server (NotFound): pods "dx-ref-java-quarkus-im-webserver" not found  subtask=pod/dx-ref-java-quarkus-im-webserver task=PortForward
port forwarding pod-dx-ref-java-quarkus-im-webserver-dx-ref-java-quarkus-im-7070 got terminated: output: Error from server (NotFound): pods "dx-ref-java-quarkus-im-webserver" not found

I am surprised that the deployment is reported as "stabilized" after 12ms, before the pod is actually running. Note that the Pod is healthy after ~2s, but the port forward then has to be set up manually. The Pod's name is dx-ref-java-quarkus-im-webserver-6fb9db8544-6xx4s (note the hash), and it is not controlled by a Deployment but by a ReplicaSet, which is in turn controlled by an Argo Rollout.

@kallangerard
Contributor

You're trying to target a pod named dx-ref-java-quarkus-im-webserver, but that's not the name of the pod. The name of the pod will be randomly generated by the ReplicaSet, like dx-ref-java-quarkus-im-webserver-6fb9db8544-6xx4s.

You could try using a service for the port forward instead, or check whether kubectl port-forward accepts Argo Rollout resource types.

See https://kubernetes.io/docs/reference/kubectl/generated/kubectl_port-forward/
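As a possible stopgap for the generated pod name, the name can be resolved at run time from a label selector and then passed to kubectl port-forward. This is only a sketch: the helper names are made up for illustration, and it assumes the pods carry a label you can select on (the label in the usage line below is hypothetical, so check the chart's actual labels).

```shell
#!/usr/bin/env bash
# Sketch: resolve a generated pod name from a label selector, then forward.
# Helper names and the example label are hypothetical, not part of Skaffold.

# Print the name of the first pod matching a selector in a namespace.
first_pod_name() {
  local namespace="$1" selector="$2"
  kubectl get pods --namespace "$namespace" --selector "$selector" \
    --output jsonpath='{.items[0].metadata.name}'
}

# Forward a local port to the same port on the first matching pod.
forward_to_first_pod() {
  local namespace="$1" selector="$2" port="$3"
  local pod
  pod="$(first_pod_name "$namespace" "$selector")" || return 1
  if [ -z "$pod" ]; then
    echo "no pod matches selector $selector" >&2
    return 1
  fi
  kubectl port-forward --namespace "$namespace" "pod/$pod" "$port:$port"
}
```

Usage would look like: forward_to_first_pod dx-ref-java-quarkus-im app=dx-ref-java-quarkus-im-webserver 7070, where the app=... label is an assumption.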

@mkonecny-atlassian
Author

mkonecny-atlassian commented Jan 16, 2025

@kallangerard the generated name was something that I thought would be an issue, but using the service doesn't work either:

port forwarding service-dx-ref-java-quarkus-im-service-canary-dx-ref-java-quarkus-im-80 got terminated: output: error: unable to forward port because pod is not running. Current status=Pending

port forwarding service-dx-ref-java-quarkus-im-service-dx-ref-java-quarkus-im-80 got terminated: output: error: unable to forward port because pod is not running. Current status=Pending

This happens even though the service exists. The setup in skaffold.yaml is:

portForward:
  - resourceType: service
    resourceName: dx-ref-java-quarkus-im-service
    namespace: dx-ref-java-quarkus-im
    port: 80
    localPort: 8080

See:

kubectl get services -n dx-ref-java-quarkus-im
NAME                                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
dx-ref-java-quarkus-im-debug            ClusterIP   10.96.148.188   <none>        5005/TCP   2d
dx-ref-java-quarkus-im-service          ClusterIP   10.96.63.4      <none>        80/TCP     2m51s
dx-ref-java-quarkus-im-service-canary   ClusterIP   10.96.219.240   <none>        80/TCP     2m51s

Note that forwarding to a service WORKS after the first redeploy (if I rebuild the main Docker image). At that point an old Pod is still around, which I suppose is why the port forwarding to the service works:

Port forwarding service/dx-ref-java-quarkus-im-service in namespace dx-ref-java-quarkus-im, remote port 80 -> http://127.0.0.1:8080
Port forwarding service/dx-ref-java-quarkus-im-service-canary in namespace dx-ref-java-quarkus-im, remote port 80 -> http://127.0.0.1:4503

@kallangerard
Contributor

Is the pod actually running when you try that?

error: unable to forward port because pod is not running. Current status=Pending

@mkonecny-atlassian
Author

mkonecny-atlassian commented Jan 17, 2025

The pod does reach Running later; see the deployment logs in the issue summary. So it looks to me like the port forwarding to the pod (or a service) is attempted as soon as the "deployment stabilizes" (not sure what the condition is):

Waiting for deployments to stabilize...
Deployments stabilized in 12.8455ms

But the service logs are shown after this message. I am not sure why or how this works, though, because in k9s I see that the pod first sits in the Pending state until the liveness and readiness probes pass (this takes a few seconds), after which it turns to Running. However, the port forwarding failure message is printed well before that event.
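To confirm the timing outside of Skaffold, one option is to poll the pod phase until it reports Running and only then start the forward by hand. The function below is a rough sketch, not anything Skaffold ships: its name, the one-second poll interval and the default timeout are all assumptions.

```shell
#!/usr/bin/env bash
# Sketch: wait until at least one pod matching a selector is in phase Running.
# Function name, poll interval, and default timeout are illustrative choices.

wait_for_running_pod() {
  local namespace="$1" selector="$2" timeout_seconds="${3:-60}"
  local elapsed=0 phases
  while [ "$elapsed" -lt "$timeout_seconds" ]; do
    # Prints the phases of all matching pods, separated by spaces.
    phases="$(kubectl get pods --namespace "$namespace" --selector "$selector" \
      --output jsonpath='{.items[*].status.phase}' 2>/dev/null)"
    case " $phases " in
      *" Running "*) return 0 ;;
    esac
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "timed out waiting for a Running pod matching $selector" >&2
  return 1
}
```

If this returns successfully and a manual kubectl port-forward then works, that would support the idea that the forward is simply being attempted before the pod leaves Pending.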

Could it be that skaffold is looking for a Deployment object (which does not exist in my case, because the Argo Rollout replaces it) and, if it cannot find one, treats this condition as a no-op and passes? A fix might be to look for a ReplicaSet instead.
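A quick way to check the first half of that hypothesis is to count which workload kinds actually exist in the namespace. The sketch below uses hypothetical helper names and assumes the Argo Rollouts CRD is installed under its usual resource name, rollouts.argoproj.io. It does not show what Skaffold's status check really inspects.

```shell
#!/usr/bin/env bash
# Sketch: report how many Deployments and Argo Rollouts exist in a namespace.
# Helper names are hypothetical; rollouts.argoproj.io assumes the standard CRD.

# Count resources of one kind; prints 0 when none are found.
count_resources() {
  local kind="$1" namespace="$2"
  kubectl get "$kind" --namespace "$namespace" --output name 2>/dev/null \
    | wc -l | tr -d ' '
}

# Print a one-line summary of workload kinds in the namespace.
describe_workloads() {
  local namespace="$1" deployments rollouts
  deployments="$(count_resources deployments.apps "$namespace")"
  rollouts="$(count_resources rollouts.argoproj.io "$namespace")"
  echo "deployments=$deployments rollouts=$rollouts"
}
```

A result with zero Deployments and at least one Rollout would be consistent with a status check that finds nothing to wait for, though it would not prove that this is what happens.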

@mkonecny-atlassian
Author

I am wondering why the "Deployments stabilized in 12.8455ms" message occurs when the pods do not exist yet, as I can see when polling via k9s. Can you please point me to the piece of code responsible for this detection? I think this is the issue: it moves on to setting up port forwarding too soon.

2 participants