
Specifying a root service port other than the containerPort/targetPort of the leaf service does not work #4894

Closed
jkotalik opened this issue Jul 13, 2022 · 2 comments · Fixed by #4902

@jkotalik

According to the SMI documentation, we can have a configuration where the root service's port is 8080 and the leaf service has a targetPort other than 8080: https://github.com/servicemeshinterface/smi-spec/blob/main/apis/traffic-split/v1alpha4/traffic-split.md#ports

kind: Service
apiVersion: v1
metadata:
  name: birds
spec:
  selector:
    app: birds
  ports:
  - name: grpc
    port: 8080
  - name: rest
    port: 9090
---
kind: Service
apiVersion: v1
metadata:
  name: blue-birds
spec:
  selector:
    app: birds
    color: blue
  ports:
  - name: grpc
    port: 8080
  - name: rest
    port: 9090
---
kind: Service
apiVersion: v1
metadata:
  name: green-birds
spec:
  selector:
    app: birds
    color: green
  ports:
  - name: grpc
    port: 8080
    targetPort: 8081 # 8081 matches the containerPort of the green-birds pods
  - name: rest
    port: 9090
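
For context, these three services would be tied together by a TrafficSplit rooted at the birds service, along these lines (a reconstruction following the SMI v1alpha4 schema; the name and weights are illustrative, not copied from the spec page):

kind: TrafficSplit
apiVersion: split.smi-spec.io/v1alpha4
metadata:
  name: birds-split   # illustrative name
spec:
  service: birds      # root service clients address
  backends:
  - service: blue-birds
    weight: 50
  - service: green-birds
    weight: 50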

When I tried this with the httpbin canary rollout sample, however, I wasn't able to get this kind of configuration to work: https://release-v1-0.docs.openservicemesh.io/docs/demos/canary_rollout/

Repro steps

Follow the tutorial at https://release-v1-0.docs.openservicemesh.io/docs/demos/canary_rollout/, except update the YAML definitions of the services as shown below and apply those instead. Note that the root httpbin service below omits targetPort, so it defaults to the port value (8080), while the pods listen on containerPort 14001.

httpbin.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: httpbin
  namespace: httpbin
---
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  namespace: httpbin
  labels:
    app: httpbin
spec:
  ports:
  - name: http
    port: 8080
  selector:
    app: httpbin

httpbin-v1.yaml

apiVersion: v1
kind: Service
metadata:
  name: httpbin-v1
  namespace: httpbin
  labels:
    app: httpbin
    version: v1
spec:
  ports:
  - name: http
    port: 8080
    targetPort: 14001
  selector:
    app: httpbin
    version: v1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin-v1
  namespace: httpbin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
      version: v1
  template:
    metadata:
      labels:
        app: httpbin
        version: v1
    spec:
      serviceAccountName: httpbin
      nodeSelector:
        kubernetes.io/arch: amd64
        kubernetes.io/os: linux
      containers:
      - image: simonkowallik/httpbin
        imagePullPolicy: IfNotPresent
        name: httpbin
        command: ["gunicorn", "-b", "0.0.0.0:14001", "httpbin:app", "-k", "gevent"]
        ports:
        - containerPort: 14001
        env:
        - name: XHTTPBIN_POD
          valueFrom:
            fieldRef:
              fieldPath: metadata.name

Deploy these YAMLs in steps 3 and 4 instead of the ones from the raw GitHub links, as sketched below.
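
A minimal sketch of the apply commands (the file names are the ones used above):

kubectl apply -f httpbin.yaml      # root service and service account
kubectl apply -f httpbin-v1.yaml   # v1 service and deployment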

In step 6, instead of getting a 200, we consistently see 503s.
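
For example, a check along these lines from a meshed client pod shows the failure (the client pod, namespace, and container names are placeholders, not the tutorial's exact resources):

kubectl exec -n <client-namespace> <client-pod> -c <container> -- \
  curl -sI http://httpbin.httpbin:8080/status/200
# expected: HTTP/1.1 200 OK
# observed: HTTP/1.1 503 Service Unavailable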

Other details

One oddity in this behavior: if we set the targetPort on the root service to 14001 and then change it back to 8080, the configuration keeps working.
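
To spell that out, the intermediate root service definition looks like this (only the ports section differs from httpbin.yaml above):

apiVersion: v1
kind: Service
metadata:
  name: httpbin
  namespace: httpbin
  labels:
    app: httpbin
spec:
  ports:
  - name: http
    port: 8080
    targetPort: 14001   # temporarily matches the pods' containerPort
  selector:
    app: httpbin

Applying this and then reverting targetPort to 8080 leaves the split working.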

@shashankram self-assigned this Jul 13, 2022
@shashankram (Member)

@jkotalik thanks for the detailed repro steps.

> One oddity in this behavior: if we set the targetPort on the root service to 14001 and then change it back to 8080, the configuration keeps working.

I suspect this happens due to a race condition somewhere. I'll try to repro the issue.

@shashankram (Member)

On investigating this issue further, I see an assumption in the code: while building the route configuration objects within the control plane, we take the TargetPort value from the leaf service and set it as the TargetPort on the root service in internal objects.

I'll investigate what it takes to relax the requirement that the TargetPort on the root service be the same as the TargetPort on the leaf service.
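
Put concretely, with the repro manifests above, that assumption amounts to treating the root service as if its port had been declared like this (a sketch of the internal view, not a manifest anyone applies):

# internal view of the root service's port while building route configuration
ports:
- name: http
  port: 8080
  targetPort: 14001   # copied from the leaf service httpbin-v1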

@jaellio added the kind/bug label Jul 14, 2022
shashankram added a commit to shashankram/osm that referenced this issue Jul 14, 2022
shashankram added a commit that referenced this issue Jul 15, 2022
This change does the following:

1. Fixes the incorrect legacy behavior where traffic
   directed to a root service specified in a TrafficSplit
   resource can direct traffic to pods that do not match
   the root service's selector. Not only was this behavior
   confusing, it also significantly complicated code paths
   that required special handling of this scenario that is
   unintuitive. Going forward, the root service selector
   must match pods for traffic splitting to those pods to
   function. Existing e2e tests relying on this unsupported
   behavior have been updated to correctly configure selectors
   and labels on services and pods backing them. A redundant
   test explicitly testing the only supported scenario after
   this change has been removed. The automated demo has
   also been updated to correctly configure the selector and
   labels.

2. Fixes #4894, where the TargetPort on the root service was
   expected to match the ContainerPort on the pod backing
   the service. Per SMI's TrafficSplit API, the TargetPort
   on the root does not need to match the ContainerPort
   on the pod, thus allowing newer application versions
   to change the ports they listen on without needing
   to update the root service definition.

Signed-off-by: Shashank Ram <shashr2204@gmail.com>
shashankram added a commit to shashankram/osm that referenced this issue Jul 15, 2022
traffic-split: update root service selector & targetPort usage (openservicemesh#4902)

trstringer pushed a commit that referenced this issue Jul 18, 2022
traffic-split: update root service selector & targetPort usage (#4902) (#4905)

* traffic-split: update root service selector & targetPort usage (#4902)


* release-notes: add note on traffic split behavior

Adds a note regarding the change to drop support
to split traffic from a root service to pods that
do not match the selector on the root service.

Signed-off-by: Shashank Ram <shashr2204@gmail.com>