Kmesh with waypoint doesn't work fine for istio 1.24 #984

Closed
YaoZengzeng opened this issue Oct 26, 2024 · 3 comments
Labels
kind/enhancement New feature or request

Comments

@YaoZengzeng
Member

What would you like to be added:

The mechanism for using waypoints has changed in Istio 1.24, and Kmesh needs to adapt.

Why is this needed:

I deployed Istio 1.24.0-alpha.0 and ran some E2E tests; all waypoint-related tests failed.

KMESH_WAYPOINT_IMAGE="ghcr.io/yaozengzeng/waypoint:latest" ./test/e2e/run_test.sh --only-run-tests -run "TestTrafficSplit"
Switched to context "kind-kmesh-testing".
Running tests in cluster 'kmesh-testing'
....
=== RUN   TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_service-with-waypoint-at-service-granularity/http/v1
2024-10-26T06:55:57.745043Z     info    tf      === BEGIN: Test: '_root_kmesh_test_e2e[TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_service-with-waypoint-at-service-granularity/http/v1]' ===
    baseline_test.go:232: 2 errors occurred:
                * failed calling service-with-waypoint-at-service-granularity (cluster=cluster-0)->'http://service-with-waypoint-at-service-granularity.echo-1-87305.svc.cluster.local:80': call failed from service-with-waypoint-at-service-granularity (cluster=cluster-0) to http://service-with-waypoint-at-service-granularity.echo-1-87305.svc.cluster.local:80 (using http): expected no error, but encountered rpc error: code = Unknown desc = 5/5 requests had errors; first error: Get "http://service-with-waypoint-at-service-granularity.echo-1-87305.svc.cluster.local:80": dial tcp 10.96.88.35:80: connect: connection refused
                * failed calling service-with-waypoint-at-service-granularity (cluster=cluster-0)->'http://service-with-waypoint-at-service-granularity.echo-1-87305.svc.cluster.local:80': call failed from service-with-waypoint-at-service-granularity (cluster=cluster-0) to http://service-with-waypoint-at-service-granularity.echo-1-87305.svc.cluster.local:80 (using http): expected no error, but encountered rpc error: code = Unknown desc = 5/5 requests had errors; first error: Get "http://service-with-waypoint-at-service-granularity.echo-1-87305.svc.cluster.local:80": dial tcp 10.96.88.35:80: connect: connection refused
        
2024-10-26T06:56:57.745980Z     info    tf      === DONE (failed):  Test: '_root_kmesh_test_e2e[TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_service-with-waypoint-at-service-granularity/http/v1] (1m0.00092066s)' ===
=== RUN   TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_service-with-waypoint-at-service-granularity/http/v2
2024-10-26T06:56:57.746454Z     info    tf      === BEGIN: Test: '_root_kmesh_test_e2e[TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_service-with-waypoint-at-service-granularity/http/v2]' ===
    baseline_test.go:254: 2 errors occurred:
                * failed calling service-with-waypoint-at-service-granularity (cluster=cluster-0)->'http://service-with-waypoint-at-service-granularity.echo-1-87305.svc.cluster.local:80': call failed from service-with-waypoint-at-service-granularity (cluster=cluster-0) to http://service-with-waypoint-at-service-granularity.echo-1-87305.svc.cluster.local:80 (using http): expected no error, but encountered rpc error: code = Unknown desc = 5/5 requests had errors; first error: Get "http://service-with-waypoint-at-service-granularity.echo-1-87305.svc.cluster.local:80": dial tcp 10.96.88.35:80: connect: connection refused
                * failed calling service-with-waypoint-at-service-granularity (cluster=cluster-0)->'http://service-with-waypoint-at-service-granularity.echo-1-87305.svc.cluster.local:80': call failed from service-with-waypoint-at-service-granularity (cluster=cluster-0) to http://service-with-waypoint-at-service-granularity.echo-1-87305.svc.cluster.local:80 (using http): expected no error, but encountered rpc error: code = Unknown desc = 5/5 requests had errors; first error: Get "http://service-with-waypoint-at-service-granularity.echo-1-87305.svc.cluster.local:80": dial tcp 10.96.88.35:80: connect: connection refused
        
2024-10-26T06:57:57.748534Z     info    tf      === DONE (failed):  Test: '_root_kmesh_test_e2e[TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_service-with-waypoint-at-service-granularity/http/v2] (1m0.002073873s)' ===
2024-10-26T06:57:57.748656Z     info    tf      === DONE (failed):  Test: '_root_kmesh_test_e2e[TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_service-with-waypoint-at-service-granularity/http] (2m0.032532301s)' ===
=== RUN   TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_service-with-waypoint-at-service-granularity/tcp
2024-10-26T06:57:57.763161Z     info    tf      === BEGIN: Test: '_root_kmesh_test_e2e[TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_service-with-waypoint-at-service-granularity/tcp]' ===
...
--- FAIL: TestTrafficSplit (240.10s)
    --- FAIL: TestTrafficSplit/from_service-with-waypoint-at-service-granularity (120.05s)
        --- FAIL: TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_service-with-waypoint-at-service-granularity (120.05s)
            --- FAIL: TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_service-with-waypoint-at-service-granularity/http (120.05s)
                --- FAIL: TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_service-with-waypoint-at-service-granularity/http/v1 (60.00s)
                --- FAIL: TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_service-with-waypoint-at-service-granularity/http/v2 (60.00s)
            --- PASS: TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_service-with-waypoint-at-service-granularity/tcp (0.00s)
        --- PASS: TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_enrolled-to-kmesh (0.00s)
            --- PASS: TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_enrolled-to-kmesh/http (0.00s)
            --- PASS: TestTrafficSplit/from_service-with-waypoint-at-service-granularity/to_enrolled-to-kmesh/tcp (0.00s)
    --- FAIL: TestTrafficSplit/from_enrolled-to-kmesh (120.05s)
        --- FAIL: TestTrafficSplit/from_enrolled-to-kmesh/to_service-with-waypoint-at-service-granularity (120.04s)
            --- FAIL: TestTrafficSplit/from_enrolled-to-kmesh/to_service-with-waypoint-at-service-granularity/http (120.04s)
                --- FAIL: TestTrafficSplit/from_enrolled-to-kmesh/to_service-with-waypoint-at-service-granularity/http/v1 (60.00s)
                --- FAIL: TestTrafficSplit/from_enrolled-to-kmesh/to_service-with-waypoint-at-service-granularity/http/v2 (60.00s)
            --- PASS: TestTrafficSplit/from_enrolled-to-kmesh/to_service-with-waypoint-at-service-granularity/tcp (0.00s)
        --- PASS: TestTrafficSplit/from_enrolled-to-kmesh/to_enrolled-to-kmesh (0.00s)
            --- PASS: TestTrafficSplit/from_enrolled-to-kmesh/to_enrolled-to-kmesh/http (0.00s)
            --- PASS: TestTrafficSplit/from_enrolled-to-kmesh/to_enrolled-to-kmesh/tcp (0.00s)
FAIL
@YaoZengzeng
Member Author

#995 fixes waypoint resolution for services, but a workload can also contain the waypoint field. When there is a pod-level waypoint, the Kmesh daemon crashes, as expected:

time="2024-11-04T07:40:02Z" level=info msg="start write CNI config" subsys="cni installer"
time="2024-11-04T07:40:02Z" level=info msg="kmesh cni use chained\n" subsys="cni installer"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x40 pc=0x64e696a7be91]

goroutine 124 [running]:
kmesh.net/kmesh/pkg/controller/workload.(*Processor).updateWorkload(0xc0008f9860, 0xc0009d5380)
        /kmesh/pkg/controller/workload/workload_processor.go:390 +0x171
kmesh.net/kmesh/pkg/controller/workload.(*Processor).handleWorkload(0xc0008f9860, 0xc0009d5380)
        /kmesh/pkg/controller/workload/workload_processor.go:456 +0x3a5
kmesh.net/kmesh/pkg/controller/workload.(*Processor).handleAddressTypeResponse(0xc0008f9860, 0xc000ae4000)
        /kmesh/pkg/controller/workload/workload_processor.go:770 +0x58b
kmesh.net/kmesh/pkg/controller/workload.(*Processor).processWorkloadResponse(0xc0008f9860, 0xc000ae4000, 0xc0000cf9c0)
        /kmesh/pkg/controller/workload/workload_processor.go:112 +0x137
kmesh.net/kmesh/pkg/controller/workload.(*Controller).HandleWorkloadStream(0xc0009690c0)
        /kmesh/pkg/controller/workload/workload_controller.go:126 +0xac
kmesh.net/kmesh/pkg/controller.(*XdsClient).handleUpstream(0xc000451560, {0x64e69823f068, 0xc00023b9e0})
        /kmesh/pkg/controller/client.go:130 +0x105
created by kmesh.net/kmesh/pkg/controller.(*XdsClient).Run in goroutine 1
        /kmesh/pkg/controller/client.go:145 +0xd2
time="2024-11-04T07:40:03Z" level=info msg="start remove CNI config" subsys="cni installer"
time="2024-11-04T07:40:03Z" level=info msg="remove CNI config done" subsys="cni installer"
kmesh exit

Strangely, though, all E2E tests passed after #995. TestAddRemovePodWaypoint should not have passed, because it tests the waypoint at pod granularity.

@hzxuzhonghu
Member

@YaoZengzeng I think all the issues have been fixed

@YaoZengzeng
Member Author

fixed already
