Unstable CI #12544

Open
NikitaSkrynnik opened this issue Nov 18, 2024 · 1 comment

NikitaSkrynnik commented Nov 18, 2024

Description

This issue aggregates all the issues that relate to CI problems and provides a decomposition for them.

sdk

  1. Failed unit test TestReselect_LocalForwarderRestart (0.57s) sdk#1627
  2. TestNSMGR_HealRegistry is unstable sdk#1615
  3. Test_Interdomain_PassThroughUsecase is unstable sdk#1593
  4. TestNSMGR_CloseHeal/With_NSE_expiration is unstable sdk#1592
  5. TestNSMGR_CloseHeal/Without_NSE_expiration is not stable sdk#1591
  6. TestListenAndServe_NotExistsFolder is unstable sdk#1575 (probably stable, 10.000.000 runs)
  7. Test_vl3MtuServer_SpoiledConnection is unstable sdk#1574
  8. TestNSMGRHealEndpoint_DatapathHealthy_CtrlPlaneBroken is unstable sdk#1573
  9. TestInterdomainFloatingNetworkServiceEndpointRegistry is unstable sdk#1444
  10. TestRefreshClient_Sandbox is unstable sdk#839 (probably stable, 10.000.000 runs)
  11. Test_DiscoverForwarder_ChangeForwarderOnClose
  12. Test_DiscoverForwarder_ChangeForwarderOnDeath_LostHeal
  13. Test_DiscoverForwarder_CloseAfterError
  14. Test_DNSUsecase
  15. TestNSMGR_HealEndpoint/Local_New
  16. TestNSMGR_HealEndpoint/Remote_New
  17. TestNSMGRHealEndpoint_DataPlaneBroken_CtrlPlaneBroken
  18. Test Test_NSC_ConnectsTo_vl3NSE is unstable sdk#1695

sdk-k8s

  1. Test_K8sNSERegistry_FindWatch is unstable sdk-k8s#514
  2. TestNSMGR_FloatingInterdomainUseCase is unstable sdk-k8s#402

integration tests

  1. Update from update/networkservicemesh/integration-tests integration-k8s-kind#1035 (First we need to fix the issue with pod deletion)
  2. TestRunFeatureOvsSuite is unstable integration-k8s-kind#1008
  3. Test TestRunFeatureSuite/TestVl3_ipv6 is unstable integration-k8s-kind#1007
  4. TestK8sMonolithSuite/External_nse is unstable integration-k8s-kind#905
  5. TestRunHealSuite/TestRemote_nsm_system_restart_memif_ip is unstable integration-k8s-kind#904
  6. TestRunFeatureSuite/TestVl3_basic is unstable integration-k8s-kind#872
  7. TestRunRvlanSuite/Rvlanvpp/TestKernel2RVlanMultiNS is unstable integration-k8s-kind#839
  8. TestRunHealSuite/TestVl3_nse_death is unstable integration-k8s-kind#776
  9. [Calico] TestKernel2Wireguard2Kernel_dual_stack is not stable integration-k8s-kind#671
  10. [Calico] Lack of cluster resources - TestVl3 is unstable integration-k8s-kind#633
  11. [Calico] TestMutuallyAwareNSE is not working integration-k8s-kind#627
  12. [Calico] Lack of cluster resources - TestNSE_Composition is unstable integration-k8s-kind#625

public clusters

  1. Packet: Packet clusters doesn't work integration-k8s-packet#405
  2. AWS: Update from update/networkservicemesh/integration-k8s-kind integration-k8s-aws#423

decomposition

sdk and sdk-k8s

  1. Check if test is still unstable - 2h
  2. Fix test if it's unstable - 2h (positive), 10h (negative)
    Total negative: (2h + 10h) * 18 = 216h or 31d
    Total positive: (2h + 2h) * 18 = 72h or 10d

integration tests

  1. Fix the issue with proper pod deletion - 2d (positive), 5d (negative)
  2. Each integration test (11 tests) - 1.5d (positive), 3d (negative)
    Total negative: 5d + 11 * 3d = 38d
    Total positive: 2d + 11 * 1.5d = 18.5d

public clusters

  1. Packet - 2d (positive), 5d (negative)
  2. AWS - 2d (positive), 5d (negative)

Total positive: 10d + 18.5d + 4d = 32.5d
Total negative: 31d + 38d + 10d = 79d
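
The totals above can be re-derived with a short script. This is only a sketch and not part of the original estimate: the function and variable names are hypothetical, and the 7-hour working day is an assumption inferred from 216h being quoted as 31d. Note that (2h + 2h) * 18 evaluates to 72h, which still rounds to 10d under that assumption.

```python
# Hypothetical helper for re-deriving the estimate totals listed above.
# Assumption: one working day is 7 hours (inferred from "216h or 31d").
HOURS_PER_DAY = 7


def hours_to_days(hours):
    """Convert hours to working days, rounded to the nearest whole day."""
    return round(hours / HOURS_PER_DAY)


# sdk and sdk-k8s: 18 tests, 2h to check each one,
# plus 2h (positive) or 10h (negative) to fix it.
SDK_TESTS = 18
sdk_positive_hours = (2 + 2) * SDK_TESTS
sdk_negative_hours = (2 + 10) * SDK_TESTS

# integration tests: the pod deletion fix plus 11 tests.
INTEGRATION_TESTS = 11
integration_positive_days = 2 + INTEGRATION_TESTS * 1.5
integration_negative_days = 5 + INTEGRATION_TESTS * 3

# public clusters: Packet and AWS, 2d (positive) or 5d (negative) each.
clusters_positive_days = 2 + 2
clusters_negative_days = 5 + 5

total_positive_days = (
    hours_to_days(sdk_positive_hours)
    + integration_positive_days
    + clusters_positive_days
)
total_negative_days = (
    hours_to_days(sdk_negative_hours)
    + integration_negative_days
    + clusters_negative_days
)

print(total_positive_days, total_negative_days)
```

Keeping the per-area inputs as named values makes it easy to rerun the numbers when a test is closed as stable or a new one is added to the lists.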

NikitaSkrynnik commented Nov 19, 2024

Tests by priority


HIGH

  1. TestNSMGRHealEndpoint_DatapathHealthy_CtrlPlaneBroken (4h to 12h)
  2. Test_NSC_ConnectsTo_vl3NSE (4h to 12h)
  3. TestNSMGR_HealEndpoint/Local_New (4h to 12h)
  4. TestPassThrough/TestDeleteToDown (4h to 12h)
  5. TestIpam_policies (~3d)

Total positive: 16h + 3d = 5d
Total negative: 48h + 3d = 10d
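
The HIGH priority totals mix hours and days, so the conversion is worth spelling out. The sketch below is illustrative only: the names are hypothetical, and the 7-hour working day is an assumption that is not stated in the issue (it is the value under which both totals come out as quoted).

```python
# Hypothetical re-derivation of the HIGH priority totals.
# Assumption: one working day is 7 hours; the issue does not state this.
HOURS_PER_DAY = 7

# (best case hours, worst case hours) for the four tests estimated at "4h to 12h".
hourly_estimates = [(4, 12)] * 4

# TestIpam_policies is estimated at roughly 3 days in both cases.
ipam_days = 3

positive_hours = sum(best for best, _ in hourly_estimates)
negative_hours = sum(worst for _, worst in hourly_estimates)

# Convert the hour totals to whole days before adding the day-based estimate.
positive_days = round(positive_hours / HOURS_PER_DAY) + ipam_days
negative_days = round(negative_hours / HOURS_PER_DAY) + ipam_days

print(positive_days, negative_days)
```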

MEDIUM
TODO

Status: Blocked