For sharing review comments #1

tnqn · 2023-08-14T09:51:09Z

No description provided.

Support CIDR in group for security policy

Change thumbprint from sha256 to sha1

Fix nsx requests not using token while token exists

1. Support MatchExpression Operator: 'NotIn', 'Exists', 'DoesNotExist' for VM/Pod/Namespace label selectors in SecurityPolicy.
2. NSX doesn't support MatchExpression Operator 'In' and currently allows only five criteria, which would result in a gigantic group expression body being passed to NSX. So allow just one Operator 'In' MatchExpression with at most five values in it.
3. Add NSX-T limitation and NSGroup Criteria check.

…Support Add MatchExpression support for SecurityPolicy

Update some log output Remove unused test case from transport_test.go

Roundtrip should only return err from base().RoundTrip

This patch will fix 2 issues: 1. start of portRange was mistakenly set into the sourcePorts property in NSX-T. 2. json.Marshal(rule) cannot detect the change of SecurityPolicyPort.

Because that some existing container may already use this port.

Add priority range in SecurityPolicy definition

version and create resources by JWT

Remove application/x-www-form-urlencoded from header

1. When building the expression, if there's only "ncp/pod" in the "value" property, the "ncp/pod" will appear in place of the "tag equals", not of the "scope equals" which we are expecting. This patch will change "ncp/pod" to "ncp/pod|" to fix this issue. 2. If namespaceSelector not set in the peer, the rule should select the pod/vm in the same namespace instead of all namespaces.

Change the probe port from 8383 to 8384

Fix SecurityPolicyPort issues

…ondition_kv Fix the condition issues for pod/vm selector

Ensure all criterion mixed, part 1

Auto decrement page size when 60576 errcode appears

This patch refactors PR#55 and the criteria produced for PodSelector, VMSelector and NamespaceSelector, and makes the criteria always be created as mixed ones by changing the cluster tag member type. For PodSelector and VMSelector, the cluster tag member type is set to Segment. For NamespaceSelector, the cluster tag member type is set to SegmentPort.

Change certieria produced as a mixed one

This patch is to add MatchExpression UnitTests, fix some typo and a minor code refactor in updateMixedExpressionsMatchExpression

Enforce user cannot create SecPolicy CR in sys ns

Add MatchExpression UnitTests

Use gomonkey to mock struct method Use gomock to mock interface method Update gc function so it could be tested

Add UT for garbage collector

Api manager in config may include port and scheme when searching local thumbprint, it should move the port part and only compare host

…ntegration Support to create subnetport in subnetset

IPPool controller added

Get port number by subnet ID from subnetport store

"Project" and "External_ipv4_blocks" are retrieved from VPCNetworkConfiguration CR instead of NCP ConfigMap. Remove the validation check for these 2 options.

…ve_config Remove validation check for default_project and external_ipv4_blocks

1. Read the real StaticIPAllocation from CR during building NSX subnet. 2. Set StaticIPAllocation to true in default subnetset. 3. Correct the finalizer names of subnet/subnetset.

…rties Fix several subnet/subnetset issues

Fix errors when deleting ip block from ipblock store

tnqn · 2023-08-14T09:56:54Z

cmd/main.go

+	logf.SetLogger(logger.ZapLogger())
+	cf, err = config.NewNSXOperatorConfigFromFile()
+	if err != nil {
+		log.Error(err, "load config file error")
+		os.Exit(1)
+	}
+
+	if os.Getenv("NSX_OPERATOR_NAMESPACE") != "" {
+		nsxOperatorNamespace = os.Getenv("NSX_OPERATOR_NAMESPACE")
+	}
+
+	if cf.HAEnabled() {
+		log.Info("HA mode enabled")
+	} else {
+		log.Info("HA mode disabled")
+	}
+
+	if metrics.AreMetricsExposed(cf) {
+		metrics.InitializePrometheusMetrics()
+	}


This logic should not be in init(): init functions run whenever the package is imported, which means a config file must be prepared even when running unit tests for the cmd package.
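One possible shape for this, as a minimal sketch: the startup logic moves into an ordinary function that main calls explicitly, so importing the package has no side effects. The names Config, loadConfig and setup, and the file path, are hypothetical stand-ins and not the operator's real API.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config is a hypothetical stand-in for the operator configuration.
type Config struct {
	HAEnabled bool
}

// loadConfig is a hypothetical loader. It fails on an empty path to mimic
// a missing config file; it does not read anything from disk.
func loadConfig(path string) (*Config, error) {
	if path == "" {
		return nil, errors.New("config file path is empty")
	}
	return &Config{HAEnabled: true}, nil
}

// setup holds what used to live in init(). Nothing here runs at import
// time, so tests can import the package without preparing a config file.
func setup(path string) (*Config, error) {
	cf, err := loadConfig(path)
	if err != nil {
		return nil, fmt.Errorf("load config file error: %w", err)
	}
	return cf, nil
}

func main() {
	// The path is illustrative only.
	cf, err := setup("operator.ini")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("HA mode enabled:", cf.HAEnabled)
}
```

A unit test can then call setup with whatever input it needs, or skip it entirely.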

tnqn · 2023-08-14T10:01:32Z

cmd/main.go

+	nsxClient := nsx.GetClient(cf)
+	if nsxClient == nil {
+		log.Error(err, "failed to get nsx client")


If the func can fail to return a valid nsxClient, it should return a non-nil error to indicate that, instead of returning a nil nsxClient. Currently the code is contradictory: err is always nil because it is not received from the function, and the function never returns a nil nsxClient even when it fails (instead, it just logs an error).

nsx-operator/pkg/nsx/client.go

Lines 188 to 197 in acf8557

	if !nsxClient.NSXCheckVersion(SecurityPolicy) {
		err := errors.New("SecurityPolicy feature support check failed")
		log.Error(err, "initial NSX version check for SecurityPolicy got error")
	}
	if !nsxClient.NSXCheckVersion(ServiceAccount) {
		err := errors.New("NSXServiceAccount feature support check failed")
		log.Error(err, "initial NSX version check for NSXServiceAccount got error")
	}

	return nsxClient
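A sketch of the error-returning variant, under the assumption that a failed feature check should be treated as fatal (the quoted code only logs it, so this is a design choice rather than existing behavior). Client, checkVersion and NewClient are hypothetical names.

```go
package main

import "fmt"

// Client is a hypothetical stand-in for the NSX client.
type Client struct {
	supported map[string]bool
}

// checkVersion mimics a feature support check: it reports whether the
// given feature is supported.
func (c *Client) checkVersion(feature string) bool {
	return c.supported[feature]
}

// NewClient returns a non-nil error when a required feature check fails,
// so callers test err instead of guessing from a nil client.
func NewClient(supported map[string]bool) (*Client, error) {
	c := &Client{supported: supported}
	for _, feature := range []string{"SecurityPolicy", "ServiceAccount"} {
		if !c.checkVersion(feature) {
			return nil, fmt.Errorf("%s feature support check failed", feature)
		}
	}
	return c, nil
}

func main() {
	_, err := NewClient(map[string]bool{"SecurityPolicy": true})
	fmt.Println("error:", err)
}
```

The caller in main.go would then log the returned err, which actually carries the failure reason.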

tnqn · 2023-08-14T10:03:09Z

cmd/main.go

+	}
+
+	if cf.CoeConfig.EnableVPCNetwork && commonService.NSXClient.NSXCheckVersion(nsx.VPC) {
+		log.V(1).Info("VPC mode enabled")


A one-time log that indicates a module is enabled could be at V(0): such logs are few but provide key information.

tnqn · 2023-08-14T10:10:24Z

pkg/nsx/services/subnet/subnet.go

+	if subnetService == nil {
+		lock.Lock()
+		defer lock.Unlock()
+		if subnetService == nil {


If the func is supposed to be called concurrently, it is not really thread-safe, as the check at L48 runs without the lock held.
If the func is not supposed to be called concurrently, the lock is not required at all.
From the usage, I think the lock and the global variable subnetService make it complicated, and it would be hard to use SubnetService in unit tests, as you would need to reset the global variable every time a test writes it.
A simpler and clearer way to initialize the struct is to expose a constructor to main.go and inject the service into the modules that rely on it:

func NewSubnetService(service common.Service) (*SubnetService, error)
...
// main.go
subnetService, err := subnet.NewSubnetService(commonService)
subnetReconciler.Service = subnetService
...
subnetsetReconciler.Service = subnetService
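To illustrate the testability argument, here is a self-contained sketch of constructor injection with simplified, hypothetical types (CommonService and the reconciler structs are stand-ins). Each caller, including each test, builds its own service, so no global needs resetting.

```go
package main

import "fmt"

// CommonService is a hypothetical stand-in for common.Service.
type CommonService struct{ Name string }

// SubnetService wraps the shared service; there is no package-level
// variable and no lock.
type SubnetService struct{ Service CommonService }

// NewSubnetService is a plain constructor.
func NewSubnetService(service CommonService) (*SubnetService, error) {
	return &SubnetService{Service: service}, nil
}

// The reconcilers receive the service instead of looking it up globally.
type SubnetReconciler struct{ Service *SubnetService }
type SubnetSetReconciler struct{ Service *SubnetService }

func main() {
	svc, err := NewSubnetService(CommonService{Name: "nsx"})
	if err != nil {
		panic(err)
	}
	subnetReconciler := &SubnetReconciler{Service: svc}
	subnetSetReconciler := &SubnetSetReconciler{Service: svc}
	// Both reconcilers share the single instance created in main.
	fmt.Println(subnetReconciler.Service == subnetSetReconciler.Service)
}
```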

tnqn · 2023-08-14T10:15:35Z

pkg/nsx/services/subnet/subnet.go

+	wg := sync.WaitGroup{}
+	wgDone := make(chan bool)
+	fatalErrors := make(chan error)
+	subnetService := &SubnetService{
+		Service: service,
+		SubnetStore: &SubnetStore{
+			ResourceStore: common.ResourceStore{
+				Indexer: cache.NewIndexer(keyFunc, cache.Indexers{
+					common.TagScopeSubnetCRUID: subnetIndexFunc,
+				}),
+				BindingType: model.VpcSubnetBindingType(),
+			},
+		},
+	}
+
+	wg.Add(1)
+	go subnetService.InitializeResourceStore(&wg, fatalErrors, ResourceTypeSubnet, nil, subnetService.SubnetStore)
+	go func() {
+		wg.Wait()
+		close(wgDone)
+	}()
+	select {
+	case <-wgDone:
+		break
+	case err := <-fatalErrors:
+		close(fatalErrors)
+		return subnetService, err
+	}


I think the essence of the code is that it wants to get the result of InitializeResourceStore synchronously, regardless of success or failure. The use of channels and a WaitGroup makes it really complicated. Why not just make InitializeResourceStore return an error?

func NewSubnetService(service common.Service) (*SubnetService, error) {
	subnetService := &SubnetService{
		Service: service,
		SubnetStore: &SubnetStore{
			ResourceStore: common.ResourceStore{
				Indexer: cache.NewIndexer(keyFunc, cache.Indexers{
					common.TagScopeSubnetCRUID: subnetIndexFunc,
				}),
				BindingType: model.VpcSubnetBindingType(),
			},
		},
	}
	if err := subnetService.InitializeResourceStore(ResourceTypeSubnet, nil, subnetService.SubnetStore); err != nil {
		return nil, err
	}
	return subnetService, nil
}

Besides, closing a channel can make goroutines that are still writing to it panic, which is the case for InitializeResourceStore.
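Both points can be shown in a small sketch. Store, initializeStore and sendAfterClose are hypothetical names used only for illustration; the first function returns its error directly, and the second demonstrates that a send on a closed channel panics.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a hypothetical stand-in for a resource store.
type Store struct{ items []string }

// initializeStore fills the store synchronously and reports failure
// through its return value, with no channel or WaitGroup involved.
func initializeStore(store *Store, list func() ([]string, error)) error {
	items, err := list()
	if err != nil {
		return fmt.Errorf("initialize store: %w", err)
	}
	store.items = items
	return nil
}

// sendAfterClose shows why closing a channel that a goroutine may still
// write to is dangerous: the send panics. The panic is recovered here
// only so the demonstration can report it.
func sendAfterClose() (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	ch := make(chan error, 1)
	close(ch)
	ch <- errors.New("late error") // panics: send on closed channel
	return false
}

func main() {
	store := &Store{}
	err := initializeStore(store, func() ([]string, error) {
		return nil, errors.New("NSX unreachable")
	})
	fmt.Println("init error:", err)
	fmt.Println("send after close panicked:", sendAfterClose())
}
```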

For the Subnet service, we can simply make InitializeResourceStore return an error. I think InitializeResourceStore was designed to use channels and a WaitGroup so that multiple resource stores can be initialized simultaneously in different goroutines. Taking the firewall service as an example, it initializes four kinds of resource stores simultaneously.

https://github.com/vmware-tanzu/nsx-operator/blob/vpc_dev/pkg/nsx/services/securitypolicy/firewall.go#L70

But I see that most services initialize ONLY one store, so it also makes sense to simplify InitializeResourceStore.
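If both needs are to be served, one option (an assumption, not the repository's code) is to keep each initializer a plain error-returning function and add a small helper for services that want several stores loaded in parallel. The helper name initAll and the sample initializers are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errStoreInit = errors.New("store initialization failed")

// okInit and failingInit are sample initializers for the demonstration.
func okInit() error      { return nil }
func failingInit() error { return errStoreInit }

// initAll runs every initializer in its own goroutine, waits for all of
// them, and returns one of the errors, or nil when all succeed.
func initAll(inits ...func() error) error {
	// Buffered to len(inits) so a failing goroutine never blocks on send.
	errs := make(chan error, len(inits))
	var wg sync.WaitGroup
	for _, fn := range inits {
		wg.Add(1)
		go func(f func() error) {
			defer wg.Done()
			if err := f(); err != nil {
				errs <- err
			}
		}(fn)
	}
	wg.Wait()
	// Closing is safe here because every sender has already finished.
	close(errs)
	// Receiving from a closed, empty channel yields nil.
	return <-errs
}

func main() {
	fmt.Println("all ok:", initAll(okInit, okInit))
	fmt.Println("one failing:", initAll(okInit, failingInit, okInit))
}
```

A service with a single store simply calls its initializer directly and skips the helper.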

tnqn · 2023-08-14T10:50:40Z

pkg/nsx/services/subnet/subnet.go

+	vpcInfo, err := common.ParseVPCResourcePath(vpcList.Items[0].Status.NSXResourcePath)
+	if err != nil {
+		return "", err
+	}


How does it guarantee that NSXResourcePath has been set when processing this subnet? If there is no guarantee, errors may be generated sporadically, which might alarm users who are monitoring component logs.
If having a valid NSXResourcePath is a precondition for creating a subnet, perhaps it could skip enqueuing a subnet when the VPC is not ready and enqueue the associated subnets once the VPC is ready.
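The gating idea can be sketched without any Kubernetes machinery. The Controller type and its methods below are hypothetical, and the resource path is only an illustrative value. In a controller-runtime based controller the same effect would typically come from an extra watch on the VPC object whose map function enqueues the dependent subnets.

```go
package main

import "fmt"

// Controller is a hypothetical, in-memory model of the event handling.
type Controller struct {
	vpcPath map[string]string   // namespace -> NSX resource path of its VPC
	pending map[string][]string // namespace -> subnets waiting for the VPC
	queue   []string            // subnets ready to be reconciled
}

func NewController() *Controller {
	return &Controller{
		vpcPath: map[string]string{},
		pending: map[string][]string{},
	}
}

// OnSubnetEvent enqueues the subnet only when the VPC of its namespace is
// realized; otherwise the subnet is parked and no error is produced.
func (c *Controller) OnSubnetEvent(ns, subnet string) {
	if c.vpcPath[ns] == "" {
		c.pending[ns] = append(c.pending[ns], subnet)
		return
	}
	c.queue = append(c.queue, subnet)
}

// OnVPCReady records the resource path and releases the parked subnets.
func (c *Controller) OnVPCReady(ns, path string) {
	c.vpcPath[ns] = path
	c.queue = append(c.queue, c.pending[ns]...)
	delete(c.pending, ns)
}

func main() {
	c := NewController()
	c.OnSubnetEvent("ns1", "subnet-a")
	fmt.Println("queued before VPC is ready:", len(c.queue))
	c.OnVPCReady("ns1", "/orgs/default/projects/p1/vpcs/v1") // illustrative path
	fmt.Println("queued after VPC is ready:", c.queue)
}
```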

tnqn · 2023-08-14T10:57:17Z

pkg/controllers/subnetport/subnetport_controller.go

+	if len(obj.Spec.SubnetSet) > 0 && len(obj.Spec.Subnet) > 0 {
+		err := errors.New("subnet and subnetset should not be configured at the same time")
+		log.Error(err, "failed to get subnet/subnetset of the subnetport", "subnetport", req.NamespacedName)
+		return common.ResultNormal, err
+	}


The CRD schema could enforce this validation so that at most one of SubnetSet/Subnet is set. Otherwise, returning the error to make it retry is pointless, since retrying cannot fix an invalid spec.
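One way to express this, assuming the CRD is generated with controller-gen and the cluster supports CEL validation rules, is an XValidation marker on the spec type. The struct below is a simplified, hypothetical version of the spec, and the JSON field names are assumptions. The validateSpec function only mirrors the rule in Go so the logic can be checked locally.

```go
package main

import (
	"errors"
	"fmt"
)

// SubnetPortSpec is a simplified, hypothetical version of the CRD spec.
// The marker asks controller-gen to emit a CEL rule that rejects objects
// setting both fields, so the controller never sees them.
// +kubebuilder:validation:XValidation:rule="!(has(self.subnet) && has(self.subnetSet))",message="subnet and subnetSet are mutually exclusive"
type SubnetPortSpec struct {
	Subnet    string `json:"subnet,omitempty"`
	SubnetSet string `json:"subnetSet,omitempty"`
}

// validateSpec mirrors the rule above for local checking.
func validateSpec(spec SubnetPortSpec) error {
	if spec.Subnet != "" && spec.SubnetSet != "" {
		return errors.New("subnet and subnetSet are mutually exclusive")
	}
	return nil
}

func main() {
	fmt.Println(validateSpec(SubnetPortSpec{Subnet: "s1", SubnetSet: "set1"}))
	fmt.Println(validateSpec(SubnetPortSpec{Subnet: "s1"}))
}
```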

tnqn · 2023-08-14T10:57:35Z

pkg/controllers/subnetport/subnetport_controller.go

+		// if attachmentRef.Name == "" {
+		// 	defaultVMSubnet = true
+		// }
+		old_status := obj.Status.DeepCopy()


Use a consistent naming style: oldStatus instead of old_status.

tnqn · 2023-08-14T11:23:16Z

pkg/controllers/subnetport/subnetport_controller.go

+		subnetPath = subnet.Status.NSXResourcePath
+		if len(subnetPath) == 0 {
+			err := fmt.Errorf("empty NSX resource path from subnet %s", subnet.Name)
+			return subnetPath, err
+		}


I guess there is no guarantee that NSXResourcePath is non-empty here. Instead of returning an error, perhaps it could also skip enqueueing the subnetPort when its subnet's status is not ready, watch the subnet's status update events, and enqueue the associated subnetPorts that are not realized yet.

tnqn · 2023-08-14T11:31:56Z

pkg/controllers/subnetport/subnetport_controller.go

+		subnetPath, err := r.AllocateSubnetFromSubnetSet(obj, subnetSet)
+		if err != nil {
+			return subnetPath, err
+		}
+		return subnetPath, nil


I guess this is related to where @dantingl raised the race condition. Besides adding a lock to the method, there may be another solution:
subnetsetController also watches subnetPorts and checks whether the current capacity is sufficient by comparing the expected capacity with the number of current subnetPorts. In its reconcile func, it auto-scales the subnetSet when the capacity is not enough.
Then subnetPort just waits for a subnet/subnetSet to become available, which unifies the implementation.
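The capacity comparison at the heart of this proposal is simple arithmetic. The sketch below assumes every subnet in a set has the same port capacity, which may not hold in practice; subnetsNeeded is a hypothetical helper name.

```go
package main

import "fmt"

// subnetsNeeded returns how many subnets must be added to a subnetSet so
// that its total capacity covers the given number of subnetPorts.
// It assumes each subnet can hold capacityPerSubnet ports.
func subnetsNeeded(ports, subnets, capacityPerSubnet int) int {
	if capacityPerSubnet <= 0 {
		return 0 // invalid capacity: nothing sensible to scale to
	}
	// Ceiling division: subnets required to hold all ports.
	required := (ports + capacityPerSubnet - 1) / capacityPerSubnet
	if required <= subnets {
		return 0
	}
	return required - subnets
}

func main() {
	fmt.Println(subnetsNeeded(10, 1, 8)) // 10 ports need 2 subnets, 1 exists
	fmt.Println(subnetsNeeded(8, 1, 8))  // exactly full, nothing to add
}
```

Because only the subnetSet reconciler creates subnets in this design, two subnetPorts arriving at the same time cannot both allocate a new subnet.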

tnqn · 2023-08-14T11:44:36Z

pkg/controllers/subnetset/subnetset_controller.go

+	matchedCondition := getExistingConditionOfType(newCondition.Type, subnetset.Status.Conditions)
+
+	if reflect.DeepEqual(matchedCondition, newCondition) {
+		log.V(2).Info("conditions already match", "New Condition", newCondition, "Existing Condition", matchedCondition)


Don't use spaces to join multiple words in the keys of structured logging; otherwise it can be hard to tell keys from values in some cases, and hard to extract keys and values with tools. Some suggestions about logging:
https://github.com/tnqn/code-review-comments#logging
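The effect is easy to see with any key=value log format. The sketch below uses the standard library's log/slog text handler (Go 1.21 or later) rather than the logger used in the reviewed code, so it is an illustration of the principle only. logLine and hasField are hypothetical helpers.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"strings"
)

// logLine emits one structured record with the given key and returns the
// rendered text line.
func logLine(key, value string) string {
	var buf bytes.Buffer
	logger := slog.New(slog.NewTextHandler(&buf, nil))
	logger.Info("conditions already match", key, value)
	return buf.String()
}

// hasField reports whether the rendered line contains the exact
// key=value fragment, the way a simple extraction tool would look for it.
func hasField(line, field string) bool {
	return strings.Contains(line, field)
}

func main() {
	// A single-word key renders as a plain key=value pair.
	fmt.Print(logLine("newCondition", "Ready"))
	// A key with a space cannot be rendered as a bare token.
	fmt.Print(logLine("New Condition", "Ready"))
}
```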

tnqn · 2023-08-14T11:47:08Z

pkg/controllers/subnetset/subnetset_controller.go

+	return ctrl.NewControllerManagedBy(mgr).
+		For(&v1alpha1.SubnetSet{}).
+		WithOptions(controller.Options{
+			MaxConcurrentReconciles: runtime.NumCPU(),


It doesn't make sense to use the number of CPUs as the concurrency for network-intensive tasks: the efficiency of the application would vary across environments and be unpredictable. It may run well when tested in one environment but badly in another.
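A fixed default that can be overridden by configuration keeps behavior the same on every machine. This is a minimal sketch; the default value of 8 and the function name are assumptions, not values taken from the project.

```go
package main

import "fmt"

// defaultMaxConcurrentReconciles is a hypothetical default. It does not
// depend on the host, so behavior is the same in every environment.
const defaultMaxConcurrentReconciles = 8

// maxConcurrentReconciles returns the configured value when it is
// positive and falls back to the fixed default otherwise.
func maxConcurrentReconciles(configured int) int {
	if configured > 0 {
		return configured
	}
	return defaultMaxConcurrentReconciles
}

func main() {
	fmt.Println(maxConcurrentReconciles(0))  // unset: use the default
	fmt.Println(maxConcurrentReconciles(16)) // explicitly configured
}
```

The result would be passed as MaxConcurrentReconciles in place of runtime.NumCPU().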

tnqn · 2023-08-14T12:03:33Z

pkg/controllers/subnetset/subnetset_controller.go

+		portNums := len(common.ServiceMediator.GetPortsOfSubnet(*subnet.Id))
+		if portNums > 0 {
+			continue
+		}
+		if err := r.Service.DeleteSubnet(subnet); err != nil {
+			log.Error(err, "fail to delete subnet from subnetset cr", "ID", *subnet.Id)
+			hitError = true
+		}


It could conflict with the code that allocates an empty subnet. If #1 (comment) is accepted, perhaps garbage collection could be implemented in Reconcile, which could be retriggered by always returning RequeueAfter set to the GCInterval.
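A sketch of that shape, with local stand-in types: Result mirrors only the RequeueAfter field of controller-runtime's reconcile result, and gcInterval, Subnet and the delete callback are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Result mirrors the RequeueAfter field of a reconcile result.
type Result struct{ RequeueAfter time.Duration }

// gcInterval is a hypothetical garbage collection period.
const gcInterval = 60 * time.Second

// Subnet is a simplified stand-in carrying its current port count.
type Subnet struct {
	ID    string
	Ports int
}

// reconcile deletes subnets that have no ports and always asks to be
// called again after gcInterval, so cleanup runs in the same serialized
// path as allocation instead of in a separate goroutine.
func reconcile(subnets []Subnet, deleteSubnet func(id string) error) ([]Subnet, Result, error) {
	var kept []Subnet
	var firstErr error
	for _, s := range subnets {
		if s.Ports > 0 {
			kept = append(kept, s)
			continue
		}
		if err := deleteSubnet(s.ID); err != nil {
			if firstErr == nil {
				firstErr = err
			}
			kept = append(kept, s) // deletion failed: keep it for the next pass
		}
	}
	return kept, Result{RequeueAfter: gcInterval}, firstErr
}

func main() {
	subnets := []Subnet{{ID: "a", Ports: 2}, {ID: "b", Ports: 0}}
	kept, res, err := reconcile(subnets, func(id string) error { return nil })
	fmt.Println(len(kept), res.RequeueAfter, err)
}
```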

Initialize subnet service once then inject it to multiple reconcilers instead of initializing service in different reconcilers with a lock ensuring the initialization is invoked once. tnqn#1 (comment)

Add LastTansitionTime in CR status which helps understand when the issue was encountered/resolved when viewing the CR. tnqn#1 (comment)

Remove Subnet/SubnetPort service from mediator as service mediato will be removed from nsx-operator. Subnet/SubnetPort service will only be instantiated once and passed as pointer across controllers. tnqn#1 (comment)

heypnus and others added 30 commits January 26, 2022 16:21

Merge pull request vmware-tanzu#47 from heypnus/security_policy/block

23cd9f5

Support CIDR in group for security policy

Fix nsx requests not using token while token exists

2acaf94

Change thumbprint from sha256 to sha1

Merge pull request vmware-tanzu#46 from TaoZou1/fix

2cc0a71

Fix nsx requests not using token while token exists

Merge pull request vmware-tanzu#38 from timdengyun/AddMatchExpression…

545bdb6

…Support Add MatchExpression support for SecurityPolicy

Roundtrip should only return err from base().RoundTrip

7997527

Update some log output Remove unused test case from transport_test.go

Merge pull request vmware-tanzu#48 from TaoZou1/error

7f95d9b

Roundtrip should only return err from base().RoundTrip

Fix SecurityPolicyPort issues

a354568

This patch will fix 2 issues: 1. start of portRange was mistakenly set into the sourcePorts property in NSX-T. 2. json.Marshal(rule) cannot detect the change of SecurityPolicyPort.

Change the probe port from 8383 to 8384

276343e

Because that some existing container may already use this port.

Add priority range in SecurityPolicy definition

8bfcfc5

Merge pull request vmware-tanzu#45 from dantingl/priority

037a2d3

Add priority range in SecurityPolicy definition

Remove 'application/x-www-form-urlencoded' from header while get

95d7d44

version and create resources by JWT

Merge pull request vmware-tanzu#53 from TaoZou1/jwtfix

d5f78d3

Remove application/x-www-form-urlencoded from header

Merge pull request vmware-tanzu#52 from heypnus/fix/probe_port

4cb1186

Change the probe port from 8383 to 8384

Merge pull request vmware-tanzu#50 from heypnus/security_policy/fix_port

3e59543

Fix SecurityPolicyPort issues

Merge pull request vmware-tanzu#49 from heypnus/security_policy/fix_c…

ed0b261

…ondition_kv Fix the condition issues for pod/vm selector

Ensure all criterion mixed, part 1

def5a50

Merge pull request vmware-tanzu#55 from heypnus/security_policy/mixed3

fcfaf92

Ensure all criterion mixed, part 1

Enforce user cannot create SecPolicy CR in sys ns

01e8c69

Auto decrement page size when 60576 errcode appears

1a254ca

Merge pull request vmware-tanzu#54 from zhengxiexie/zhengxie/scale_page

6607d44

Auto decrement page size when 60576 errcode appears

Merge pull request vmware-tanzu#57 from timdengyun/ChangeCriterionMixed

8d21719

Change certieria produced as a mixed one

Add MatchExpression UnitTests

e33219e

This patch is to add MatchExpression UnitTests, fix some typo and a minor code refactor in updateMixedExpressionsMatchExpression

Merge pull request vmware-tanzu#56 from ggverma/enforce-no-sys-ns

02d7ddb

Enforce user cannot create SecPolicy CR in sys ns

Merge pull request vmware-tanzu#51 from timdengyun/AddMatchExpressionUT

951daa7

Add MatchExpression UnitTests

Add some UT for securitypolicy_controller, config, error

295df85

Use gomonkey to mock struct method Use gomock to mock interface method Update gc function so it could be tested

Merge pull request vmware-tanzu#43 from TaoZou1/ut

686eed6

Add UT for garbage collector

Fix not finding thumbprint

3bd1243

Api manager in config may include port and scheme when searching local thumbprint, it should move the port part and only compare host

heypnus and others added 11 commits August 10, 2023 21:53

Merge pull request vmware-tanzu#269 from heypnus/vpc/subnetset_port_i…

43f5db3

…ntegration Support to create subnetport in subnetset

Get port number by subnet ID from subnetport store

5d17ce4

IPPool controller added

52ea759

Merge pull request vmware-tanzu#160 from zhengxiexie/ippool

7566aea

IPPool controller added

Merge pull request vmware-tanzu#275 from heypnus/vpc/GetPortsOfSubnet

d6c96f7

Get port number by subnet ID from subnetport store

Remove validation check for default_project and external_ipv4_blocks

51e997c

"Project" and "External_ipv4_blocks" are retrieved from VPCNetworkConfiguration CR instead of NCP ConfigMap. Remove the validation check for these 2 options.

Merge pull request vmware-tanzu#276 from lxiaopei/topic/lxiaopei/remo…

9864be7

…ve_config Remove validation check for default_project and external_ipv4_blocks

Fix several subnet/subnetset issues

1d2e592

1. Read the real StaticIPAllocation from CR during building NSX subnet. 2. Set StaticIPAllocation to true in default subnetset. 3. Correct the finalizer names of subnet/subnetset.

Merge pull request vmware-tanzu#278 from heypnus/vpc/fix/subnet_prope…

70d156a

…rties Fix several subnet/subnetset issues

Fix errors when deleting ip block from ipblock store

d3ed048

Merge pull request vmware-tanzu#277 from seanpang-vmware/fixipblockdel

acf8557

Fix errors when deleting ip block from ipblock store

jwsui mentioned this pull request Jan 15, 2024

Subnet service refactor vmware-tanzu/nsx-operator#479

Merged

jwsui added a commit to jwsui/nsx-operator that referenced this pull request Jan 15, 2024

Add LastTansitionTime in CR status

d4642b7

Add LastTansitionTime in CR status which helps understand when the issue was encountered/resolved when viewing the CR. tnqn#1 (comment)

jwsui mentioned this pull request Jan 15, 2024

Add LastTansitionTime in CR status vmware-tanzu/nsx-operator#481

Merged

jwsui added a commit to jwsui/nsx-operator that referenced this pull request Jan 15, 2024

Add LastTansitionTime in CR status

9312ab1

Add LastTansitionTime in CR status which helps understand when the issue was encountered/resolved when viewing the CR. tnqn#1 (comment)

jwsui added a commit to jwsui/nsx-operator that referenced this pull request Jan 15, 2024

Add LastTansitionTime in CR status

89d99cd

Add LastTansitionTime in CR status which helps understand when the issue was encountered/resolved when viewing the CR. tnqn#1 (comment)

jwsui added a commit to jwsui/nsx-operator that referenced this pull request Jan 15, 2024

Add LastTansitionTime in CR status

8dff4c5

Add LastTansitionTime in CR status which helps understand when the issue was encountered/resolved when viewing the CR. tnqn#1 (comment)

jwsui added a commit to jwsui/nsx-operator that referenced this pull request Jan 18, 2024

Add LastTansitionTime in CR status

48338e4

Add LastTansitionTime in CR status which helps understand when the issue was encountered/resolved when viewing the CR. tnqn#1 (comment)
