Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default WARM_PREFIX_TARGET to 1 #1515

Merged
merged 2 commits into from
Jun 17, 2021

Conversation

jayanthvn
Copy link
Contributor

@jayanthvn jayanthvn commented Jun 17, 2021

What type of PR is this?
enhancement

Which issue does this PR fix:
WARM_PREFIX_TARGET is implemented similar to WARM_ENI_TARGET. Today we do support WARM_ENI_TARGET to 0 but the problem with this [1] with custom networking we allocate secondary ENI since primary ENI is not used and if there are no pods then IPAMD will determine this as an extra ENI since the target is 0 and free the ENI. Later IPAMD will see "available is 0 and WARM_ENI_TARGET is 0" so allocates a new ENI. This happens continuously by the reconciler adding to more EC2 calls on the account. This is tracked with this issue - #1451 [2] Without custom networking, scaling will be very slow - Initially on nodeInit and ENI will allocated Max secondary IPs and once those IPs are used, the next ENI will be allocated after the next reconciler starts and enters here - https://github.com/aws/amazon-vpc-cni-k8s/blob/master/pkg/ipamd/ipamd.go#L911 and then few more seconds for IPs to be available in IMDS so until then pods will be stuck in container creating. Similar fix done in this PR should be done even for WARM_ENI_TARGET but that will break upgrades if someone is already using this so it would need reach out and documentation update.

Similar issue will happen with WARM_PREFIX TARGET, prefixes will be attached and removed in a loop by the reconciler and when more prefixes are needed we need to wait for the reconciler.

<Sorry @anguslees copying your explanation below since it will help our documentation :D :D >
[1]"I am IP constrained" (allocate/free to maintain a small warm pool -> WARM_IP_TARGET=$small; WARM_ENI/PREFIX=$small or 0; possibly choose SIP-mode)
[2]"I am AWS-API-calls constrained" (allocate/free in large batches -> WARM_ENI/PREFIX=$small or $big)
[3]"I want to burst as quickly as I can" (maintain a large warm pool -> WARM_ENI/PREFIX=$big)

What does this PR do / Why do we need it:
[1] With prefix delegation, we won't support WARM_{PREFIX,IP}_TARGET = 0 and MINIMUM_IP_TARGET=0.
[2] default WARM_PREFIX_TARGET to 1.

If WARM_{PREFIX,IP}_TARGET = 0/unset and MINIMUM_IP_TARGET=0/unset, then aws-node will fail to start with an error.

WARM_PREFIX_TARGET = 0 but WARM_IP_TARGET or MINIMUM_IP_TARGET is non zero will work since WARM_IP_TARGET /MINIMUM_IP_TARGET overrides WARM_PREFIX_TARGET

One combination which this PR won't handle is WARM_PREFIX_TARGET = non-zero but WARM_IP_TARGET or MINIMUM_IP_TARGET is zero, since this is supported with secondary mode and this will be evaluated along with WARM_ENI_TARGET.

If an issue # is not available please add repro steps and logs from IPAMD/CNI showing the issue:
n/a

Testing done on this change:

kubectl logs aws-node-ppwcj -n kube-system
{"level":"info","ts":"2021-06-17T19:43:57.967Z","caller":"entrypoint.sh","msg":"Validating env variables ..."}
{"level":"error","ts":"2021-06-17T19:43:57.968Z","caller":"entrypoint.sh","msg":"Setting WARM_PREFIX_TARGET = 0 is not supported while WARM_IP_TARGET/MINIMUM_IP_TARGET is not set. Please configure either one of the WARM_{PREFIX/IP}_TARGET or MINIMUM_IP_TARGET env variables"}

Automation added to e2e:

No

Will this break upgrades or downgrades. Has updating a running cluster been tested?:
No

Does this change require updates to the CNI daemonset config files to work?:

No

Does this PR introduce any user-facing change?:

WARM_{PREFIX,IP}_TARGET = 0/unset and MINIMUM_IP_TARGET=0/unset is not supported with PD.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@jayanthvn jayanthvn requested a review from achevuru June 17, 2021 22:03
@jayanthvn jayanthvn changed the title Default warm prefix target to 1 Default WARM_PREFIX_TARGET to 1 Jun 17, 2021
@@ -1807,7 +1796,7 @@ func (c *IPAMContext) isDatastorePoolTooHigh() bool {
}

func (c *IPAMContext) warmPrefixTargetDefined() bool {
return c.warmPrefixTarget >= noWarmPrefixTarget && c.enableIpv4PrefixDelegation
return c.warmPrefixTarget >= defaultWarmPrefixTarget && c.enableIpv4PrefixDelegation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will we ever change defaultWarmPrefixTarget to a value than 0?
If we change it to be same thing like 2, then this read as c.warmPrefixTarget >= 2. I feel this should be something like c.warmPrefixTarget >= 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defaultWarmPrefixTarget is 1 in this PR and also setting just WARM_PREFIX_TARGET = 0 won't be supported. If someone sets WARM_IP_TARGET = num and WARM_PREFIX_TARGET = 0 or any num then WARM_IP_TARGET will take precedence.

scripts/entrypoint.sh Outdated Show resolved Hide resolved
Copy link
Contributor

@M00nF1sh M00nF1sh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High level comment:

  1. we should fix the reconcile logic. It's clearly a bug if it's allocate/deallocate ENI in loops under specific condition. From the code i read, it seems the purpose of poolTooLow := available < c.maxIPsPerENI*c.warmENITarget || (c.warmENITarget == 0 && available == 0) is when warmENITarget is 0, we only allocate one more ENI when available is 0. However, the ENIDeallocation logic is not honer above, which should be fixed.
  2. Ideally the fix can be as simple as normalize warmENITarget and warmIPTarget at startup, so that if warmENITarget==0, warmIPTarget==0, we normalize it to be as "warmIPTarget==1", and remove this hacky logic of c.warmENITarget == 0 && available == 0 above.
  3. Ideally warmPrefixTarget should be handled similar to warmENITarget as above. Setting it to 0 should give customer the same effect of warmIPTarget==1.

@M00nF1sh
Copy link
Contributor

I'm ok to have this restriction at beginning and relax it later with a better fix.

@jayanthvn
Copy link
Contributor Author

jayanthvn commented Jun 17, 2021

High level comment:

  1. we should fix the reconcile logic. It's clearly a bug if it's allocate/deallocate ENI in loops under specific condition. From the code i read, it seems the purpose of poolTooLow := available < c.maxIPsPerENI*c.warmENITarget || (c.warmENITarget == 0 && available == 0) is when warmENITarget is 0, we only allocate one more ENI when available is 0. However, the ENIDeallocation logic is not honer above, which should be fixed.
  2. Ideally the fix can be as simple as normalize warmENITarget and warmIPTarget at startup, so that if warmENITarget==0, warmIPTarget==0, we normalize it to be as "warmIPTarget==1", and remove this hacky logic of c.warmENITarget == 0 && available == 0 above.
  3. Ideally warmPrefixTarget should be handled similar to warmENITarget as above. Setting it to 0 should give customer the same effect of warmIPTarget==1.

Yes I agree, that's why would want to not support WARM_PREFIX_TARGET = 0 now. Later fix up WARM_{ENI,PREFIX}_TARGET which will need rewriting warm pool high and low logic. My proposal to fix this was also [2] which you have suggested. That will be correct since now, reconciler has to start, ENI/IP should be synced to IMDS and then IP will be available for the pods which will take few minutes. In the meantime if old IPs are free then those won't be available for next 30 seconds.

Copy link
Contributor

@M00nF1sh M00nF1sh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@jayanthvn jayanthvn merged commit 347e1db into aws:prefix-delegation-preview Jun 17, 2021
jayanthvn added a commit that referenced this pull request Jun 24, 2021
…ter (#1516)

* [Preview] Prefix delegation feature development (#1434)

* ENABLE_PREFIX_DELEGATION knob

WARM_PREFIX_TARGET knob

cr https://code.amazon.com/reviews/CR-40610031

* PD changes - dev only

* Cooldown prefix IP

* minor fixes to support prefix count

* Code cleanup

* Handle few corner cases

* Nitro based check

* With custom networking, do not get prefix for primary ENI

* Code refactor

* Handle graceful upgrade/enable PD from disable PD

* code refactoring

* Code refactoring

* fix computing too low IPs

* UT for prefix store

* Fix UTs and handle CR comments

* Clean up SDK code and fix model code generation

* fix format and merge induced error

* Merge broke the code

* Fix Dockerfile.test

* Added IPAMD UTs and fixed removeENI total count

* Couple more IPAMD UTs for PD

* UTs for awsutils/imds

* Handle graceful PD enable to disable knob

* get prefix list for non-pd case

* Prevent reconcile of prefix IPs in IP datastore

* Handle disable scenario

* fix formatting

* clean up comment

* Remove unnecessary debugs

* Handle PR comments

* formatting fix

* Remodelled PD datastore

* Fix up UTs and fix Prefix nil

* formatting

* PR comments - minor cosmetic changes

* removed the sdk override from makefile

* Internal repo merge added these lines

* Update config file

* Handle wrapper of DescribeNetworkInterfacesWithContext to take one eni

* RemoveUnusedENIFromStore was not accounting for prefixes deleted

* Removed hardcoding of 16

* Code refactor - merge ENI's secondary and prefix store into single store of CIDRs  (#1471)

* Code refactor - merge to single DB

* remove few debugs

* remove prefix store files

* PR comments

* Fix up CR comments

* formatting

* Updated UT cases

* UT and formatting

* Minor fixes

* Minor comments

* Updated /32 store term

* remove unused code

* Multi-prefix and WARM/MIN IP targets support with PD (#1477)

* Multi-pd and WARM targets support

* cleanup

* Updated variable names

* Default prefix count to -1

* Get stats should be computed on the fly since CIDR pool can have /32 or /28

* Support for warm prefix 0

* code review comments

* PD test cases and readme update (#1478)

* Traffic test case and readme update

* Added testcases for warm ip/min ip with PD

* Testcases for prefix count

* Testcase for warm prefix along with warm ip/min ip

* Updated traffic test case while PD mode is flipped

* Fix minor comments

* pr comments

* added pods per eni

* fix up count

* Support mixed instances with PD (#1483)

* Support mixed instances with PD

* fix up the log

* Optimization for prefixes allocation (#1500)

* optimization for prefixes

Prefix store optimization

* pr comment

* Fixup eni allocation with warm targets (#1512)

* Fixup eni allocation with warm targets

* fixup cidr count

* code comments and warm prefix 0

* Default WARM_PREFIX_TARGET to 1 (#1515)

* Handle prefix target 0

* pr commets

* Fix up UTs, was failing because of vendor

* No need to commit this

* make format

* needed for UT workflow

* IMDS code refactor

* PR comments - v1

Error with merge

* PR comments v2

* PR comments - v3

* Update logs

* PR comments - v4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants