[Bug]: IPAM allocation fails with "InvalidIpamPoolAllocationId" #28913

AlexBarth13 · 2023-01-16T13:24:59Z

Terraform Core Version

1.3.2

AWS Provider Version

4.32.0 and 4.50.0

Affected Resource(s)

aws_vpc_ipam_pool_cidr_allocation

Expected Behavior

I expected the IPAM allocation to be created successfully.

Actual Behavior

In our environment we use a multi-account setup. The IPAM pools are created in one account and shared via RAM with another account. We run into the issue described below when we try to allocate a CIDR in the shared IPAM pool.

To create our IPAM pool allocation we are using this snippet in our code:

resource "aws_vpc_ipam_pool_cidr_allocation" "vpc-ipam-pool-alloc-cidr-cf-subnet-infra" {
  count          = var.cf_subnet_infra_count
  ipam_pool_id   = var.ipam_pool_id
  netmask_length = 27
}

But immediately afterwards we get the following error:

Error: InvalidIpamPoolAllocationId.NotFound: The IPAM pool allocation (ipam-pool-alloc-0f1fe03456e174fea9c82affb5ee35e01) does not exist.
status code: 400, request id: 9683f21c-8972-4c40-8227-72f5c219e5d3

with aws_vpc_ipam_pool_cidr_allocation.vpc-ipam-pool-alloc-cidr-cf-subnet-infra[0],
on ipam_pool_allocations.tf line 1, in resource "aws_vpc_ipam_pool_cidr_allocation" "vpc-ipam-pool-alloc-cidr-cf-subnet-infra":
1: resource "aws_vpc_ipam_pool_cidr_allocation" "vpc-ipam-pool-alloc-cidr-cf-subnet-infra" {

When we run "aws ec2 get-ipam-pool-allocations --ipam-pool-id ipam-pool-0b325bb4efc6dacae", all IpamPoolAllocations are returned correctly.

We have not changed any of the Terraform code related to the IPAM pool allocations.
Three different people ran the setup while assuming the role "workload-terraform-role", and all of them hit the issue.
We also tried creating aws_vpc_ipam_pool_cidr_allocation in a different AWS account and ran into the same issue there.
On previous runs a few weeks or months ago, the same code created the aws_vpc_ipam_pool_cidr_allocation correctly without this error.

Relevant Error/Panic Output Snippet

No response

Terraform Configuration Files

resource "aws_vpc_ipam_pool_cidr_allocation" "vpc-ipam-pool-alloc-cidr-cf-subnet-infra" {
  count          = var.cf_subnet_infra_count
  ipam_pool_id   = var.ipam_pool_id
  netmask_length = 27
}

Steps to Reproduce

Create an IPAM pool
Try to allocate a CIDR in the pool

Debug Output

2023-01-16T10:54:34.077+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Action=GetIpamPoolAllocations&IpamPoolAllocationId=ipam-pool-alloc-0f1fe03456e174fea9c82affb5ee35e01&IpamPoolId=ipam-pool-0b325bb4efc6dacae&Version=2016-11-15
2023-01-16T10:54:34.077+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: -----------------------------------------------------
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: [DEBUG] [aws-sdk-go] DEBUG: Response ec2/GetIpamPoolAllocations Details:
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: ---[ RESPONSE ]--------------------------------------
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: HTTP/1.1 400 Bad Request
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Connection: close
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Transfer-Encoding: chunked
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Cache-Control: no-cache, no-store
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Content-Type: text/xml;charset=UTF-8
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Date: Mon, 16 Jan 2023 09:54:33 GMT
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Server: AmazonEC2
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Strict-Transport-Security: max-age=31536000; includeSubDomains
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Vary: accept-encoding
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: X-Amzn-Requestid: 9683f21c-8972-4c40-8227-72f5c219e5d3
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: 
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: 
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: -----------------------------------------------------
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: [DEBUG] [aws-sdk-go] <?xml version="1.0" encoding="UTF-8"?>
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: <Response><Errors><Error><Code>InvalidIpamPoolAllocationId.NotFound</Code><Message>The IPAM pool allocation (ipam-pool-alloc-0f1fe03456e174fea9c82affb5ee35e01) does not exist.</Message></Error></Errors><RequestID>9683f21c-8972-4c40-8227-72f5c219e5d3</RequestID></Response>
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: [DEBUG] [aws-sdk-go] DEBUG: Validate Response ec2/GetIpamPoolAllocations failed, attempt 0/25, error InvalidIpamPoolAllocationId.NotFound: The IPAM pool allocation (ipam-pool-alloc-0f1fe03456e174fea9c82affb5ee35e01) does not exist.
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5:     status code: 400, request id: 9683f21c-8972-4c40-8227-72f5c219e5d3

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

None

Alexander Barth (alexander.barth@mercedes-benz.com) on behalf of Mercedes-Benz Tech Innovation GmbH, Provider Information


github-actions · 2023-01-16T13:25:11Z

Community Note

Voting for Prioritization

Please vote on this issue by adding a 👍 reaction to the original post to help the community and maintainers prioritize this request.
Please see our prioritization guide for information on how we prioritize.
Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request.

Volunteering to Work on This Issue

If you are interested in working on this issue, please leave a comment.
If this would be your first contribution, please review the contribution guide.

Tailzip · 2023-01-17T09:53:52Z

We're seeing similar issues with that resource as well (using 4.50.0).
IPAM pool isn't shared with RAM in our case, all operations happen in the same AWS account.

First attempt

`plan`

# aws_vpc_ipam_pool_cidr_allocation.workload[0] will be created
+ resource "aws_vpc_ipam_pool_cidr_allocation" "workload" {
  + cidr                    = (known after apply)
  + description             = "some desc"
  + id                      = (known after apply)
  + ipam_pool_allocation_id = (known after apply)
  + ipam_pool_id            = "ipam-pool-xxxxxxxxxx"
  + netmask_length          = 25
  + resource_id             = (known after apply)
  + resource_owner          = (known after apply)
  + resource_type           = (known after apply)
}

`apply`

apply fails with the following error message; however, the AWS Console shows that the allocation was in fact created in the IPAM service.

Error: reading IPAM Pool CIDR Allocation (ipam-pool-alloc-xxxxxxxxxxx_ipam-pool-xxxxxxxxxx): couldn't find resource

Second attempt

`plan`

Resource is shown as tainted.

# aws_vpc_ipam_pool_cidr_allocation.workload[0] is tainted, so must be replaced
-/+ resource "aws_vpc_ipam_pool_cidr_allocation" "workload" {
  + cidr                    = (known after apply)
  ~ id                      = "ipam-pool-alloc-xxxxxxxxxxx_ipam-pool-xxxxxxxxxx" -> (known after apply)
  + ipam_pool_allocation_id = (known after apply)
  + resource_id             = (known after apply)
  + resource_owner          = (known after apply)
  + resource_type           = (known after apply)
    # (3 unchanged attributes hidden)
}

`apply`

Because the resource is tainted, it is being deleted, but that fails as well.

Error: deleting IPAM Pool CIDR Allocation (ipam-pool-alloc-xxxxxx_ipam-pool-xxxxx): InvalidParameterValue: The CIDR specified :  is not in proper format.

(^ not a typo in error message, a value is missing)

EDIT: We've opened a case with AWS support in the meantime, as we believe this is likely to be an issue with AWS IPAM service API rather than the provider. We were able to replicate the issue with AWS CLI as well.

bebold-jhr · 2023-01-17T16:02:53Z

We have the exact same problem as @Tailzip. We are referencing the cidr in a local. It seems that the local is being evaluated way too early. The terraform resource might indicate that the allocation is done, but it seems like it's still ongoing asynchronously in AWS.

We tried to remove the tainted resource and then import the resource, but that doesn't seem to work. Afterwards it showed us a completely new resource being created. Applying that will result in the mentioned error again ("couldn't find resource").
The import statement looks a little bit weird as well. The text says that the allocation id is used for the import, but the example only shows the resource.

EDIT: We verified the problem with 4.19.0, 4.46.0 and 4.48.0. It seems to have started last week as sporadic behavior, but it now happens consistently.

justinretzolk · 2023-01-17T19:40:26Z

Potentially related: #25300

vgadde-mck · 2023-01-17T21:23:17Z

Hi @Tailzip - Can you tell me the CLI steps to reproduce? When I run the following command
aws ec2 get-ipam-pool-allocations --ipam-pool-id POOL-ID --ipam-pool-allocation-id ALLOCATION-ID
I consistently get the correct result.

AWS will not provide support unless we show them that their CLI also fails.

Tailzip · 2023-01-18T08:33:55Z

> Hi @Tailzip - Can you tell me the CLI steps to reproduce. When I do the following aws ec2 get-ipam-pool-allocations --ipam-pool-id POOL-ID --ipam-pool-allocation-id ALLOCATION-ID I consistently get correct result.
>
> AWS will not support unless we show them that their CLI also fails

I've been running the following script, and the issue happens randomly after a couple of runs:

script.sh

#!/bin/bash

set -e

export AWS_REGION=eu-central-1
export AWS_DEFAULT_REGION=eu-central-1
export AWS_DEFAULT_OUTPUT=json

IPAM_POOL_ID="ipam-pool-xxxxxxxxxxxxxxxxx"
ALLOCATION_ID="$(aws ec2 allocate-ipam-pool-cidr --ipam-pool-id "$IPAM_POOL_ID" --netmask-length 25 --description 'troubleshoot' | jq -c -r '.IpamPoolAllocation.IpamPoolAllocationId')"

aws ec2 get-ipam-pool-allocations \
    --ipam-pool-id "$IPAM_POOL_ID" \
    --ipam-pool-allocation-id "$ALLOCATION_ID" \
    --no-cli-pager
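
Since the failure only shows up after a few runs, the script above can be looped until it first fails. The following is a minimal sketch of such a loop; `run_until_failure` and its run limit are hypothetical names introduced here for illustration and are not part of the original script.

```shell
# run_until_failure MAX_RUNS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND up to MAX_RUNS times and stops at the
# first non-zero exit. Prints which run failed and returns 1, or returns 0
# if every run succeeded.
run_until_failure() {
    local max_runs="$1"
    shift

    local run=1
    while [ "$run" -le "$max_runs" ]; do
        if ! "$@"; then
            echo "run $run failed"
            return 1
        fi
        run=$((run + 1))
    done

    echo "all $max_runs runs succeeded"
    return 0
}

# Usage (assumes script.sh from above is in the current directory):
#
#   run_until_failure 20 ./script.sh
```

Note that every run of script.sh allocates another /25 from the pool, so the allocations accumulate unless they are released afterwards.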

MiliDurasovic · 2023-01-18T12:36:17Z

> > Hi @Tailzip - Can you tell me the CLI steps to reproduce. When I do the following aws ec2 get-ipam-pool-allocations --ipam-pool-id POOL-ID --ipam-pool-allocation-id ALLOCATION-ID I consistently get correct result.
> > AWS will not support unless we show them that their CLI also fails
>
> I've been running the following script, and issue happens randomly after a couple runs
>
> script.sh
> #!/bin/bash
>
> set -e
>
> export AWS_REGION=eu-central-1
> export AWS_DEFAULT_REGION=eu-central-1
> export AWS_DEFAULT_OUTPUT=json
>
> IPAM_POOL_ID="ipam-pool-xxxxxxxxxxxxxxxxx"
> ALLOCATION_ID="$(aws ec2 allocate-ipam-pool-cidr --ipam-pool-id "$IPAM_POOL_ID" --netmask-length 25 --description 'troubleshoot' | jq -c -r '.IpamPoolAllocation.IpamPoolAllocationId')"
>
> aws ec2 get-ipam-pool-allocations \
>     --ipam-pool-id "$IPAM_POOL_ID" \
>     --ipam-pool-allocation-id "$ALLOCATION_ID" \
>     --no-cli-pager

Interesting. For me the script always runs through without any issues, while creation through Terraform still throws the error InvalidIpamPoolAllocationId.NotFound. The exact same IAM role was used in both cases.

Mili Durasovic mili.durasovic@mercedes-benz.com, Mercedes-Benz Tech Innovation GmbH
Provider Information

AdamTylerLynch · 2023-01-18T14:03:18Z

> We have the exact same problem as @Tailzip. We are referencing the cidr in a local. It seems that the local is being evaluated way too early. The terraform resource might indicate that the allocation is done, but it seems like it's still ongoing asynchronously in AWS.
>
> We tried to remove the tainted resource and then import the resource, but that doesn't seem to work. Afterwards it showed us a completely new resource being created. Applying that will result in the mentioned error again ("couldn't find resource"). The import statement looks a little bit weird as well. The text says that the allocation id is used for the import, but the example only shows the resource.
>
> EDIT: We verified the problem with 4.19.0, 4.46.0 and 4.48.0. It seems that it started last week as sporadic behavior, but now this is a constant behavior.

@bebold-jhr can you please log a separate GitHub issue as a bug regarding your import observations? Thank you.

bebold-jhr · 2023-01-18T14:42:02Z

@AdamTylerLynch I will do that. I just thought it was worth mentioning that using import was not a workaround for us.

AdamTylerLynch · 2023-01-18T14:54:55Z

Appears to be related to #25300

Tailzip · 2023-01-18T16:17:29Z

Got confirmation from IPAM service team via AWS Enterprise support that:

GetIpamPoolAllocations is eventually consistent with respect to AllocateIpamPoolCidr. However, AllocateIpamPoolCidr is strongly consistent with respect to other calls to AllocateIpamPoolCidr -- IPAM will not hand out overlapping space within a pool, even if customers call AllocateIpamPoolCidr several times simultaneously on the pool. The API is eventually consistent. We recommended retrying/waiting a couple seconds.

I guess we need #25300 to be resolved then 😄
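
On the CLI side, the retry-and-wait recommendation quoted above can be sketched as a small polling wrapper. This is only a sketch under stated assumptions, not the provider's fix: `retry_until_success`, the attempt count and the delay are hypothetical, and it assumes the CLI exits non-zero while the allocation is not yet visible, as the 400 response in the debug output suggests.

```shell
# retry_until_success MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until it exits 0 or the attempt
# budget is used up. Returns 0 on success, 1 if every attempt failed.
retry_until_success() {
    local max_attempts="$1"
    local delay_seconds="$2"
    shift 2

    local attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Wait before the next attempt, but not after the last one.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay_seconds"
        fi
        attempt=$((attempt + 1))
    done

    return 1
}

# Usage (assumes IPAM_POOL_ID and ALLOCATION_ID are set as in script.sh):
#
#   retry_until_success 10 2 \
#       aws ec2 get-ipam-pool-allocations \
#           --ipam-pool-id "$IPAM_POOL_ID" \
#           --ipam-pool-allocation-id "$ALLOCATION_ID" \
#           --no-cli-pager
```

A fixed delay keeps the sketch short; the values 10 and 2 are placeholders picked to match the "couple seconds" guidance rather than measured numbers.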

AdamTylerLynch · 2023-01-18T22:53:04Z

> Hi @Tailzip - Can you tell me the CLI steps to reproduce. When I do the following aws ec2 get-ipam-pool-allocations --ipam-pool-id POOL-ID --ipam-pool-allocation-id ALLOCATION-ID I consistently get correct result.
>
> AWS will not support unless we show them that their CLI also fails

A quick clarification, AWS Enterprise Support does offer Third-Party Product support, including open source software such as Terraform. I agree that having a reproducible case in a script using the AWS CLI is certainly helpful, though not required.

AWS works with HashiCorp and the open source community to evaluate and prioritize issues as per the Terraform AWS Provider FAQ.

JonasWieneke · 2023-01-19T09:40:18Z

Further testing yielded the following results:

Test Case 1:
The following snippet was executed in another account with which the IPAM pool was shared; the region for execution was eu-central-1:

resource "aws_vpc_ipam_pool_cidr_allocation" "ipam-test-allocation" {
  count          = 3
  ipam_pool_id   = var.ipam_shared_pool_172_id
  netmask_length = var.lb_subnet_netmask
}

resource "aws_vpc_ipam_pool_cidr_allocation" "ipam-test-allocation_2" {
  count          = 3
  ipam_pool_id   = var.ipam_shared_pool_172_id
  netmask_length = var.lb_subnet_netmask
}

The first run of the code fails for some of the resources. Each further run creates more of the resources successfully. It usually takes between 3 and 6 consecutive attempts to fully create all six resources.

Test Case 2:
The following snippet was executed with a new set of accounts. An IPAM pool is shared with the account running this script; the region for execution was eu-west-1:

resource "aws_vpc_ipam_pool_cidr_allocation" "ipam-test-allocation" {
  count          = 3
  ipam_pool_id   = var.ipam_shared_pool_172_id
  netmask_length = var.lb_subnet_netmask
}

resource "aws_vpc_ipam_pool_cidr_allocation" "ipam-test-allocation_2" {
  count          = 3
  ipam_pool_id   = var.ipam_shared_pool_172_id
  netmask_length = var.lb_subnet_netmask
}

This script ran fine and produced no errors at all.

From my point of view this could be a timing problem in the resourceIPAMPoolCIDRAllocationCreate function. It seems that the resource is already being read before the creation is fully completed.

Jonas Wieneke <jonas.wieneke@mercedes-benz.com>, Mercedes-Benz Tech Innovation GmbH
Provider Information

AdamTylerLynch · 2023-01-19T18:59:06Z

I can successfully reproduce this in an acceptance test (AccTest). Working on a fix.

kevinkupski · 2023-01-19T19:11:29Z

> I can successfully reproduce in an AccTests. Working on a fix.

I also just started on this issue and added this code block to ipam_pool_cidr_allocation.go:

	// Handle eventual consistency of the API by retrying the read for up to one minute.
	return resource.Retry(time.Minute, func() *resource.RetryError {
		err = resourceIPAMPoolCIDRAllocationRead(d, meta)

		if err != nil {
			// A not-found result means the allocation is not visible yet, so retry.
			if tfresource.NotFound(err) {
				return resource.RetryableError(fmt.Errorf("IPAM Pool CIDR Allocation (%s) not yet ready", d.Id()))
			}
			// Any other error is final.
			return resource.NonRetryableError(err)
		}

		return nil
	})

We need this change urgently. Will you be working on this within the next few days, or should I open a PR? If the latter, could you share your test code?

AdamTylerLynch · 2023-01-20T19:54:16Z

Hello Kevin, thanks for putting the effort in for the sample code! In the provider we have mechanisms for retries and waiting (retries and waiters), and our PR guidelines suggest that we follow any existing patterns in the resource being modified.

I have added the mechanisms for retry and waiting to account for eventual consistency of the read operation, and I've added additional acceptance tests to verify cross-region pool CIDR allocation.

I assure you this is being worked on. The provider team does releases each Thursday.

github-actions · 2023-01-27T06:15:23Z

This functionality has been released in v4.52.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!
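
To check whether a workspace already uses a provider version that contains this fix, the selected version can be compared against 4.52.0. The following is a minimal sketch: `version_at_least` is a hypothetical helper, it relies on `sort -V` (available in GNU coreutils; an assumption on other platforms), and the `jq` path in the usage comment assumes the `provider_selections` layout of `terraform version -json` in an initialized working directory.

```shell
# version_at_least CURRENT MINIMUM
# Hypothetical helper: returns 0 when CURRENT is greater than or equal to
# MINIMUM under version ordering, and 1 otherwise.
version_at_least() {
    local current="$1"
    local minimum="$2"

    # Sort both versions; if MINIMUM sorts first (or they are equal),
    # CURRENT is at least MINIMUM.
    [ "$(printf '%s\n%s\n' "$minimum" "$current" | sort -V | head -n 1)" = "$minimum" ]
}

# Usage (assumes terraform and jq are installed and `terraform init` has run):
#
#   aws_version="$(terraform version -json \
#       | jq -r '.provider_selections["registry.terraform.io/hashicorp/aws"]')"
#   if version_at_least "$aws_version" "4.52.0"; then
#       echo "provider includes the eventual-consistency fix"
#   else
#       echo "provider is older than 4.52.0"
#   fi
```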

github-actions · 2023-02-27T02:18:24Z

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

AlexBarth13 added bug Addresses a defect in current functionality. needs-triage Waiting for first response or review from a maintainer. labels Jan 16, 2023

github-actions bot added the service/ipam Issues and PRs that pertain to the ipam service. label Jan 16, 2023

justinretzolk added eventual-consistency Pertains to eventual consistency issues. and removed needs-triage Waiting for first response or review from a maintainer. labels Jan 17, 2023

bebold-jhr mentioned this issue Jan 18, 2023

[Bug]: Import for aws_vpc_ipam_pool_cidr_allocation had no effect #28955

Closed

AdamTylerLynch self-assigned this Jan 18, 2023

AdamTylerLynch mentioned this issue Jan 20, 2023

IPAM Pool CIDR Allocation Eventual Consistency #29022

Merged

jar-b closed this as completed in #29022 Jan 23, 2023

github-actions bot added this to the v4.52.0 milestone Jan 23, 2023

AdamTylerLynch mentioned this issue Jan 23, 2023

[Bug]: aws_vpc_ipam_pool_cidr_allocation returns inconsistent results #29045

Closed

github-actions bot locked as resolved and limited conversation to collaborators Feb 27, 2023
