resource/aws_route: route is not saved in the state when it fails to be available (in 10m create timeout) #23827

Closed
Riinkesh opened this issue Mar 23, 2022 · 4 comments · Fixed by #24024
Labels
bug Addresses a defect in current functionality. service/ec2 Issues and PRs that pertain to the ec2 service.

Comments

@Riinkesh

Riinkesh commented Mar 23, 2022

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform AWS Provider Version

terraform version - 1.5.1
provider-aws version - 4.6.0

Affected Resource(s)

  • aws_route

Terraform Configuration Files

Please include all Terraform configurations required to reproduce the bug. Bug reports without a functional reproduction may be closed without investigation.

	resource "aws_ec2_transit_gateway_vpc_attachment" "tgw-attachment" {
	  subnet_ids         = var.app_subnets
	  transit_gateway_id = var.tgw_id
	  vpc_id             = var.vpc_id
	
	  tags = {
	    Name = "${var.vpc_name}-tgw-attachment"
	  }
	}
	#########TGW VPC Routes########################################################
	locals {
	  app_route_table_routes = setproduct(var.app_route_table_id, var.app_rt_tgw_destination_cidr_blocks[var.account_type])
	}
	
	resource "aws_route" "app_rt_assn" {
	  count                  = (length(var.app_route_table_id) * length(var.app_rt_tgw_destination_cidr_blocks[var.account_type]))
	  route_table_id         = element(local.app_route_table_routes, count.index)[0]
	  destination_cidr_block = element(local.app_route_table_routes, count.index)[1]
	  transit_gateway_id     = var.tgw_id
	  depends_on             = [aws_ec2_transit_gateway_vpc_attachment.tgw-attachment]
	  timeouts {
	    create = var.route_addition_timeout
	  }
	}

Debug Output

NA

Panic Output

NA

Expected Behavior

I expect terraform-provider-aws to save the route in the state and then wait for it to become ready. If the route does not become available within the create timeout (10m), terraform apply should fail, but the route should still be saved in the state so that nothing is lost or leaked.

Actual Behavior

On the first run, Terraform starts creating the routes in batches (let's say 8 of the 26 routes at a time) and later fails to add a route, even with a long create timeout.
The error message is also misleading:

  • 'Error waiting for Route in Route Table (rtb-05738f2552a4b7543) with destination (0.0.0.0/0) to become available: couldn't find resource (1001 retries)'

On the next plan and apply, Terraform gives the error below:

  • 'error creating Route in Route Table (rtb-05738f2552a4b7543) with destination (0.0.0.0/0): RouteAlreadyExists: The route identified by 0.0.0.0/0 already exists. status code: 400, request id: yyyyyyyy'

Steps to Reproduce

  1. terraform apply

Important Factoids

NA

@github-actions github-actions bot added needs-triage Waiting for first response or review from a maintainer. service/ec2 Issues and PRs that pertain to the ec2 service. labels Mar 23, 2022
@ewbankkit ewbankkit added bug Addresses a defect in current functionality. and removed needs-triage Waiting for first response or review from a maintainer. labels Mar 23, 2022
@ewbankkit
Contributor

	log.Printf("[DEBUG] Creating Route: %s", input)
	_, err = tfresource.RetryWhenAWSErrCodeEquals(
		d.Timeout(schema.TimeoutCreate),
		func() (interface{}, error) {
			return conn.CreateRoute(input)
		},
		ErrCodeInvalidParameterException,
		ErrCodeInvalidTransitGatewayIDNotFound,
	)
	if err != nil {
		return fmt.Errorf("error creating Route in Route Table (%s) with destination (%s): %w", routeTableID, destination, err)
	}
	_, err = WaitRouteReady(conn, routeFinder, routeTableID, destination, d.Timeout(schema.TimeoutCreate))
	if err != nil {
		return fmt.Errorf("error waiting for Route in Route Table (%s) with destination (%s) to become available: %w", routeTableID, destination, err)
	}
	d.SetId(RouteCreateID(routeTableID, destination))

d.SetId(...) should be before the call to WaitRouteReady.
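
To illustrate why the ordering matters, here is a minimal, self-contained sketch. It does not use the provider or the plugin SDK: `resourceData`, `waiter`, `routeID`, `createRouteIDLast`, and `createRouteIDFirst` are all hypothetical stand-ins invented for this example, and the CreateRoute API call is assumed to have succeeded. As far as I understand the SDK, a resource whose ID is set when Create returns an error is still written to state, which is the behavior the sketch relies on.

```go
package main

import (
	"errors"
	"fmt"
)

// resourceData is a hypothetical stand-in for *schema.ResourceData.
// It only tracks the ID that would end up in state.
type resourceData struct{ id string }

func (d *resourceData) SetId(id string) { d.id = id }
func (d *resourceData) Id() string      { return d.id }

// waiter stands in for WaitRouteReady. Passing it as a value lets the
// caller simulate a timeout without touching AWS.
type waiter func(routeTableID, destination string) error

var errTimeout = errors.New("couldn't find resource (simulated timeout)")

// routeID is a hypothetical ID format, not the provider's RouteCreateID.
func routeID(routeTableID, destination string) string {
	return routeTableID + "_" + destination
}

// createRouteIDLast mirrors the ordering quoted above: the ID is set only
// after the wait succeeds, so a failed wait leaves the ID empty.
func createRouteIDLast(d *resourceData, rtb, dest string, wait waiter) error {
	// The CreateRoute call would happen here; assume it succeeded.
	if err := wait(rtb, dest); err != nil {
		return fmt.Errorf("error waiting for Route in Route Table (%s) with destination (%s) to become available: %w", rtb, dest, err)
	}
	d.SetId(routeID(rtb, dest))
	return nil
}

// createRouteIDFirst applies the suggested ordering: the ID is set as soon
// as the route exists, so it survives a failed wait.
func createRouteIDFirst(d *resourceData, rtb, dest string, wait waiter) error {
	// The CreateRoute call would happen here; assume it succeeded.
	d.SetId(routeID(rtb, dest))
	if err := wait(rtb, dest); err != nil {
		return fmt.Errorf("error waiting for Route in Route Table (%s) with destination (%s) to become available: %w", rtb, dest, err)
	}
	return nil
}

func main() {
	alwaysTimeout := func(string, string) error { return errTimeout }

	last, first := &resourceData{}, &resourceData{}
	errLast := createRouteIDLast(last, "rtb-example", "0.0.0.0/0", alwaysTimeout)
	errFirst := createRouteIDFirst(first, "rtb-example", "0.0.0.0/0", alwaysTimeout)

	fmt.Printf("ID set last:  failed=%t id=%q\n", errLast != nil, last.Id())
	fmt.Printf("ID set first: failed=%t id=%q\n", errFirst != nil, first.Id())
}
```

Both variants return the wait error, but only the second keeps an ID. With an empty ID the route exists in AWS but not in state, which would explain the RouteAlreadyExists error on the next apply.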

@gmichelo
Contributor

gmichelo commented Apr 2, 2022

@ewbankkit, I'd like to work on this. The fix itself seems straightforward, but I am not sure how we can write an acceptance test for it. I was trying to work out whether we could mock WaitRouteReady somehow so that it returns an error. Any suggestions?

@ewbankkit
Contributor

Just running the current set of acceptance tests should suffice.

gmichelo added a commit to gmichelo/terraform-provider-aws that referenced this issue Apr 4, 2022
When the route's wait-for-creation operation times out,
its state is leaked as `d.SetId(...)` is called after the
`WaitRouteReady(...)`. So, the next time the configuration
is applied, the Terraform engine tries to create the same
route again, getting duplication error from AWS.

Call `d.SetId(...)` before `WaitRouteReady(...)`.

Fixes: hashicorp#23827
gmichelo added a commit to gmichelo/terraform-provider-aws that referenced this issue Apr 4, 2022
Issue:
When the route's wait-for-creation operation times out,
its state is leaked as `d.SetId(...)` is called after the
`WaitRouteReady(...)`. So, the next time the configuration
is applied, the Terraform engine tries to create the same
route again, getting duplication error from AWS.

Fix:
Call `d.SetId(...)` before `WaitRouteReady(...)`.

Fixes: hashicorp#23827
@github-actions

github-actions bot commented May 6, 2022

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 6, 2022