Keep the same EIP on the instances #102

Closed
nitrocode opened this issue May 26, 2024 · 8 comments

Comments

@nitrocode
Contributor

nitrocode commented May 26, 2024

It seems possible, with some Lambda hackery, to keep the same set of static EIPs attached to the instances in the autoscaling group.

For example, if the ASG needs to rotate an instance due to max refresh:

  • asg scales from 3 to 4 instances
  • wait for new ec2 to be healthy
  • lambda associates instance-to-be-deleted's ip to the new ec2
  • asg then destroys the older instance
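
The hand-over step in the list above could be sketched roughly as follows. This is only an illustration, not something alternat ships: `move_eips` is a hypothetical helper, and `ec2` is assumed to be a boto3 EC2 client (or any object exposing the same `describe_addresses` and `associate_address` calls). It does not wait for the new instance to be healthy and does not touch the route table.

```python
def move_eips(ec2, old_instance_id, new_instance_id):
    """Re-associate every EIP found on old_instance_id with new_instance_id.

    Assumption: `ec2` behaves like a boto3 EC2 client. Returns the list of
    allocation IDs that were moved.
    """
    # Look up the EIPs currently attached to the instance being replaced.
    resp = ec2.describe_addresses(
        Filters=[{"Name": "instance-id", "Values": [old_instance_id]}]
    )
    moved = []
    for addr in resp.get("Addresses", []):
        # AllowReassociation lets the call succeed even though the EIP is
        # still associated with the old instance.
        ec2.associate_address(
            AllocationId=addr["AllocationId"],
            InstanceId=new_instance_id,
            AllowReassociation=True,
        )
        moved.append(addr["AllocationId"])
    return moved
```

In a real Lambda this would presumably be triggered from an ASG lifecycle hook, with the instance IDs taken from the event; that wiring is left out here.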

Could this work?

Would this also avoid needing to update the route table every time?

Would this still drop existing connections?

References

@eddycek
Contributor

eddycek commented May 27, 2024

@nitrocode

If the Elastic IP address is already associated with a different instance, it is disassociated from that instance and associated with the specified instance.

I think it would have to wait anyway, for all open connections. This means that you would have to have some kind of network check that would wait until all open connections are closed on the old instance.
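
A rough sketch of what such a network check might look like, assuming it runs on the old NAT instance and is fed lines in the style of `conntrack -L` or `/proc/net/nf_conntrack` (where TCP entries carry a `tcp` token and a state token such as `ESTABLISHED`). The function names, the timeout and the polling interval are all made up for illustration.

```python
import time


def count_established(conntrack_lines):
    """Count TCP flows in ESTABLISHED state in conntrack-style lines."""
    count = 0
    for line in conntrack_lines:
        fields = line.split()
        if "tcp" in fields and "ESTABLISHED" in fields:
            count += 1
    return count


def wait_for_drain(read_lines, timeout_s=300.0, poll_s=5.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until no ESTABLISHED flows remain, or give up at the timeout.

    `read_lines` is a caller-supplied function returning the current
    conntrack lines. Returns True if drained, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if count_established(read_lines()) == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

The timeout matters because long-lived connections may never close on their own, so the caller still has to decide what to do when this returns `False`.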

@nitrocode
Contributor Author

nitrocode commented May 27, 2024

Ya that's true. It would need connection draining and I don't believe there is a way to do this in ec2 natively or easily.

It's possible the AWS Gateway Load Balancer might be useful for draining (#30 (comment)). Then the EC2s may not even need public IPs and the Lambda wouldn't need to update the route table. The load balancer IPs would be static while the EC2s behind it scale up and down.

@bwhaley
Member

bwhaley commented May 31, 2024

Are you suggesting that the pool of EIPs are static within an AZ? What is the benefit of doing this?

Would this also avoid needing to update the route table every time?

The route table would still need to be updated. That's unrelated to the EIP.

Would this still drop existing connections?

Connections are managed by the kernel on the NAT instance. When the instance dies, the connections are lost. Clients may still have an open connection to that target EIP, but the NAT instance doesn't know about those connections.

I think conntrackd could help to keep the connections alive, but I've not yet tested it.

@nitrocode
Contributor Author

The original

Are you suggesting that the pool of EIPs are static within an AZ? What is the benefit of doing this?

This way you don't need the Lambda to update the route table. Instead, you can create the EIPs and update the route table only once in Terraform. However, you would then probably need a Lambda to attach the pool of EIPs to the EC2s as they come up.
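
To make the "attach from a pool" idea concrete, here is a minimal sketch. It assumes `ec2` is a boto3 EC2 client and that the pool is a list of allocation IDs created ahead of time (for example by Terraform); `claim_free_eip` is a hypothetical name. It only covers the EIP attachment and says nothing about the route table.

```python
def claim_free_eip(ec2, pool_allocation_ids, instance_id):
    """Attach the first unassociated EIP in the pool to instance_id.

    Assumption: `ec2` behaves like a boto3 EC2 client. Returns the
    allocation ID that was attached, or None if the pool is exhausted.
    """
    resp = ec2.describe_addresses(AllocationIds=list(pool_allocation_ids))
    for addr in resp.get("Addresses", []):
        # An address that is in use carries an AssociationId.
        if "AssociationId" not in addr:
            # AllowReassociation=False makes the call fail rather than
            # steal the EIP if another instance claimed it in the meantime.
            ec2.associate_address(
                AllocationId=addr["AllocationId"],
                InstanceId=instance_id,
                AllowReassociation=False,
            )
            return addr["AllocationId"]
    return None
```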


I haven't looked at conntrackd. Looks like it has some potential.


What about the gateway load balancer #102 (comment)? This way you would direct the routes to the static ips of the lb and then the nat instances wouldn't need an eip, no?

@ekhaydarov

I don't get it. Does this not achieve what you would like? We pass in the allocation IDs of the EIPs we have reserved.

From my understanding the ASG never scales? It's static at 1 instance per AZ? I may be off, as I am trying to understand how to deploy this properly with slightly different specs.

@nitrocode
Contributor Author

Thanks. I did not see that var before.

The instances need to be rotated to update the AMI, or can be rotated automatically if an instance refresh is set. There are also times when AWS will shut down an instance for maintenance. All these events can impact the NAT uptime.

My hope is the gateway load balancer can simplify things but I have not tried it yet.

@ekhaydarov

ekhaydarov commented Jun 3, 2024

For all the above scenarios, the Lambdas will simply route to a fallback standby AWS NAT Gateway, for which you can also set static IPs. The issue is that when your alternat instance is back up and running with the same IP, there is no mechanism to reroute traffic back to alternat, so you have to do it manually yourself. There's an issue open for it.
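
The manual reroute could, as a sketch, look something like this. It assumes a boto3 EC2 client and that the private route tables send `0.0.0.0/0` through the NAT; `route_back_to_instance` is a hypothetical helper and not something alternat provides.

```python
def route_back_to_instance(ec2, route_table_ids, instance_id,
                           cidr="0.0.0.0/0"):
    """Point the default route of each route table back at the NAT instance.

    Assumption: `ec2` behaves like a boto3 EC2 client and each table
    already has a route for `cidr` (currently targeting the NAT Gateway).
    Returns the list of route table IDs that were updated.
    """
    updated = []
    for rtb_id in route_table_ids:
        ec2.replace_route(
            RouteTableId=rtb_id,
            DestinationCidrBlock=cidr,
            InstanceId=instance_id,
        )
        updated.append(rtb_id)
    return updated
```

Whether the instance is actually healthy again before switching back is left to the caller.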

For all of this, it definitely seems like there is a cleaner way to do it, but it's over my head for now.

@bwhaley
Member

bwhaley commented Jun 3, 2024

The ask is still not clear for me on this one. We don't have any plans to explore Gateway Load Balancer as another solution. I'm open to taking a peek at it if somebody wants to do the leg work. There is an hourly cost and a GLCU cost which is rated based on connections and data volume, but this seems significantly (order of magnitude) less than NAT Gw, so it may still be worthwhile.

If you'd like to discuss the GWLB solution more, please open a PR with the proposed change and some testing steps so we can take a look. Going to close this one for now though as I don't see anything else actionable here.

@bwhaley bwhaley closed this as completed Jun 3, 2024