Keep the same EIP on the instances #102

nitrocode · 2024-05-26T14:33:54Z

It seems possible, with some Lambda hackery, to keep the same set of static EIPs attached to the instances in the autoscaling group.

For example, if the ASG needs to rotate an instance due to max refresh:

1. ASG scales from 3 to 4 instances
2. Wait for the new EC2 instance to be healthy
3. Lambda associates the instance-to-be-deleted's IP with the new EC2 instance
4. ASG then destroys the older instance
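
The reassociation step above could be sketched roughly as follows. This is only an illustration, not code from this project: `move_eips` is a hypothetical helper, and `ec2` is assumed to be a boto3 EC2 client (for example `boto3.client("ec2")`) passed in by the caller. It relies on `describe_addresses` and `associate_address` with `AllowReassociation=True`.

```python
def move_eips(ec2, old_instance_id, new_instance_id):
    """Reassociate every EIP found on old_instance_id to new_instance_id.

    `ec2` is assumed to be a boto3 EC2 client. Returns the allocation IDs
    that were moved. Hypothetical helper, for illustration only.
    """
    # Find the EIPs currently associated with the instance being replaced.
    resp = ec2.describe_addresses(
        Filters=[{"Name": "instance-id", "Values": [old_instance_id]}]
    )

    moved = []
    for addr in resp["Addresses"]:
        # AllowReassociation=True lets the call take the EIP away from
        # the old instance instead of failing.
        ec2.associate_address(
            AllocationId=addr["AllocationId"],
            InstanceId=new_instance_id,
            AllowReassociation=True,
        )
        moved.append(addr["AllocationId"])
    return moved
```

The moment the EIP moves, traffic on existing connections through the old instance loses its public address, so this alone does not answer the dropped-connections question.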

Could this work?

Would this also avoid needing to update the route table every time?

Would this still drop existing connections?

eddycek · 2024-05-27T09:25:47Z

@nitrocode

> If the Elastic IP address is already associated with a different instance, it is disassociated from that instance and associated with the specified instance.

I think it would have to wait for all open connections anyway. You would need some kind of network check that waits until all open connections on the old instance are closed.
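
One way such a check might look is a polling loop on the old instance. This is a rough sketch under assumptions: it reads the kernel's conntrack counter at `/proc/sys/net/netfilter/nf_conntrack_count` (present only when the conntrack module is loaded), and the names `tracked_connections` and `wait_for_drain` are hypothetical. The counter includes all tracked flows, not just established client connections, so a non-zero `threshold` is probably needed in practice.

```python
import time

# Assumption: conntrack is loaded on the NAT instance, so this file exists.
CONNTRACK_COUNT = "/proc/sys/net/netfilter/nf_conntrack_count"


def tracked_connections(path=CONNTRACK_COUNT):
    """Return the number of flows currently tracked by the kernel."""
    with open(path) as f:
        return int(f.read().strip())


def wait_for_drain(count=tracked_connections, threshold=0, timeout=300.0,
                   interval=5.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until count() <= threshold; return False if timeout elapses first."""
    deadline = clock() + timeout
    while True:
        if count() <= threshold:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Long-lived connections may never close on their own, so the timeout decides how long the old instance is kept around before it is terminated regardless.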

nitrocode · 2024-05-27T13:54:00Z

Ya that's true. It would need connection draining and I don't believe there is a way to do this in EC2 natively or easily.

It's possible the AWS Gateway Load Balancer might be useful for draining (#30 (comment)). Then the EC2 instances may not even need public IPs and the Lambda wouldn't need to update the route table. The load balancer IPs would be static while the EC2 instances behind it scale up and down.

bwhaley · 2024-05-31T22:43:48Z

Are you suggesting that the pool of EIPs is static within an AZ? What is the benefit of doing this?

> Would this also avoid needing to update the route table every time?

The route table would still need to be updated. That's unrelated to the EIP.

> Would this still drop existing connections?

Connections are managed by the kernel on the NAT instance. When the instance dies, the connections are lost. Clients may still have an open connection to that target EIP, but the NAT instance doesn't know about those connections.

I think conntrackd could help to keep the connections alive, but I've not yet tested it.

nitrocode · 2024-06-01T14:33:52Z

The original

> Are you suggesting that the pool of EIPs is static within an AZ? What is the benefit of doing this?

This way you don't need the Lambda to update the route table. Instead, you can create the EIPs and update the route table only once in Terraform. However, you would then probably need a Lambda to assign the pool of EIPs to the EC2 instances as they come up.
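
A Lambda that hands out EIPs from a fixed pool might look something like the sketch below. It is illustrative only: `claim_free_eip` is a hypothetical helper, `ec2` is assumed to be a boto3 EC2 client, and `allocation_ids` is assumed to be the list of pre-created EIP allocation IDs (for example, ones created in Terraform).

```python
def claim_free_eip(ec2, allocation_ids, instance_id):
    """Associate the first unassociated EIP in the pool with instance_id.

    `ec2` is assumed to be a boto3 EC2 client. Returns the allocation ID
    that was claimed, or None if every EIP in the pool is already in use.
    Hypothetical helper, for illustration only.
    """
    resp = ec2.describe_addresses(AllocationIds=list(allocation_ids))
    for addr in resp["Addresses"]:
        # An address with no AssociationId is not attached to anything.
        if "AssociationId" not in addr:
            # AllowReassociation=False makes the call fail instead of
            # stealing the EIP if another instance claimed it meanwhile.
            ec2.associate_address(
                AllocationId=addr["AllocationId"],
                InstanceId=instance_id,
                AllowReassociation=False,
            )
            return addr["AllocationId"]
    return None
```

During a rolling replacement the pool is fully in use until an old instance releases its address, so a caller would need to handle the `None` case, for example by retrying later.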

I haven't looked at conntrackd. Looks like it has some potential.

What about the Gateway Load Balancer #102 (comment)? This way you would direct the routes to the static IPs of the LB, and then the NAT instances wouldn't need an EIP, no?

ekhaydarov · 2024-06-03T07:23:51Z

I don't get it. Does this not achieve what you would like? We pass in the allocation IDs of the EIPs we have reserved.

From my understanding the ASG never scales? It's static at 1 instance per AZ? I may be off, as I am trying to understand how to deploy this properly with slightly different specs.

nitrocode · 2024-06-03T07:31:10Z

Thanks. I did not see that var before.

The instances need to be rotated to update the AMI, or can be rotated automatically if an instance refresh is set. There are times when AWS will also shut down an instance for maintenance. All these events can impact NAT uptime.

My hope is the gateway load balancer can simplify things but I have not tried it yet.

ekhaydarov · 2024-06-03T08:19:41Z

For all the above scenarios the Lambdas will simply route to a fallback standby AWS NAT Gateway, for which you can also set static IPs. The issue is that when your alternat instance is back up and running with the same IP, there is no mechanism to reroute traffic back to alternat, so you have to do it manually. There's an issue open for it.

There definitely seems to be a cleaner way to do all of this, but it's over my head for now.

bwhaley · 2024-06-03T21:49:05Z

The ask is still not clear to me on this one. We don't have any plans to explore Gateway Load Balancer as another solution. I'm open to taking a peek at it if somebody wants to do the legwork. There is an hourly cost and a GLCU cost, which is rated based on connections and data volume, but this seems significantly (an order of magnitude) less than NAT Gateway, so it may still be worthwhile.

If you'd like to discuss the GWLB solution more, please open a PR with the proposed change and some testing steps so we can take a look. Going to close this one for now though as I don't see anything else actionable here.

bwhaley closed this as completed Jun 3, 2024
