Update Iceberg metadata in case of DR #5779
Comments
Just saw this thread: #1617
Can we use the migrate_table procedure for this, to specify the s3 path that points to the destination location?
@asheeshgarg, will s3 access points work for your Iceberg use case?
@singhpk234 s3 access points are still region-specific. Access point ARNs use the format arn:aws:s3:region:account-id:accesspoint/resource
The metadata files will still point to my-bucket1 (the actual s3 path), but when making s3 requests via Iceberg (GET + PUT), the my-bucket1 path is replaced by the access point. The access point then takes care of replication across the configured buckets and chooses the best available low-latency bucket behind it.
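To make the substitution described above concrete, here is a minimal sketch of the kind of rewrite that happens at request time: the table metadata keeps the original bucket in its paths, and only the hostname used for the actual s3 call is swapped for an access-point alias. The bucket name and alias below are hypothetical, and this is an illustration of the idea, not Iceberg's actual FileIO code.

```python
from urllib.parse import urlparse

# Hypothetical values; real access-point aliases are generated by AWS.
BUCKET = "my-bucket1"
ACCESS_POINT_ALIAS = "my-ap-alias-s3alias"

def rewrite_for_access_point(s3_uri: str, bucket: str, alias: str) -> str:
    """Swap the bucket in an s3 URI for an access-point alias.

    Mimics the substitution made when issuing GET/PUT requests:
    metadata files keep the original s3://my-bucket1/... paths,
    while requests go through the access point.
    """
    parsed = urlparse(s3_uri)
    if parsed.netloc != bucket:
        return s3_uri  # different bucket: leave the URI untouched
    return f"s3://{alias}{parsed.path}"

uri = "s3://my-bucket1/warehouse/db/t/metadata/v3.metadata.json"
print(rewrite_for_access_point(uri, BUCKET, ACCESS_POINT_ALIAS))
# → s3://my-ap-alias-s3alias/warehouse/db/t/metadata/v3.metadata.json
```

The key property is that nothing in the metadata files changes; only the endpoint each request is sent to differs.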
@singhpk234 So, just to understand it correctly: we would define two buckets for cross-region?
Yes, if you map both buckets (in different regions) to a multi-region access point. You can also refer to this slack thread, where this idea originated: https://apache-iceberg.slack.com/archives/C025PH0G1D4/p1645066803099319
No. Let's say your table path is under mybucket1. If you use a multi-region access point pointing to mybucket1 and mybucket2, it acts as a proxy and a single global hostname between the two, and internally routes each request to the location with the lowest latency. More about access points here: https://aws.amazon.com/s3/features/multi-region-access-points/
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.
Query engine
Spark
Question
Let's say we have a DR situation where we would like to bring up the Iceberg metadata and data copied to a DR location. Since s3 buckets share a global namespace, we will have different bucket names in the DR location.
How do we rewrite the metadata so that it points to the correct DR s3 location? Is there any util or Spark procedure for it?
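Absent a built-in procedure, one common workaround is a naive prefix rewrite over the table's metadata.json. The sketch below (with a made-up sample document, not a real table) replaces the production bucket prefix with the DR one; note that a complete solution would also have to rewrite manifest lists and manifest files (Avro), which a plain string replacement on the JSON does not cover.

```python
import json

def rewrite_metadata_locations(metadata_text: str,
                               old_prefix: str,
                               new_prefix: str) -> str:
    """Naive sketch: replace every occurrence of the old s3 prefix
    (e.g. s3://prod-bucket) with the DR prefix in a metadata.json.

    Caveat: snapshots reference manifest-list and manifest files
    (Avro) that embed absolute paths too; those need rewriting as
    well for the table to be fully usable from the DR bucket.
    """
    json.loads(metadata_text)            # validate input is well-formed JSON
    rewritten = metadata_text.replace(old_prefix, new_prefix)
    json.loads(rewritten)                # sanity-check the result still parses
    return rewritten

# Hypothetical, heavily trimmed metadata document for illustration.
sample = json.dumps({
    "location": "s3://prod-bucket/warehouse/db/t",
    "snapshots": [
        {"manifest-list":
         "s3://prod-bucket/warehouse/db/t/metadata/snap-1.avro"}
    ],
})
out = rewrite_metadata_locations(sample, "s3://prod-bucket", "s3://dr-bucket")
print(json.loads(out)["location"])
# → s3://dr-bucket/warehouse/db/t
```

The access-point approach discussed in the comments avoids this rewriting entirely, which is why it was suggested as an alternative.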