IcebergMetadataRewrite

This project requires Hive running on the local machine; you can start Hive with the Docker image in the hiveDocker directory. The project rewrites Iceberg metadata: it accepts the table base path and updates it in all metadata files.

Problem

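Suppose you copied an Iceberg table (its data and metadata directories) from another cluster, or moved the table to another location by copying the directories. The result is an unreadable table, because Iceberg metadata records the absolute paths of its files and those paths still point to the old location.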

Solution

There are two possible solutions:

1. Use a custom FileIO that changes the base path of every file I/O operation. This project implements such a FileIO: it reads and writes files under a new base path (the location you moved the Iceberg data and metadata folders to) rather than the path recorded in the metadata.
2. Rewrite the Iceberg metadata and register the table with the updated metadata (implemented in another project called icebergMetadataRewrite).
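Both solutions rest on the same path translation: a path under the old table base is mapped to the same relative path under the new base. The sketch below shows only that translation in plain Java. The class name `PathRebaser` and the example locations are hypothetical and are not taken from this repository; in a custom FileIO, a translation like this would presumably run inside `newInputFile`, `newOutputFile` and `deleteFile` before delegating to a real FileIO.

```java
import java.util.Objects;

/** Hypothetical helper: maps paths recorded in metadata to the table's new location. */
final class PathRebaser {
    private final String oldBase;
    private final String newBase;

    PathRebaser(String oldBase, String newBase) {
        // Normalise both bases so "…/tbl" and "…/tbl/" behave the same.
        this.oldBase = stripTrailingSlash(Objects.requireNonNull(oldBase));
        this.newBase = stripTrailingSlash(Objects.requireNonNull(newBase));
    }

    /** Returns the path under the new base, or the path unchanged if it is not under the old base. */
    String rebase(String path) {
        if (path.equals(oldBase)) {
            return newBase;
        }
        // Require a "/" after the base so "/wh/tbl2" is not treated as being under "/wh/tbl".
        if (path.startsWith(oldBase + "/")) {
            return newBase + path.substring(oldBase.length());
        }
        return path;
    }

    private static String stripTrailingSlash(String s) {
        return s.endsWith("/") ? s.substring(0, s.length() - 1) : s;
    }

    public static void main(String[] args) {
        PathRebaser rebaser = new PathRebaser(
                "hdfs://clusterA/warehouse/db/tbl",  // assumed original location
                "hdfs://clusterB/backup/db/tbl");    // assumed new location
        System.out.println(rebaser.rebase(
                "hdfs://clusterA/warehouse/db/tbl/metadata/v2.metadata.json"));
    }
}
```

Paths outside the old base are returned unchanged, so files that were never part of the copied table are not redirected by accident.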

Dependencies

Spark (version 3.2)
iceberg-spark-runtime-3.2_2.12 (version 1.1.0)

Code walk through

This project implements a custom FileIO for Iceberg tables, which is used for Spark reads and writes. See IntegrationTest.java for the main flow of the program. Note: the same kind of implementation is required if you read the table with Hive or Presto.
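One common way to make Spark use a custom FileIO is the Iceberg catalog property `io-impl`. The sketch below only assembles the relevant Spark configuration keys as a map; it is not the wiring used in IntegrationTest.java, which may differ. The catalog name, the metastore URI `thrift://localhost:9083` and the class name `com.example.CustomFileIO` are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: Spark properties for an Iceberg Hive catalog with a custom FileIO. */
final class CatalogConfigSketch {

    static Map<String, String> sparkCatalogConf(String catalogName,
                                                String metastoreUri,
                                                String fileIoClass) {
        String prefix = "spark.sql.catalog." + catalogName;
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put(prefix, "org.apache.iceberg.spark.SparkCatalog"); // Iceberg's Spark catalog
        conf.put(prefix + ".type", "hive");                        // backed by the Hive metastore
        conf.put(prefix + ".uri", metastoreUri);                   // metastore thrift endpoint
        conf.put(prefix + ".io-impl", fileIoClass);                // FileIO implementation to load
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> conf = sparkCatalogConf(
                "hive_cat",                   // assumed catalog name
                "thrift://localhost:9083",    // assumed local metastore URI
                "com.example.CustomFileIO");  // placeholder class name, not from this repo
        // Print in the form accepted by spark-submit / spark-shell.
        conf.forEach((k, v) -> System.out.println("--conf " + k + "=" + v));
    }
}
```

Each entry can be passed to `SparkSession.builder().config(key, value)` or on the command line as `--conf key=value`.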

ksmatharoo/IcebergCustomFileIO
