feat: support write (#10)
* add arrow_struct_to_iceberg_struct

* refine writer interface

* support fanout partition writer (see the sketch after the commit metadata below)

* support sort_position_delete_writer

* support equality delta writer

* support precompute partition writer

* update value conversion

* fix some issues in the writer

* implement Display for NamespaceIdent

* expose _serde::DataFile

* fix FieldSummary generated from Manifest

* add delete file support for transaction

* fix record_batch_partition_spliter

* fix day transform

* fix RawLiteralEnum::Record

* fix nullable field of equality delete writer

* support deleting files that contain no rows

* fix decimal parse for parquet statistics

---------

Co-authored-by: ZENOTME <st810918843@gmail.com>
ZENOTME and ZENOTME committed Dec 30, 2024
1 parent 54ef090 commit 781f518
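The new writers listed in the commit message do not appear in the rendered diff hunks below, so here is a minimal, self-contained sketch of the fanout idea only. It is not the crate's actual API: FanoutPartitionWriter, FileWriter, and PartitionKey are hypothetical stand-ins assumed for illustration; the real iceberg-rust writers operate on Arrow record batches, schemas, and partition specs.

use std::collections::HashMap;

// Hypothetical stand-ins for illustration; not the iceberg-rust types.
type PartitionKey = String;

#[derive(Default)]
struct FileWriter {
    rows: Vec<String>, // stand-in for buffered rows / record batches
}

impl FileWriter {
    fn write(&mut self, row: String) {
        self.rows.push(row);
    }
}

// A fanout partition writer keeps one inner writer per partition value and
// routes every incoming row to the writer that owns its partition.
#[derive(Default)]
struct FanoutPartitionWriter {
    writers: HashMap<PartitionKey, FileWriter>,
}

impl FanoutPartitionWriter {
    fn write(&mut self, partition: PartitionKey, row: String) {
        self.writers.entry(partition).or_default().write(row);
    }
}

fn main() {
    let mut writer = FanoutPartitionWriter::default();
    writer.write("day=2024-12-30".to_string(), "event-a".to_string());
    writer.write("day=2024-12-31".to_string(), "event-b".to_string());
    assert_eq!(writer.writers.len(), 2); // one inner writer per partition value
}

The sketch only illustrates the routing; the writers added in this commit also handle data file metadata, delete files, and partition specs.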
Showing 25 changed files with 3,013 additions and 54 deletions.
3 changes: 3 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

2 changes: 2 additions & 0 deletions Cargo.toml
@@ -43,11 +43,13 @@ apache-avro = "0.17"
array-init = "2"
arrow-arith = { version = "53" }
arrow-array = { version = "53" }
arrow-buffer = { version = "53" }
arrow-cast = { version = "53" }
arrow-ord = { version = "53" }
arrow-schema = { version = "53" }
arrow-select = { version = "53" }
arrow-string = { version = "53" }
arrow-row = { version = "53" }
async-stream = "0.3.5"
async-trait = "0.1"
async-std = "1.12"
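One plausible reason for the new arrow-row dependency (an assumption, not stated in the commit) is its order-preserving row encoding, which turns grouping or splitting record batches by partition values into byte comparisons. A minimal sketch of that crate's encoding, independent of this commit's code:

use std::sync::Arc;

use arrow_array::{ArrayRef, Int32Array};
use arrow_row::{RowConverter, SortField};
use arrow_schema::DataType;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Encode an Int32 column into byte rows whose ordering matches the values.
    let converter = RowConverter::new(vec![SortField::new(DataType::Int32)])?;
    let col: ArrayRef = Arc::new(Int32Array::from(vec![3, 1, 2]));
    let rows = converter.convert_columns(&[col])?;

    // Byte-wise comparison of the encoded rows agrees with the value order,
    // so rows with equal partition values encode to equal byte strings.
    assert!(rows.row(1) < rows.row(0)); // 1 < 3
    Ok(())
}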
2 changes: 2 additions & 0 deletions crates/iceberg/Cargo.toml
@@ -46,8 +46,10 @@ apache-avro = { workspace = true }
array-init = { workspace = true }
arrow-arith = { workspace = true }
arrow-array = { workspace = true }
arrow-buffer = { workspace = true }
arrow-cast = { workspace = true }
arrow-ord = { workspace = true }
arrow-row = { workspace = true }
arrow-schema = { workspace = true }
arrow-select = { workspace = true }
arrow-string = { workspace = true }
5 changes: 4 additions & 1 deletion crates/iceberg/src/arrow/mod.rs
@@ -22,5 +22,8 @@ pub use schema::*;
mod reader;
pub(crate) mod record_batch_projector;
pub(crate) mod record_batch_transformer;

mod value;
pub use reader::*;
pub use value::*;
mod record_batch_partition_spliter;
pub(crate) use record_batch_partition_spliter::*;
3 changes: 1 addition & 2 deletions crates/iceberg/src/arrow/reader.rs
@@ -39,6 +39,7 @@ use parquet::arrow::{ParquetRecordBatchStreamBuilder, ProjectionMask, PARQUET_FI
use parquet::file::metadata::{ParquetMetaData, ParquetMetaDataReader};
use parquet::schema::types::{SchemaDescriptor, Type as ParquetType};

use super::record_batch_transformer::RecordBatchTransformer;
use crate::arrow::{arrow_schema_to_schema, get_arrow_datum};
use crate::error::Result;
use crate::expr::visitors::bound_predicate_visitor::{visit, BoundPredicateVisitor};
@@ -51,8 +52,6 @@ use crate::spec::{DataContentType, Datum, PrimitiveType, Schema};
use crate::utils::available_parallelism;
use crate::{Error, ErrorKind};

use super::record_batch_transformer::RecordBatchTransformer;

/// Builder to create ArrowReader
pub struct ArrowReaderBuilder {
batch_size: Option<usize>,
