Description
Search before asking
- I searched in the issues and found nothing similar.
Description
Purpose
Linked issue: close #2310, add field IDs to nested rows.
Brief change log
All fields are assigned unique, sequential IDs in a flattened order, regardless of nesting level. For example:
struct<
  a: tinyint,
  b: struct<
    c: tinyint,
    d: struct<
      e: tinyint,
      f: tinyint
    >,
    g: string
  >
>
The field ID assigned to each field is then:
| Field Name | Field ID |
|---|---|
| a | 0 |
| b | 1 |
| b.c | 2 |
| b.d | 3 |
| b.d.e | 4 |
| b.d.f | 5 |
| b.g | 6 |
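
The numbering in the table corresponds to a pre-order, depth-first walk of the schema with a single shared counter. The following is a minimal sketch of that idea, not the actual Fluss implementation: `FieldIdSketch`, its nested `Field` class, `assignIds`, and `exampleSchema` are hypothetical names used only for illustration, and `Field` is a stand-in rather than the real `org.apache.fluss.types.DataField`.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FieldIdSketch {

    /** Hypothetical stand-in for a schema field; not the real DataField. */
    static final class Field {
        final String name;
        final List<Field> children; // empty for non-nested types

        Field(String name, Field... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    /** Assigns sequential IDs in pre-order: each parent first, then its children. */
    static Map<String, Integer> assignIds(List<Field> topLevel) {
        Map<String, Integer> ids = new LinkedHashMap<>();
        int[] next = {0}; // one counter shared across all nesting levels
        for (Field field : topLevel) {
            visit(field, "", ids, next);
        }
        return ids;
    }

    private static void visit(Field field, String prefix, Map<String, Integer> ids, int[] next) {
        String path = prefix.isEmpty() ? field.name : prefix + "." + field.name;
        ids.put(path, next[0]++);
        for (Field child : field.children) {
            visit(child, path, ids, next);
        }
    }

    /** The example struct from this issue, with type names omitted. */
    static List<Field> exampleSchema() {
        return Arrays.asList(
                new Field("a"),
                new Field("b",
                        new Field("c"),
                        new Field("d", new Field("e"), new Field("f")),
                        new Field("g")));
    }

    public static void main(String[] args) {
        assignIds(exampleSchema())
                .forEach((path, id) -> System.out.println(path + " -> " + id));
    }
}
```

Because the counter never resets when the walk enters a nested struct, every field gets a unique ID without any per-level offset arithmetic.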
Why a Flattened Numerical Order?
- Simplifies ID Management: No need to compute hierarchical offsets (e.g., parent_id * depth + child_index).
- Compatibility: Works seamlessly with flat data structures like Arrow's RecordBatch. Our FileLogProjection already does the same thing, so later projection pushdown will be easier.
API and Format
Add field_id to org.apache.fluss.types.DataField.
Documentation
Willingness to contribute
- I'm willing to submit a PR!