Add field_id to nested Row fields #2310

@loserwang1024

Search before asking

  • I searched in the issues and found nothing similar.

Description

Purpose

Linked issue: close #2310, add field IDs to nested rows.

Brief change log

All fields are assigned unique, sequential IDs in flattened order (depth-first, with each parent numbered before its children), regardless of nesting level. For example:

struct< 
  a: tinyint, 
  b: struct< 
    c: tinyint, 
    d: struct< 
      e: tinyint, 
      f: tinyint
    >, 
    g: string 
  > 
>

The field ID for each field is then:

Field Name    Field ID
a             0
b             1
b.c           2
b.d           3
b.d.e         4
b.d.f         5
b.g           6
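
The numbering above can be reproduced with a simple pre-order traversal. The following is a minimal sketch, not the Fluss implementation: `FieldIdSketch`, `Node`, `assignIds`, and `exampleSchema` are hypothetical names made up for illustration, and `Node` is only a stand-in for a schema field, not the real org.apache.fluss.types.DataField.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: hypothetical types, not the actual Fluss classes. */
final class FieldIdSketch {

    /** A named field with optional nested children (no children for primitive types). */
    static final class Node {
        final String name;
        final List<Node> children;

        Node(String name, Node... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    /** Assigns sequential IDs in depth-first pre-order: each parent gets its ID before its children. */
    static Map<String, Integer> assignIds(List<Node> topLevelFields) {
        Map<String, Integer> ids = new LinkedHashMap<>();
        int[] nextId = {0}; // single shared counter across all nesting levels
        for (Node field : topLevelFields) {
            visit(field, "", ids, nextId);
        }
        return ids;
    }

    private static void visit(Node node, String parentPath, Map<String, Integer> ids, int[] nextId) {
        String path = parentPath.isEmpty() ? node.name : parentPath + "." + node.name;
        ids.put(path, nextId[0]++);
        for (Node child : node.children) {
            visit(child, path, ids, nextId);
        }
    }

    /** The example schema from this issue: struct<a, b: struct<c, d: struct<e, f>, g>>. */
    static List<Node> exampleSchema() {
        return Arrays.asList(
                new Node("a"),
                new Node("b",
                        new Node("c"),
                        new Node("d", new Node("e"), new Node("f")),
                        new Node("g")));
    }

    public static void main(String[] args) {
        // Prints each dotted field path with its assigned ID, one per line.
        assignIds(exampleSchema()).forEach((path, id) -> System.out.println(path + " " + id));
    }
}
```

Because a single counter is shared across all levels, no per-level offset arithmetic is needed, and the resulting IDs double as positions in the flattened field list.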

Why Flattened Numerical Order?

  • Simplifies ID Management: No need to compute hierarchical offsets (e.g., parent_id * depth + child_index).
  • Compatibility: Works seamlessly with flat data structures like Arrow's RecordBatch. Our FileLogProjection already flattens fields the same way, so later projection pushdown will be easier.

API and Format

Add field_id to org.apache.fluss.types.DataField.

Documentation

Willingness to contribute

  • I'm willing to submit a PR!
