[C++] Enable field_ref compute expressions to also accept a column index #35579

@davisusanibar

Describe the enhancement requested

Currently, field_ref expressions can reference a column by name, but referencing a column by index is not supported when projecting over a dataset.

It would be useful to also support column indices, for cases where an integration needs to pass a column index instead of a column name.

Code to reproduce the error:

#include <iostream>
#include <memory>
#include <string>
#include <vector>

#include <arrow/api.h>
#include <arrow/compute/api.h>
#include <arrow/dataset/api.h>
#include <arrow/filesystem/api.h>

namespace compute = arrow::compute;

void reproduceInferringColumnProjection() {
    // Placeholder path: point this at a directory containing the Parquet file(s).
    const std::string directory_base = "/directory_of_your_parquet_file/nation_tpch/";
    std::shared_ptr<arrow::fs::LocalFileSystem> fs =
            std::make_shared<arrow::fs::LocalFileSystem>();
    arrow::fs::FileSelector selector;
    selector.base_dir = directory_base;
    selector.recursive = true;
    std::vector<arrow::fs::FileInfo> file_infos = fs->GetFileInfo(selector).ValueOrDie();
    auto format =
            std::make_shared<arrow::dataset::ParquetFileFormat>();
    arrow::dataset::FileSystemFactoryOptions options;
    std::shared_ptr<arrow::dataset::DatasetFactory> dataset_factory =
            (arrow::dataset::FileSystemDatasetFactory::Make(fs, selector, format, options)).ValueOrDie();
    std::shared_ptr<arrow::dataset::Dataset> dataset = (dataset_factory->Finish()).ValueOrDie();
    arrow::dataset::ScannerBuilder scanner_builder(dataset);
    // Error: NotImplemented: Inferring column projection from FieldRef FieldRef.FieldPath(0)
    scanner_builder.Project({compute::call("add", {compute::field_ref(0), compute::literal(10)})}, {"column_0"});
    // OK: this works when a column name is used instead of a column index
    // scanner_builder.Project({compute::call("add", {compute::field_ref("n_nationkey"), compute::literal(10)})}, {"column_0"});
    std::shared_ptr<arrow::dataset::Scanner> scanner = scanner_builder.Finish().ValueOrDie();
    std::shared_ptr<arrow::Table> table = scanner->ToTable().ValueOrDie();
    std::cout << "Table with " << table->num_rows() << " rows and " << table->num_columns() << " columns" << std::endl;
    std::cout << table->ToString() << std::endl;
}

Error message:

NotImplemented: Inferring column projection from FieldRef FieldRef.FieldPath(0)
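
Until an index is accepted here, one possible workaround is to translate the index into a column name before building the expression, since the name-based form works. The sketch below models only that lookup, using the standard library so it can be tried without Arrow. `ColumnRef` and `ResolveColumnName` are hypothetical names, not part of Arrow. In the reproduction above the list of names could presumably be obtained from `dataset->schema()->field_names()`.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for a column reference: a 0-based index or a name.
using ColumnRef = std::variant<int, std::string>;

// Resolve a reference to a column name against the schema's field names.
// Returns std::nullopt if the index is out of range or the name is unknown.
std::optional<std::string> ResolveColumnName(
    const std::vector<std::string>& field_names, const ColumnRef& ref) {
  if (const int* index = std::get_if<int>(&ref)) {
    if (*index < 0 || static_cast<std::size_t>(*index) >= field_names.size()) {
      return std::nullopt;
    }
    return field_names[static_cast<std::size_t>(*index)];
  }
  const std::string& name = std::get<std::string>(ref);
  for (const std::string& candidate : field_names) {
    if (candidate == name) {
      return name;
    }
  }
  return std::nullopt;
}
```

With a resolved name in hand, the projection can use `compute::field_ref(name)` as in the commented-out line of the reproduction. This only sidesteps the limitation for top-level columns and is not a substitute for index support in the scanner itself.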

Component(s)

C++
