Improve toValues() function performance for Iceberg equality delete read #7

@yingsu00

Description

facebookincubator#12659
The toValues function in velox/functions/prestosql/InPredicate.cpp de-duplicates the values in a vector and extracts a bool that indicates whether the vector contains nulls. It uses the following loop:

for (auto i = offset; i < offset + size; i++) {
  if (simpleValues->isNullAt(i)) {
    hasNull = true;
  } else {
    if constexpr (std::is_same_v<U, Timestamp>) {
      values.emplace_back(simpleValues->valueAt(i).toMillis());
    } else {
      values.emplace_back(simpleValues->valueAt(i));
    }
  }
}

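This loop can be improved when simpleValues->mayHaveNulls() is false. In that case the per-row isNullAt() check is unnecessary, so we can split the loop into two: one that checks for nulls, and one that copies values without the check. We also want to add a micro-benchmark for it.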
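A minimal sketch of the two-loop idea, not the actual Velox change. MockSimpleVector and toValuesSketch are hypothetical names introduced here so the example compiles without Velox; the mock models only mayHaveNulls(), isNullAt() and valueAt(). The Timestamp branch is left out, and the reserve() call is an extra assumption rather than something the issue asks for.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Velox SimpleVector<T>. Only the three calls
// used by the loop in this issue are modeled.
template <typename T>
struct MockSimpleVector {
  std::vector<T> data;
  std::vector<bool> nulls; // empty means there is no nulls buffer

  bool mayHaveNulls() const { return !nulls.empty(); }
  bool isNullAt(std::size_t i) const { return !nulls.empty() && nulls[i]; }
  T valueAt(std::size_t i) const { return data[i]; }
};

// Collects the non-null values in [offset, offset + size) and reports
// whether any null was seen. Hypothetical helper, not the Velox signature.
template <typename T>
std::pair<std::vector<T>, bool> toValuesSketch(
    const MockSimpleVector<T>& simpleValues,
    std::size_t offset,
    std::size_t size) {
  assert(offset + size <= simpleValues.data.size());

  std::vector<T> values;
  values.reserve(size); // assumption: pre-sizing is acceptable here
  bool hasNull = false;

  if (simpleValues.mayHaveNulls()) {
    // Slow path: every row needs a null check.
    for (auto i = offset; i < offset + size; i++) {
      if (simpleValues.isNullAt(i)) {
        hasNull = true;
      } else {
        values.emplace_back(simpleValues.valueAt(i));
      }
    }
  } else {
    // Fast path: no nulls are possible, so the loop body has no branch.
    for (auto i = offset; i < offset + size; i++) {
      values.emplace_back(simpleValues.valueAt(i));
    }
  }
  return {std::move(values), hasNull};
}
```

Checking mayHaveNulls() once outside the loop moves the null test out of the hot path. Whether this gives a measurable gain is what the proposed micro-benchmark would need to show.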
