[EPIC] Migrate to functions in datafusion-spark crate #2084

@andygrove

Description

What is the problem the feature request solves?

There is a community effort to add Spark-compatible functions to the core DataFusion project via the new datafusion-spark crate. See apache/datafusion#15914 for more information.

We want to donate existing Comet functions to this new crate. This epic tracks donating those functions and then consuming them from Comet. In some cases a version of a function may already exist in datafusion-spark; we will need to review those and see whether they would benefit from improvements based on the Comet versions. (See the sketch after the function list below for what a donated function looks like as a DataFusion ScalarUDF.)

Functions already in datafusion-spark:

Functions to be donated from Comet:

  • Aggregates
      • avg / avg_decimal
      • correlation
      • covariance
      • stddev
      • sum_decimal
      • variance
  • Array
      • array_insert
      • array_repeat
      • get_array_struct_fields
      • list_extract
  • Bitwise
      • bitwise_count
      • bitwise_get
      • bitwise_not
  • Bloom Filter
      • bloom_filter_agg
      • bloom_filter_might_contain
  • Conditional
      • if
  • Conversion
      • cast (not trivial)
  • Date/Time
      • date_add
      • date_sub
      • date_trunc
      • extract_date_part
      • timestamp_trunc
  • JSON
      • to_json
  • Math
      • ceil
      • div
      • floor
      • modulo
      • negative
      • round
  • Non-deterministic
      • monotonically_increasing_id
      • rand
      • randn
  • Predicate
      • is_nan
      • rlike
  • String
      • string_space
      • substring
  • Struct
      • create_named_struct
      • get_struct_field
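
For context on what "using them in Comet" means in practice, here is a minimal sketch of a Spark-compatible function expressed as a DataFusion ScalarUDF and registered with a SessionContext. This is illustrative only: SparkStringSpace below is a simplified stand-in, not the actual Comet or datafusion-spark string_space implementation, and the exact ScalarUDFImpl surface (e.g. invoke vs. invoke_with_args) depends on the DataFusion version being targeted.

```rust
// Illustrative sketch only -- not the actual Comet or datafusion-spark code.
// Shows the general shape of a Spark-compatible scalar function as a
// DataFusion ScalarUDF; exact trait methods vary by DataFusion version.
use std::any::Any;
use std::sync::Arc;

use datafusion::arrow::array::StringArray;
use datafusion::arrow::datatypes::DataType;
use datafusion::common::cast::as_int32_array;
use datafusion::error::Result;
use datafusion::logical_expr::{
    ColumnarValue, ScalarUDF, ScalarUDFImpl, Signature, Volatility,
};
use datafusion::prelude::SessionContext;

/// Simplified stand-in for Spark's `string_space(n)`: returns a string of `n` spaces.
#[derive(Debug)]
struct SparkStringSpace {
    signature: Signature,
}

impl SparkStringSpace {
    fn new() -> Self {
        Self {
            // Spark's string_space takes a single integer argument.
            signature: Signature::exact(vec![DataType::Int32], Volatility::Immutable),
        }
    }
}

impl ScalarUDFImpl for SparkStringSpace {
    fn as_any(&self) -> &dyn Any {
        self
    }

    fn name(&self) -> &str {
        "string_space"
    }

    fn signature(&self) -> &Signature {
        &self.signature
    }

    fn return_type(&self, _arg_types: &[DataType]) -> Result<DataType> {
        Ok(DataType::Utf8)
    }

    // Older DataFusion releases use `invoke`; newer ones prefer
    // `invoke_with_args`. The kernel logic is the same either way.
    fn invoke(&self, args: &[ColumnarValue]) -> Result<ColumnarValue> {
        let arrays = ColumnarValue::values_to_arrays(args)?;
        let lengths = as_int32_array(&arrays[0])?;
        // Negative lengths produce an empty string; nulls propagate.
        let spaces: StringArray = lengths
            .iter()
            .map(|len| len.map(|len| " ".repeat(len.max(0) as usize)))
            .collect();
        Ok(ColumnarValue::Array(Arc::new(spaces)))
    }
}

/// Hypothetical helper showing how Comet might register the donated function.
fn register_string_space(ctx: &SessionContext) {
    ctx.register_udf(ScalarUDF::new_from_impl(SparkStringSpace::new()));
}
```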

Describe the potential solution

No response

Additional context

No response
