Many DEBUG datafusion_functions_array] Overwrite existing UDF: array_to_string messages in log #10658

Closed
alamb opened this issue May 24, 2024 · 2 comments · Fixed by #10661
Labels: bug (Something isn't working), good first issue (Good for newcomers)

alamb commented May 24, 2024

Describe the bug

We noticed some additional unexpected log messages upstream in InfluxDB. I found that the same messages are present in datafusion-cli.

To Reproduce

andrewlamb@Andrews-MacBook-Pro-2:~/Software/influxdb_iox$ RUST_LOG=DEBUG datafusion-cli
DataFusion CLI v38.0.0
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_to_string
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: string_to_array
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: range
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: generate_series
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_dims
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: cardinality
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_ndims
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_append
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_prepend
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_concat
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_except
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_element
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_pop_back
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_pop_front
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_slice
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_has
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_has_all
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_has_any
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: empty
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_length
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: flatten
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_sort
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_repeat
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_resize
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_reverse
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_distinct
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_intersect
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_union
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_position
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_positions
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_remove
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_remove_all
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_remove_n
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_replace_n
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_replace_all
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_replace

Expected behavior

I would expect that we aren't re-registering the same UDF multiple times 🤔

Additional context

No response

alamb added the bug label May 24, 2024

jayzhan211 commented May 25, 2024

We just need to remove the redundant alias in each of those functions.

For example,

String::from("array_has"),
String::from("list_has"),
String::from("array_contains"),
String::from("list_contains"),

Remove the one name in the aliases that duplicates the function's own name (array_has in this example).
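To illustrate why a duplicated alias produces the "Overwrite existing UDF" message, here is a minimal sketch of a toy registry. It is not DataFusion's actual code: `Udf`, `Registry`, `register`, and the `udf` helper are hypothetical names, and the sketch assumes only that a function is stored under its own name and then under each alias.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a UDF: a primary name plus extra aliases.
struct Udf {
    name: String,
    aliases: Vec<String>,
}

/// Toy registry (an assumption, not DataFusion's real type), keyed by
/// every name a function can be called by.
#[derive(Default)]
struct Registry {
    functions: HashMap<String, String>,
    /// Keys that were inserted while already present.
    overwritten: Vec<String>,
}

impl Registry {
    /// Register the primary name first, then each alias.
    fn register(&mut self, udf: &Udf) {
        let names = std::iter::once(&udf.name).chain(udf.aliases.iter());
        for key in names {
            // `insert` returns the previous value if the key already existed.
            if self.functions.insert(key.clone(), udf.name.clone()).is_some() {
                // This models the case the DEBUG message reports.
                self.overwritten.push(key.clone());
            }
        }
    }
}

/// Convenience constructor for the sketch.
fn udf(name: &str, aliases: &[&str]) -> Udf {
    Udf {
        name: name.to_string(),
        aliases: aliases.iter().map(|s| s.to_string()).collect(),
    }
}
```

With the alias list quoted above, array_has is registered twice and ends up in `overwritten`; once the duplicate is dropped from the aliases, nothing is overwritten and all four names still resolve.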

jayzhan211 added the good first issue label May 25, 2024
@goldmedal

take

Michael-J-Ward added a commit to Michael-J-Ward/datafusion-python that referenced this issue Jul 25, 2024
The alias list no longer includes the name of the function.

Ref: apache/datafusion#10658
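
Because the alias list no longer contains the function's own name, code that looks functions up has to check both the name and the aliases. A minimal sketch of such a lookup follows; `FunctionInfo` and `find_function` are hypothetical names for illustration, not the datafusion-python or DataFusion API.

```rust
/// Hypothetical function descriptor; `aliases` excludes the primary name.
struct FunctionInfo {
    name: &'static str,
    aliases: &'static [&'static str],
}

/// Return the first function whose primary name or any alias equals `wanted`.
fn find_function<'a>(functions: &'a [FunctionInfo], wanted: &str) -> Option<&'a FunctionInfo> {
    functions
        .iter()
        .find(|f| f.name == wanted || f.aliases.iter().any(|alias| *alias == wanted))
}
```

Searching by alias alone would miss a lookup for the primary name, which is the case this change has to cover.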
andygrove pushed a commit to apache/datafusion-python that referenced this issue Jul 31, 2024
* chore: update datafusion deps

* feat: impl ExecutionPlan::static_name() for DatasetExec

This required trait method was added upstream [0]; the upstream recommendation is to simply forward to `static_name`.

[0]: apache/datafusion#10266

* feat: update first_value and last_value wrappers.

Upstream signatures were changed for the new `AggregateBuilder` API [0].

This simply gets the code to work. We should better incorporate that API into `datafusion-python`.

[0]: apache/datafusion#10560

* migrate count to UDAF

Builtin Count was removed upstream.

TBD whether we want to re-implement `count_star` with new API.

Ref: apache/datafusion#10893

* migrate approx_percentile_cont, approx_distinct, and approx_median to UDAF

Ref: approx_distinct apache/datafusion#10851
Ref: approx_median apache/datafusion#10840
Ref: approx_percentile_cont and _with_weight apache/datafusion#10917

* migrate avg to UDAF

Ref: apache/datafusion#10964

* migrate corr to UDAF

Ref: apache/datafusion#10884

* migrate grouping to UDAF

Ref: apache/datafusion#10906

* add alias `mean` for UDAF `avg`

* migrate stddev to UDAF

Ref: apache/datafusion#10827

* remove rust alias for stddev

The python wrapper now provides stddev_samp alias.

* migrate var_pop to UDAF

Ref: apache/datafusion#10836

* migrate regr_* functions to UDAF

Ref: apache/datafusion#10898

* migrate bitwise functions to UDAF

The functions now take a single expression instead of a Vec<_>.

Ref: apache/datafusion#10930

* add missing variants for ScalarValue with todo

* fix typo in approx_percentile_cont

* add distinct arg to count

* comment out failing test

`approx_percentile_cont` is now returning a DoubleArray instead of an IntArray.

This may be a bug upstream; it requires further investigation.

* update tests to expect lowercase `sum` in query plans

This was changed upstream.

Ref: apache/datafusion#10831

* update ScalarType data_type map

* add docs dependency pickleshare

* re-implement count_star

* lint: ruff python lint

* lint: rust cargo fmt

* include name of window function in error for find_window_fn

* refactor `find_window_fn` for debug clarity

* search default aggregate functions by both name and aliases

The alias list no longer includes the name of the function.

Ref: apache/datafusion#10658

* fix markdown in find_window_fn docs

* parameterize test_window_functions

`first_value` and `last_value` are currently failing and marked as xfail.

* add test ids to test_simple_select tests marked xfail

* update find_window_fn to search built-ins first

The behavior of `first_value` and `last_value` UDAFs currently does not match the built-in behavior.
This allowed me to remove `marks=pytest.xfail` from the window tests.

* improve first_call and last_call use of the builder API

* remove trailing todos

* fix examples/substrait.py

* chore: remove explicit aliases from functions.rs

Ref: #779

* remove `array_fn!` aliases

* remove alias rules for `expr_fn_vec!`

* remove alias rules from `expr_fn!` macro

* remove unnecessary pyo3 var-arg signatures in functions.rs

* remove pyo3 signatures that provided defaults for first_value and last_value

* parametrize test_string_functions

* test regr_ function wrappers

Closes #778