Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add name() method to UserDefinedLogicalNode #5450

Merged
merged 1 commit into from
Mar 6, 2023

Conversation

waynexia
Copy link
Member

@waynexia waynexia commented Mar 2, 2023

Which issue does this PR close?

Closes #5449.

Rationale for this change

Add name() method so that there is no need to test a UserDefinedLogicalNode using downcast_ref().

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

The public interface UserDefinedLogicalNode is changed

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
@github-actions github-actions bot added core Core DataFusion crate logical-expr Logical plan and expressions optimizer Optimizer rules labels Mar 2, 2023
@comphead
Copy link
Contributor

comphead commented Mar 2, 2023

@waynexia wouldn't this better to use enum for user node names?

@waynexia
Copy link
Member Author

waynexia commented Mar 2, 2023

@waynexia wouldn't this better to use enum for user node names?

I'm afraid it's hard to define this enum because we can't enumerate all the possible node in advance 🧐

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @waynexia -- I think it may be useful to see if we can combine this somehow with format_for_explain (as mentioned in my comments) but this also seems like useful information as well so I would be fine with merging it as is.

@@ -30,6 +30,9 @@ pub trait UserDefinedLogicalNode: fmt::Debug + Send + Sync {
/// Return a reference to self as Any, to support dynamic downcasting
fn as_any(&self) -> &dyn Any;

/// Return the plan's name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think similar information is present via

    fn fmt_for_explain(&self, f: &mut fmt::Formatter) -> fmt::Result;

For example: https://github.com/apache/arrow-datafusion/blob/a95e0ec2fd929aae1c2f67148243eb4825d81a3b/datafusion/proto/src/logical_plan/mod.rs#L1681-L1683

However, it is not easy to use (as you need a formatter, etc)

What would you think about making a function like the following that returned the result of format_for_explain

    fn to_string(&self) -> String {
       // figure out how to call format_for_explain here
     }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ignored the fmt_for_explain method... You are right, they are redundant information.

What would you think about making a function like the following that returned the result of format_for_explain

The returned string (like "TopK: k=1") is also not easy to use for the dispatching purpose, like we need to .contains() or .start_with() those strings. What do you think about separate name and parameters? E.g.:

fn name(&self) -> &'str {
  "TopK"
}

fn fmt_for_params(&self, f: &mut fmt::Formatter) -> fmt::Result { 
   write!(f, "k={}", self.k) 
}

// this can be provided as default implementation
fn fmt_for_explain(&self, f: &mut fmt::Formatter) -> fmt::Result { 
   write!(f, "{}: {}", self.name(), self.fmt_for_params()) 
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about separate name and parameters? E.g.:

I think it would make sense, but could also be done as a follow on PR. I think this PR is good enough for now.

Thanks @waynexia

@alamb alamb merged commit c56b947 into apache:main Mar 6, 2023
@waynexia waynexia deleted the extension-name branch March 6, 2023 11:19
@ursabot
Copy link

ursabot commented Mar 6, 2023

Benchmark runs are scheduled for baseline = eb95413 and contender = c56b947. c56b947 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@andygrove andygrove added the enhancement New feature or request label Mar 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate enhancement New feature or request logical-expr Logical plan and expressions optimizer Optimizer rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add name() method to UserDefinedLogicalNode
5 participants