Change public simplify API and add a public coerce API #3758

Merged
alamb merged 3 commits into apache:master from alamb/new_simplify_coerce_api
Oct 12, 2022

Conversation

@alamb alamb commented Oct 7, 2022

Which issue does this PR close?

Closes #3708
re #3709

Rationale for this change

See #3708

I am not sure whether anyone other than IOx calls the expression simplification directly, but we do, and I have had to change it non-trivially now that type coercion is required prior to simplification (because coercion is required to run the constant evaluator).
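To illustrate why coercion has to run before constant evaluation, here is a minimal toy sketch. Every type and function in it (`Expr`, `coerce`, `fold`) is a hypothetical stand-in invented for this note and is not DataFusion's API; it only assumes that the constant evaluator refuses operands of mixed types.

```rust
// Toy model only: these types are hypothetical and are NOT DataFusion's.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Int(i64),
    Float(f64),
    /// Cast an integer expression to a float.
    Cast(Box<Expr>),
    Add(Box<Expr>, Box<Expr>),
}

/// Returns true if the expression produces a float.
fn is_float(e: &Expr) -> bool {
    match e {
        Expr::Int(_) => false,
        Expr::Float(_) | Expr::Cast(_) => true,
        Expr::Add(l, r) => is_float(l) || is_float(r),
    }
}

/// Type coercion: wrap the integer side of a mixed `Add` in a `Cast`
/// so that both operands end up with the same type.
fn coerce(e: Expr) -> Expr {
    match e {
        Expr::Add(l, r) => {
            let (l, r) = (coerce(*l), coerce(*r));
            match (is_float(&l), is_float(&r)) {
                (true, false) => Expr::Add(Box::new(l), Box::new(Expr::Cast(Box::new(r)))),
                (false, true) => Expr::Add(Box::new(Expr::Cast(Box::new(l))), Box::new(r)),
                _ => Expr::Add(Box::new(l), Box::new(r)),
            }
        }
        Expr::Cast(inner) => Expr::Cast(Box::new(coerce(*inner))),
        other => other,
    }
}

/// Constant evaluator: folds literals, but only when operand types agree.
fn fold(e: Expr) -> Result<Expr, String> {
    match e {
        Expr::Cast(inner) => match fold(*inner)? {
            Expr::Int(i) => Ok(Expr::Float(i as f64)),
            Expr::Float(f) => Ok(Expr::Float(f)),
            other => Ok(Expr::Cast(Box::new(other))),
        },
        Expr::Add(l, r) => match (fold(*l)?, fold(*r)?) {
            (Expr::Int(a), Expr::Int(b)) => Ok(Expr::Int(a + b)),
            (Expr::Float(a), Expr::Float(b)) => Ok(Expr::Float(a + b)),
            (l, r) => Err(format!("type mismatch: {:?} + {:?}", l, r)),
        },
        other => Ok(other),
    }
}
```

In this sketch, folding `Int(1) + Float(2.5)` directly returns an error, while `fold(coerce(expr))` succeeds because the cast is inserted first.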

What changes are included in this PR?

  1. Create an ExprSimplifier struct
  2. Simplify and make public SimplifyContext
  3. Move the Expr simplification API to ExprSimplifier
  4. Add an ExprSimplifier::coerce function and tests

Note I plan to add better documentation / examples via #3740 / #3741

Follow on API improvements proposed in #3793

Are there any user-facing changes?

New API, changes to existing simplify API (I will highlight the changes inline)

@alamb alamb added the api change Changes the API exposed to users of the crate label Oct 7, 2022
@github-actions github-actions bot added logical-expr Logical plan and expressions optimizer Optimizer rules labels Oct 7, 2022
@alamb alamb force-pushed the alamb/new_simplify_coerce_api branch from 6567e5e to 56f115b Compare October 7, 2022 19:22
@@ -15,17 +15,24 @@
// specific language governing permissions and limitations
// under the License.

//! Expression simplifier
//! Expression simplification API
Contributor Author

This PR looks bigger than it really is -- it moves a bunch of code around and there are a significant number of docstrings

@@ -37,28 +44,41 @@ pub trait SimplifyInfo {
fn execution_props(&self) -> &ExecutionProps;
}

/// trait for types that can be simplified
Contributor Author

Removed in favor of calling ExprSimplifier::simplify directly

// rather than creating an DFSchemaRef coerces rather than doing
// it manually.
// TODO ticekt
pub fn coerce(&self, expr: Expr, schema: DFSchemaRef) -> Result<Expr> {
Contributor Author

Here is the new API for coercion

Contributor

@ygf11 ygf11 Oct 8, 2022

ticekt, typo?

Contributor Author

Good call -- will file ticket.

/// let simplified = simplifier.simplify(expr).unwrap();
/// assert_eq!(simplified, col("b").lt(lit(2)));
/// ```
pub struct SimplifyContext<'a> {
Contributor Author

This was moved into the public API and I added some docs and a builder interface with_schema
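As a rough illustration of the builder style mentioned here, the sketch below shows a context that is created from execution properties and optionally given a schema afterwards. The names `Props`, `Context`, and `column_type` are hypothetical stand-ins chosen for this note; only the method name `with_schema` comes from the comment above, and its real signature may differ.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for execution properties; not a DataFusion type.
#[derive(Debug, Default)]
struct Props {}

// Hypothetical stand-in for a simplification context; not a DataFusion type.
#[derive(Debug)]
struct Context<'a> {
    /// Optional schema: column name -> type name.
    schema: Option<HashMap<String, String>>,
    props: &'a Props,
}

impl<'a> Context<'a> {
    /// Create a context with no schema attached.
    fn new(props: &'a Props) -> Self {
        Self { schema: None, props }
    }

    /// Builder-style method: consume the context and return it with a schema.
    fn with_schema(mut self, schema: HashMap<String, String>) -> Self {
        self.schema = Some(schema);
        self
    }

    /// Access the execution properties the context was created with.
    fn props(&self) -> &Props {
        self.props
    }

    /// Look up a column's type; returns None when no schema was provided
    /// or when the column is unknown.
    fn column_type(&self, name: &str) -> Option<&str> {
        self.schema.as_ref()?.get(name).map(|s| s.as_str())
    }
}
```

Callers that only simplify schema-independent expressions can stop at `Context::new(&props)`, while callers that need column types chain `.with_schema(...)`.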

};
use datafusion_physical_expr::{create_physical_expr, execution_props::ExecutionProps};

/// Provides simplification information based on schema and properties
Contributor Author

moved

@@ -950,30 +909,12 @@ macro_rules! assert_contains {
};
}

/// Apply simplification and constant propagation to ([Expr]).
Contributor Author

Instead of adding a simplify method onto Expr via this trait, I propose to have an ExprSimplifier struct that has a simplify method.

cc @ygf11
I found it made the examples in expr_api.rs less awkward to write because the schema wasn't needed

Contributor

@ygf11 ygf11 Oct 8, 2022

Looks great. It is more flexible, since users can now define their own SimplifyInfo instead of using the built-in SimplifyContext.
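The flexibility point can be sketched with a toy model: a simplifier struct that is generic over an info trait, so callers can plug in their own implementation. The types below reuse the names `SimplifyInfo` and `ExprSimplifier` from this PR for readability, but their fields, methods, and the `Expr` enum are simplified assumptions made for this note and do not reproduce DataFusion's definitions.

```rust
// Toy model only: simplified, hypothetical versions of the types named in this PR.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Col(String),
    Bool(bool),
    IsNull(Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Information the simplifier needs from its caller.
trait SimplifyInfo {
    /// Can the named column contain NULL values?
    fn nullable(&self, column: &str) -> bool;
}

/// A simplifier that owns whatever `SimplifyInfo` the caller supplies.
struct ExprSimplifier<S: SimplifyInfo> {
    info: S,
}

impl<S: SimplifyInfo> ExprSimplifier<S> {
    fn new(info: S) -> Self {
        Self { info }
    }

    /// Recursively simplify an expression using the supplied info.
    fn simplify(&self, expr: Expr) -> Expr {
        match expr {
            Expr::IsNull(inner) => match self.simplify(*inner) {
                // `col IS NULL` is false when the column cannot be NULL.
                Expr::Col(name) if !self.info.nullable(&name) => Expr::Bool(false),
                // A boolean literal is never NULL.
                Expr::Bool(_) => Expr::Bool(false),
                other => Expr::IsNull(Box::new(other)),
            },
            Expr::And(l, r) => match (self.simplify(*l), self.simplify(*r)) {
                // `true AND x` is x.
                (Expr::Bool(true), other) | (other, Expr::Bool(true)) => other,
                // `false AND x` is false.
                (Expr::Bool(false), _) | (_, Expr::Bool(false)) => Expr::Bool(false),
                (l, r) => Expr::And(Box::new(l), Box::new(r)),
            },
            other => other,
        }
    }
}

/// User-defined info: every column is declared NOT NULL.
struct NoNulls;
impl SimplifyInfo for NoNulls {
    fn nullable(&self, _column: &str) -> bool {
        false
    }
}

/// User-defined info: every column may be NULL.
struct AllNullable;
impl SimplifyInfo for AllNullable {
    fn nullable(&self, _column: &str) -> bool {
        true
    }
}
```

With `NoNulls`, the expression `a IS NULL` simplifies to `false`; with `AllNullable`, the same expression is returned unchanged. Keeping `simplify` on the struct rather than on `Expr` means the expression type does not need to know which info source is in use.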

@github-actions github-actions bot added the core Core DataFusion crate label Oct 7, 2022
@alamb alamb changed the title Change public simplify API and the Public coerce API Change public simplify API and add a public coerce API Oct 7, 2022
/// See the [type coercion module](datafusion_expr::type_coercion)
/// documentation for more details on type coercion
///
// Would be nice if this API could use the SimplifyInfo
Contributor Author

If this PR is approved, I will file a ticket to remove the schema argument in this function call.

@andygrove
Member

I plan on reviewing this one tomorrow @alamb

@andygrove andygrove self-requested a review October 11, 2022 00:53
@alamb
Contributor Author

alamb commented Oct 11, 2022

I plan on reviewing this one tomorrow @alamb

Thank you @andygrove - I will have it ready and I'll also help try and review the outstanding DataFusion PRs as well

@alamb
Contributor Author

alamb commented Oct 11, 2022

This is blocking me from upgrading DataFusion in my project (IOx). I know @andygrove plans to review today. Unless I hear otherwise, I plan to merge this PR tomorrow.

@alamb alamb merged commit 61c38b7 into apache:master Oct 12, 2022
@alamb alamb deleted the alamb/new_simplify_coerce_api branch October 12, 2022 10:54
@alamb
Contributor Author

alamb commented Oct 12, 2022

I am happy to respond to comments / make changes in a follow-on PR

@ursabot

ursabot commented Oct 12, 2022

Benchmark runs are scheduled for baseline = fb39d5d and contender = 61c38b7. 61c38b7 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Successfully merging this pull request may close these issues.

Expose + document the type coercion API publicly
4 participants