Fix and improve `CommonSubexprEliminate` rule #10396

peter-toth · 2024-05-06T19:24:45Z

Which issue does this PR close?

Part of #9873.

Rationale for this change

This PR started as part of #9873 to reduce number of Expr clones but after some investigation it shifted to be a fix for the rule's correctness issues.

The current CommonSubexprEliminate was refactored in fix(9870): common expression elimination optimization, should always re-find the correct expression during re-write. #9871 to remove the IdArray cache and simplify the identifier of expresions. Unfortunately that change doesn't seem to be correct. The source of the issue is that an idenfifier needs to represent an expression subtreee and the newly chosen "stringified expr" as identifier doesn't seem to fulfill that purpose. E.g. an identifier shouldn't belong to 2 different expressions:
```
println!("{}", ExprSet::expr_identifier(&(col("a") + col("b"))));    // a + b
println!("{}", ExprSet::expr_identifier(&(col("a + b"))));           // a + b
```
Sidenote:
Actually I wanted to show that correctness issue of the current CommonSubexprEliminate in a test, but when I wrote a test with colliding column names I run into a different issue, that DataFusion resolves the col("a") + col("b") expression as if it was col("a + b") if an a + b field exists in the schema.
This is a different issue (not related to CommonSubexprEliminate at all) and can be easily reproduced:
```
DataFusion CLI v37.1.0
> select a + b from (select 1 as a, 2 as b, 1 as "a + b");
+-------+
| a + b |
+-------+
| 1     | <- Should be 3
+-------+
1 row(s) fetched.
Elapsed 0.009 seconds.
```
So in this the first commit of PR I revert fix(9870): common expression elimination optimization, should always re-find the correct expression during re-write. #9871.
Then I investigated what is the actual purpose of Identifiers, why don't we use a simple HashMap<Expr, (usize, DataType, Identifier)> as ExprSet? It is clear that we need to generate a unique alias for the extracted common expressions, but why is the key of the map is an Identifier and not &Expr or Expr itself. And actually it turned out that the reasons are already explained in the comments.
```
/// An identifier should (ideally) be able to "hash", "accumulate", "equal" and "have no
/// collision (as low as possible)"
```
If we used Expr as the key of the map computing the hash() of the keys would require traversing on the whole Expr, which can be very costly as Exprs contain indirections to subexpressions as Box<Expr> or Vec<Expr>.

Using special identifiers to represent Expr trees and caching those identifiers by the preorder visit indexes in IdArray should significantly speed up the second top-down traversal that does the actual expression rewrite.

Sidenote: the current long String identifiers are also not a good choice. We need to revisit this in a follow-up PR and choose something like (usize, &Expr) tuple as identifiers. The first element of a tuple is a pre-calculated hash() of an expression tree, that is built-up during the first bottom-up traversal. And the referece to expression is there to implement the eq().
The second commit is a refactor and fix of the algorithm as reverting Stop copying Exprs and LogicalPlans so much during Common Subexpression Elimination #9873 caused the Common expression elimination should also re-find the correct expression, during re-write. #9870 issue to resurface. This is a major refactor but I think the code of ExprIdentifierVisitor and CommonSubexprRewriter became much cleaner.
The 3rd commit eliminates some Expr clones in ExprSets.
The 4th and 5th commit contain only renames and docs fixes. I think ExprStats is a better name for ExprSet as the purpose of that data structure is store the counts. Also, IMO CommonExpr / common_exprs is a better name for affected_id to store the common expressions that got extracted.

What changes are included in this PR?

Please see above.

Are these changes tested?

Yes, with existing UTs.

Are there any user-facing changes?

No.

… always re-find the correct expression during re-write. (apache#9871)" This reverts commit cd7a00b.

…ds, better JumpMark handling, better variable names, code cleanup, some new todos

peter-toth · 2024-05-06T19:47:24Z

cc @alamb, @erratic-pattern and @MohamedAbdeen21 as this PR is related to your recent comments/PRs.

alamb · 2024-05-06T19:55:32Z

Possibly related #10333

peter-toth · 2024-05-06T19:56:38Z

datafusion/optimizer/src/common_subexpr_eliminate.rs

-                match expr_set.get(&id) {
-                    Some((expr, _, _, symbol)) => {
-                        // todo: check `nullable`
-                        agg_exprs.push(expr.clone().alias(symbol.as_str()));


this clone is eliminated in the 3rd commit

peter-toth · 2024-05-06T19:57:48Z

datafusion/optimizer/src/common_subexpr_eliminate.rs

-                // todo: check `nullable`
-                let field = Field::new(&id, data_type.clone(), true);
-                fields_set.insert(field.name().to_owned());
-                project_exprs.push(expr.clone().alias(symbol.as_str()));


this clone is also eliminated

peter-toth · 2024-05-06T19:58:13Z

datafusion/optimizer/src/common_subexpr_eliminate.rs


-        self.expr_set
-            .entry(curr_expr_identifier)
-            .or_insert_with(|| (expr.clone(), 0, data_type, alias_symbol))


this clone is also eliminated

MohamedAbdeen21 · 2024-05-06T20:03:49Z

This seems to be a major conflict with #10333

The source of the issue is that an idenfifier needs to represent an expression subtreee and the newly chosen "stringified expr" as identifier doesn't seem to fulfill that purpose. E.g. an identifier shouldn't belong to 2 different expressions

I agree that an identifier shouldn't belong to 2 different expressions, but why does it have to represent a subtree? The expr IS the subtree itself. If we use an identifier like #{expr} that should be good enough.

And actually it turned out that the reasons are already explained in the comments

AFAIK, we only need the identifier to be unique (no collision) for correctness, I don't see why we require the other traits.

Edit:

Just took a look at the tests, we are basically trying to do the same thing, although your PR is probably more efficient. We only differ on the subtree part I mentioned above. I tried to do #{expr}, you're trying {expr|subtree}; which I think is unnecessary and unreadable. I would even argue that something like #1 is enough.

alamb · 2024-05-06T20:04:08Z

I will review this one carefully tomorrow morning.

alamb · 2024-05-06T20:04:56Z

cc @waynexia and @wiedld for your information

peter-toth · 2024-05-06T20:10:35Z

The source of the issue is that an idenfifier needs to represent an expression subtreee and the newly chosen "stringified expr" as identifier doesn't seem to fulfill that purpose. E.g. an identifier shouldn't belong to 2 different expressions

I agree that an identifier shouldn't belong to 2 different expressions, but why does it have to represent a subtree? The expr IS the subtree itself. If we use an identifier like #{expr} that should be good enough.

In the first traversal we need to count how many times we encountered with an Expr (expression subtree). We either need to use Expr as the key of the map to store the counts or an use an identifier that uniquely identifies an Expr.

Edit:

Just took a look at the tests, we are basically trying to do the same thing, although your PR is probably more efficient. We only differ on the subtree part I mentioned above. I tried to do #{expr}, you're trying {expr|subtree}; which I think is unnecessary and unreadable. I would even argue that something like #1 is enough.

The issue is not about the aliases we assign to the extracted common expressions, it is about the key of the map where we store the counts. Your PR can be a good improvement to the aliases after this PR, but we need need to fix the key of map first.

MohamedAbdeen21 · 2024-05-06T20:29:39Z

Ok yeah, I see what you mean.

But the example you mention with a + b. Doesn't that go away if we fix the side note you mentioned?

col("a + b") should be interpreted as table."a + b" and not test.a + test.b (and vice versa for the SQL example), meaning that expr would never collide in the map, right or am I missing something?

peter-toth · 2024-05-06T20:48:51Z

But the example you mention with a + b. Doesn't that go away if you fix the side note you mentioned?

col("a + b") should be interpreted as table."a + b" and not test.a + test.b, meaning that expr would never collide in the map, right or am I missing something?

There are multiple questions here and I don't have the answers for.

Can we fix the string representation? Do we always have a table reference to prefix the columns with? And actually, is the resolution bug related to the string representation of expressions at all?
And even if the string representation of Expr can be fixed, using a Strings as a key of the map is not a good choice as building a string represatation will very likely require traversing the whole expression. Think of that we need to get an identifier for an expression, and for all its subtrees, and for the subtrees' subtrees... So a good identifier would be something that we can build-up incrementally from the identifiers of the subtrees (in the first traversal's bottom-up phase).

I think the best we can do now is to revert #9871 and return to the old chained string representation. (This PR improves the identifier readability a bit by adding {} around it and | as separator of elements.) And then in a follow-up PR replace the string representation to an alternative identifier like the one I mentioned in the PR description.

MohamedAbdeen21 · 2024-05-06T20:56:58Z

Although I'd like to find answers to these questions before giving more opinions, I don't mind merging this for now.

peter-toth · 2024-05-07T08:34:30Z

datafusion/optimizer/src/common_subexpr_eliminate.rs

-                } else {
-                    Ok(Transformed::no(expr))
-                }
+        let (up_index, expr_id) = &self.id_array[self.down_index];


Reviewers might wonder why the code of CommonSubexprRewriter after the #9871 revert doesn't resemble to the code before that PR: https://github.com/apache/datafusion/pull/9871/files#diff-351499880963d6a383c92e156e75019cd9ce33107724a9635853d7d4cd1898d0L754-L809?
And why the 2 halting conditions (https://github.com/apache/datafusion/pull/9871/files#diff-351499880963d6a383c92e156e75019cd9ce33107724a9635853d7d4cd1898d0L757-L761) are missing from the new code. This is because it turned out that:

the self.curr_index >= self.id_array.len() bound check is poinless as we already indexed the array with self.curr_index: https://github.com/apache/datafusion/pull/9871/files#diff-351499880963d6a383c92e156e75019cd9ce33107724a9635853d7d4cd1898d0L754.
(Note: curr_index is called down_index in the new PR.)

self.max_series_number > *series_number is also pointless (and actually there is no point in storing max_series_number in the rewriter at all) as *series_number (called up_index in the new code, that is basically the postorder index of the node) can never be smaller than the last replaced expression's postorder index (max_series_number). This is because in the rewriter we are doing a preorder traversal and a smaller postorder index could appear only in a replaced expression's subtree or in a subtree of a previous child of the node's ancestors.
(BTW, I verified this with adding a panic! to the halting condition and running all tests.)

alamb

Thank you @peter-toth -- I went over this PR carefully and it looks like a significant improvement to me. In my opinion as long as it doesn't regress performance (I am running benchmarks now) I think we should merge it in as it adds additional test coverage

I also ran our test suite in InfluxDB IOx with this change and all the tests passed .

Doesn't seem to fix reported bug 🤔

I also filed #10413 to track the bug you found (🦅 👁️ ). However, this PR doesn't seem to fix it yet 🤔 . I pushed a test to show this and also tried it manually:

    Finished `dev` profile [unoptimized + debuginfo] target(s) in 35.29s
     Running `target/debug/datafusion-cli`
DataFusion CLI v37.1.0
> select a + b from (select 1 as a, 2 as b, 1 as "a + b");
+-------+
| a + b |
+-------+
| 1     |
+-------+
1 row(s) fetched.
Elapsed 0.018 seconds.

`{}` vs `#{}`

I believe @MohamedAbdeen21 used #{expr} in make common expression alias human-readable #10333 to follow what is done by DuckDB -- perhaps we could do so too in this PR (I also think #{} is slightly easier to notice visually than {})

EDIT -- now I see @peter-toth 's #10396 (comment) and so maybe we should postpone the identifier update until a follow on PR

alamb · 2024-05-07T17:52:52Z

datafusion/optimizer/src/common_subexpr_eliminate.rs

-/// (a `String`) as `Identifier`.
+/// Note that the current implementation contains:
+/// - the `Display` of an expression (a `String`) and
+/// - the identifiers of the childrens of the expression


Do we need to form such complicated identifiers -- what if we simply use #{} like in #10333 as suggested by @MohamedAbdeen21 in #10396 (comment)?

I added a comment here: #10396 (comment) that identifiers as keys of the ExprStats map is a must have but identifiers as aliases of exracted common expressions is not.

alamb · 2024-05-07T17:56:26Z

datafusion/optimizer/src/common_subexpr_eliminate.rs

+/// - The number of occurrences and
+/// - The DataType
+/// of an expression.
+type ExprStats = HashMap<Identifier, (usize, DataType)>;


Nit the code might be easier to follow if this a proper struct and move the manipulation functions (like to_arrays for example) into methods

Maybe as a follow on PR

alamb · 2024-05-07T18:00:13Z

datafusion/optimizer/src/common_subexpr_eliminate.rs

-///
-/// `affected_id` is updated with any sub expressions that were replaced
+fn expr_identifier(expr: &Expr, sub_expr_identifier: Identifier) -> Identifier {
+    format!("{{{expr}{sub_expr_identifier}}}")


see comments about identifiers elsewhere

alamb · 2024-05-07T18:02:12Z

datafusion/optimizer/src/common_subexpr_eliminate.rs

+            // Alias this `Column` expr to it original "expr name",
+            // `projection_push_down` optimizer use "expr name" to eliminate useless
+            // projections.
+            // TODO: do we really need to alias here?


FWIW I removed the alias and several tests failed:

$ cargo test --test sqllogictests Compiling datafusion-optimizer v37.1.0 (/Users/andrewlamb/Software/datafusion2/datafusion/optimizer) warning: unused variable: `expr_name` --> datafusion/optimizer/src/common_subexpr_eliminate.rs:802:17 | 802 | let expr_name = expr.display_name()?; | ^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_expr_name` | = note: `#[warn(unused_variables)]` on by default Compiling datafusion v37.1.0 (/Users/andrewlamb/Software/datafusion2/datafusion/core) warning: `datafusion-optimizer` (lib) generated 1 warning ... Running "map.slt" External error: query failed: DataFusion error: Optimizer rule 'common_sub_expression_eliminate' failed caused by Schema error: No field named "log(unsigned_integers.b)". Valid fields are a, "log({CAST(unsigned_integers.b AS Float32)|{unsigned_integers.b}})", "log(Int64(10),unsigned_integers.b)". [SQL] select log(a, 64) a, log(b), log(10, b) from unsigned_integers; at test_files/scalar.slt:592 External error: query failed: DataFusion error: Optimizer rule 'common_sub_expression_eliminate' failed caused by Schema error: No field named "aggregate_test_100.c2 % Int64(2) = Int64(0)". Valid fields are "{CAST(aggregate_test_100.c2 AS Int64) % Int64(2) = Int64(0)|{Int64(0)}|{CAST(aggregate_test_100.c2 AS Int64) % Int64(2)|{Int64(2)}|{CAST(aggregate_test_100.c2 AS Int64)|{aggregate_test_100.c2}}}}", "FIRST_VALUE(aggregate_test_100.c2) ORDER BY [CAST(aggregate_test_100.c2 AS Int64) % Int64(2) = Int64(0) ASC NULLS LAST, aggregate_test_100.c3 DESC NULLS FIRST]", "FIRST_VALUE(aggregate_test_100.c3 - Int64(100)) ORDER BY [CAST(aggregate_test_100.c2 AS Int64) % Int64(2) = Int64(0) ASC NULLS LAST, aggregate_test_100.c3 DESC NULLS FIRST]". [SQL] SELECT DISTINCT ON (c2 % 2 = 0) c2, c3 - 100 FROM aggregate_test_100 ORDER BY c2 % 2 = 0, c3 DESC; at test_files/distinct_on.slt:116 External error: query failed: DataFusion error: Optimizer rule 'common_sub_expression_eliminate' failed caused by Schema error: No field named "acos(round(Float64(1) / doubles.f64))". Valid fields are doubles.f64, i64_1, "acos({round(Float64(1) / doubles.f64)|{Float64(1) / doubles.f64|{doubles.f64}|{Float64(1)}}})". [SQL] select f64, round(1.0 / f64) as i64_1, acos(round(1.0 / f64)) from doubles; at test_files/expr.slt:2272 External error: query result mismatch: [SQL] explain select a/2, a/2 + 1 from t [Diff] (-expected|+actual) - logical_plan - 01)Projection: {t.a / Int64(2)|{Int64(2)}|{t.a}} AS t.a / Int64(2), {t.a / Int64(2)|{Int64(2)}|{t.a}} AS t.a / Int64(2) + Int64(1) - 02)--Projection: t.a / Int64(2) AS {t.a / Int64(2)|{Int64(2)}|{t.a}} - 03)----TableScan: t projection=[a] at test_files/subquery.slt:1071 External error: query result mismatch: [SQL] EXPLAIN SELECT x/2, x/2+1 FROM t; [Diff] (-expected|+actual) - logical_plan - 01)Projection: {t.x / Int64(2)|{Int64(2)}|{t.x}} AS t.x / Int64(2), {t.x / Int64(2)|{Int64(2)}|{t.x}} AS t.x / Int64(2) + Int64(1) - 02)--Projection: t.x / Int64(2) AS {t.x / Int64(2)|{Int64(2)}|{t.x}} - 03)----TableScan: t projection=[x] - physical_plan - 01)ProjectionExec: expr=[{t.x / Int64(2)|{Int64(2)}|{t.x}}@0 as t.x / Int64(2), {t.x / Int64(2)|{Int64(2)}|{t.x}}@0 + 1 as t.x / Int64(2) + Int64(1)] - 02)--ProjectionExec: expr=[x@0 / 2 as {t.x / Int64(2)|{Int64(2)}|{t.x}}] - 03)----MemoryExec: partitions=1, partition_sizes=[1] at test_files/select.slt:1425 External error: query result mismatch: [SQL] EXPLAIN SELECT c3, SUM(c9) OVER(ORDER BY c3+c4 DESC, c9 DESC, c2 ASC) as sum1, SUM(c9) OVER(ORDER BY c3+c4 ASC, c9 ASC ) as sum2 FROM aggregate_test_100 LIMIT 5 [Diff] (-expected|+actual) logical_plan 01)Projection: aggregate_test_100.c3, SUM(aggregate_test_100.c9) ORDER BY [aggregate_test_100.c3 + aggregate_test_100.c4 DESC NULLS FIRST, aggregate_test_100.c9 DESC NULLS FIRST, aggregate_test_100.c2 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW AS sum1, SUM(aggregate_test_100.c9) ORDER BY [aggregate_test_100.c3 + aggregate_test_100.c4 ASC NULLS LAST, aggregate_test_100.c9 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW AS sum2 02)--Limit: skip=0, fetch=5 - 03)----WindowAggr: windowExpr=[[SUM(aggregate_test_100.c9) ORDER BY [{aggregate_test_100.c3 + aggregate_test_100.c4|{aggregate_test_100.c4}|{aggregate_test_100.c3}} AS aggregate_test_100.c3 + aggregate_test_100.c4 ASC NULLS LAST, aggregate_test_100.c9 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW AS SUM(aggregate_test_100.c9) ORDER BY [aggregate_test_100.c3 + aggregate_test_100.c4 ASC NULLS LAST, aggregate_test_100.c9 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW]] + 03)----WindowAggr: windowExpr=[[SUM(aggregate_test_100.c9) ORDER BY [{aggregate_test_100.c3 + aggregate_test_100.c4|{aggregate_test_100.c4}|{aggregate_test_100.c3}} ASC NULLS LAST, aggregate_test_100.c9 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW AS SUM(aggregate_test_100.c9) ORDER BY [aggregate_test_100.c3 + aggregate_test_100.c4 ASC NULLS LAST, aggregate_test_100.c9 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW]] 04)------Projection: {aggregate_test_100.c3 + aggregate_test_100.c4|{aggregate_test_100.c4}|{aggregate_test_100.c3}}, aggregate_test_100.c3, aggregate_test_100.c9, SUM(aggregate_test_100.c9) ORDER BY [aggregate_test_100.c3 + aggregate_test_100.c4 DESC NULLS FIRST, aggregate_test_100.c9 DESC NULLS FIRST, aggregate_test_100.c2 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW - 05)--------WindowAggr: windowExpr=[[SUM(aggregate_test_100.c9) ORDER BY [{aggregate_test_100.c3 + aggregate_test_100.c4|{aggregate_test_100.c4}|{aggregate_test_100.c3}} AS aggregate_test_100.c3 + aggregate_test_100.c4 DESC NULLS FIRST, aggregate_test_100.c9 DESC NULLS FIRST, aggregate_test_100.c2 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW AS SUM(aggregate_test_100.c9) ORDER BY [aggregate_test_100.c3 + aggregate_test_100.c4 DESC NULLS FIRST, aggregate_test_100.c9 DESC NULLS FIRST, aggregate_test_100.c2 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW]] + 05)--------WindowAggr: windowExpr=[[SUM(aggregate_test_100.c9) ORDER BY [{aggregate_test_100.c3 + aggregate_test_100.c4|{aggregate_test_100.c4}|{aggregate_test_100.c3}} DESC NULLS FIRST, aggregate_test_100.c9 DESC NULLS FIRST, aggregate_test_100.c2 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW AS SUM(aggregate_test_100.c9) ORDER BY [aggregate_test_100.c3 + aggregate_test_100.c4 DESC NULLS FIRST, aggregate_test_100.c9 DESC NULLS FIRST, aggregate_test_100.c2 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW]] 06)----------Projection: aggregate_test_100.c3 + aggregate_test_100.c4 AS {aggregate_test_100.c3 + aggregate_test_100.c4|{aggregate_test_100.c4}|{aggregate_test_100.c3}}, aggregate_test_100.c2, aggregate_test_100.c3, aggregate_test_100.c9 07)------------TableScan: aggregate_test_100 projection=[c2, c3, c4, c9] physical_plan 01)ProjectionExec: expr=[c3@1 as c3, SUM(aggregate_test_100.c9) ORDER BY [aggregate_test_100.c3 + aggregate_test_100.c4 DESC NULLS FIRST, aggregate_test_100.c9 DESC NULLS FIRST, aggregate_test_100.c2 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW@3 as sum1, SUM(aggregate_test_100.c9) ORDER BY [aggregate_test_100.c3 + aggregate_test_100.c4 ASC NULLS LAST, aggregate_test_100.c9 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW@4 as sum2] 02)--GlobalLimitExec: skip=0, fetch=5 03)----WindowAggExec: wdw=[SUM(aggregate_test_100.c9) ORDER BY [aggregate_test_100.c3 + aggregate_test_100.c4 ASC NULLS LAST, aggregate_test_100.c9 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: Ok(Field { name: "SUM(aggregate_test_100.c9) ORDER BY [aggregate_test_100.c3 + aggregate_test_100.c4 ASC NULLS LAST, aggregate_test_100.c9 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW", data_type: UInt64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), frame: WindowFrame { units: Range, start_bound: CurrentRow, end_bound: Following(Int16(NULL)), is_causal: false }] 04)------ProjectionExec: expr=[{aggregate_test_100.c3 + aggregate_test_100.c4|{aggregate_test_100.c4}|{aggregate_test_100.c3}}@0 as {aggregate_test_100.c3 + aggregate_test_100.c4|{aggregate_test_100.c4}|{aggregate_test_100.c3}}, c3@2 as c3, c9@3 as c9, SUM(aggregate_test_100.c9) ORDER BY [aggregate_test_100.c3 + aggregate_test_100.c4 DESC NULLS FIRST, aggregate_test_100.c9 DESC NULLS FIRST, aggregate_test_100.c2 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW@4 as SUM(aggregate_test_100.c9) ORDER BY [aggregate_test_100.c3 + aggregate_test_100.c4 DESC NULLS FIRST, aggregate_test_100.c9 DESC NULLS FIRST, aggregate_test_100.c2 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW] 05)--------BoundedWindowAggExec: wdw=[SUM(aggregate_test_100.c9) ORDER BY [aggregate_test_100.c3 + aggregate_test_100.c4 DESC NULLS FIRST, aggregate_test_100.c9 DESC NULLS FIRST, aggregate_test_100.c2 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: Ok(Field { name: "SUM(aggregate_test_100.c9) ORDER BY [aggregate_test_100.c3 + aggregate_test_100.c4 DESC NULLS FIRST, aggregate_test_100.c9 DESC NULLS FIRST, aggregate_test_100.c2 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW", data_type: UInt64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), frame: WindowFrame { units: Range, start_bound: Preceding(Int16(NULL)), end_bound: CurrentRow, is_causal: false }], mode=[Sorted] 06)----------SortPreservingMergeExec: [{aggregate_test_100.c3 + aggregate_test_100.c4|{aggregate_test_100.c4}|{aggregate_test_100.c3}}@0 DESC,c9@3 DESC,c2@1 ASC NULLS LAST] 07)------------SortExec: expr=[{aggregate_test_100.c3 + aggregate_test_100.c4|{aggregate_test_100.c4}|{aggregate_test_100.c3}}@0 DESC,c9@3 DESC,c2@1 ASC NULLS LAST], preserve_partitioning=[true] 08)--------------ProjectionExec: expr=[c3@1 + c4@2 as {aggregate_test_100.c3 + aggregate_test_100.c4|{aggregate_test_100.c4}|{aggregate_test_100.c3}}, c2@0 as c2, c3@1 as c3, c9@3 as c9] 09)----------------RepartitionExec: partitioning=RoundRobinBatch(2), input_partitions=1 10)------------------CsvExec: file_groups={1 group: [[WORKSPACE_ROOT/testing/data/csv/aggregate_test_100.csv]]}, projection=[c2, c3, c4, c9], has_header=true at test_files/window.slt:1680 External error: query failed: DataFusion error: Optimizer rule 'common_sub_expression_eliminate' failed caused by Schema error: No field named "hits.ClientIP - Int64(1)". Valid fields are hits."ClientIP", "{CAST(hits.ClientIP AS Int64)|{hits.ClientIP}} - Int64(1)", "{CAST(hits.ClientIP AS Int64)|{hits.ClientIP}} - Int64(2)", "{CAST(hits.ClientIP AS Int64)|{hits.ClientIP}} - Int64(3)", "COUNT(*)". [SQL] SELECT "ClientIP", "ClientIP" - 1, "ClientIP" - 2, "ClientIP" - 3, COUNT(*) AS c FROM hits GROUP BY "ClientIP", "ClientIP" - 1, "ClientIP" - 2, "ClientIP" - 3 ORDER BY c DESC LIMIT 10; at test_files/clickbench.slt:241 External error: query failed: DataFusion error: Optimizer rule 'common_sub_expression_eliminate' failed caused by Schema error: No field named "value_dict.x_dict % Int64(2)". Valid fields are "{CAST(value_dict.x_dict AS Int64)|{value_dict.x_dict}} % Int64(2)", "SUM(value_dict.x_dict)". [SQL] select sum(x_dict) from value_dict group by x_dict % 2 order by sum(x_dict); at test_files/aggregate.slt:2696 External error: query result mismatch: [SQL] EXPLAIN SELECT SUM(DISTINCT CAST(x AS DOUBLE)), MAX(DISTINCT CAST(x AS DOUBLE)) FROM t1 GROUP BY y; [Diff] (-expected|+actual) logical_plan - 01)Projection: SUM(alias1) AS SUM(DISTINCT t1.x), MAX(alias1) AS MAX(DISTINCT t1.x) - 02)--Aggregate: groupBy=[[t1.y]], aggr=[[SUM(alias1), MAX(alias1)]] - 03)----Aggregate: groupBy=[[t1.y, {CAST(t1.x AS Float64)|{t1.x}} AS t1.x AS alias1]], aggr=[[]] - 04)------Projection: CAST(t1.x AS Float64) AS {CAST(t1.x AS Float64)|{t1.x}}, t1.y - 05)--------TableScan: t1 projection=[x, y] + 01)Projection: SUM(DISTINCT t1.x), MAX(DISTINCT t1.x) + 02)--Aggregate: groupBy=[[t1.y]], aggr=[[SUM(DISTINCT {CAST(t1.x AS Float64)|{t1.x}}) AS SUM(DISTINCT t1.x), MAX(DISTINCT {CAST(t1.x AS Float64)|{t1.x}}) AS MAX(DISTINCT t1.x)]] + 03)----Projection: CAST(t1.x AS Float64) AS {CAST(t1.x AS Float64)|{t1.x}}, t1.y + 04)------TableScan: t1 projection=[x, y] physical_plan - 01)ProjectionExec: expr=[SUM(alias1)@1 as SUM(DISTINCT t1.x), MAX(alias1)@2 as MAX(DISTINCT t1.x)] - 02)--AggregateExec: mode=FinalPartitioned, gby=[y@0 as y], aggr=[SUM(alias1), MAX(alias1)] + 01)ProjectionExec: expr=[SUM(DISTINCT t1.x)@1 as SUM(DISTINCT t1.x), MAX(DISTINCT t1.x)@2 as MAX(DISTINCT t1.x)] + 02)--AggregateExec: mode=FinalPartitioned, gby=[y@0 as y], aggr=[SUM(DISTINCT t1.x), MAX(DISTINCT t1.x)] 03)----CoalesceBatchesExec: target_batch_size=2 04)------RepartitionExec: partitioning=Hash([y@0], 8), input_partitions=8 - 05)--------AggregateExec: mode=Partial, gby=[y@0 as y], aggr=[SUM(alias1), MAX(alias1)] - 06)----------AggregateExec: mode=FinalPartitioned, gby=[y@0 as y, alias1@1 as alias1], aggr=[] - 07)------------CoalesceBatchesExec: target_batch_size=2 - 08)--------------RepartitionExec: partitioning=Hash([y@0, alias1@1], 8), input_partitions=8 - 09)----------------RepartitionExec: partitioning=RoundRobinBatch(8), input_partitions=1 - 10)------------------AggregateExec: mode=Partial, gby=[y@1 as y, {CAST(t1.x AS Float64)|{t1.x}}@0 as alias1], aggr=[] - 11)--------------------ProjectionExec: expr=[CAST(x@0 AS Float64) as {CAST(t1.x AS Float64)|{t1.x}}, y@1 as y] - 12)----------------------MemoryExec: partitions=1, partition_sizes=[1] + 05)--------RepartitionExec: partitioning=RoundRobinBatch(8), input_partitions=1 + 06)----------AggregateExec: mode=Partial, gby=[y@1 as y], aggr=[SUM(DISTINCT t1.x), MAX(DISTINCT t1.x)] + 07)------------ProjectionExec: expr=[CAST(x@0 AS Float64) as {CAST(t1.x AS Float64)|{t1.x}}, y@1 as y] + 08)--------------MemoryExec: partitions=1, partition_sizes=[1] at test_files/group_by.slt:4184 Error: Execution("9 failures") error: test failed, to rerun pass `-p datafusion-sqllogictest --test sqllogictests` Caused by: process didn't exit successfully: `/Users/andrewlamb/Software/datafusion2/target/debug/deps/sqllogictests-ce3a36cfeab74789` (exit status: 1)

Yes, I see that is needed but it looks weird to me. I mean if we consider the example:

select a + b as x, a + b as y from ...

that should be eliminated to:

select common as x, common as y from ( select a + b as common from ... )

I.e. we need to add aliases the extracted common expressions, but with this alias here we alias common and this is what the rule does:

select common as "a + b" as x, common as "a + b" as y from ( select a + b as common from ... )

I wanted to understand why exactly we need to alias at both places.

BTW I left a few other TODOs in the code but none of them means that code should be worse than it was before #9871. Maybe the longer identifiers due to the extra {, } and | can make it a bit slower. I left the TODOs there to myself and others to remember where is some room for improvement.

I have a stashed PR that addresses this issue.

I'm not 100% sure about why the second alias is needed, but there was a comment saying that the second alias is used by another rule, so removing it inside CSE seems to break other rules (or CSE itself sometimes (??) idk, I'm still looking into it).

Anyway, the solution I have at the moment is calling .unalias() on the expr when applying a new alias through expr.alias(). That way, when the true alias x is later (I assume after all the rules finished) given to the expr common as a + b, it will become common as x.

It passes the tests, but I really want to understand why it works before pushing the PR + I hope maybe I can find a better way by removing the second alias added by CSE all together, even if it means changing other rules.

alamb · 2024-05-07T18:02:36Z

datafusion/optimizer/src/common_subexpr_eliminate.rs

@@ -789,6 +861,74 @@ mod test {
        assert_eq!(expected, formatted_plan);
    }

+    #[test]
+    fn id_array_visitor() -> Result<()> {


This is really nice to see more of how this code works 👍

alamb · 2024-05-07T18:06:27Z

datafusion/sqllogictest/test_files/select.slt

@@ -1613,6 +1613,14 @@ select count(1) from v;
 ----
 1

+# Ensure CSE resolves columns correctly


I added this test but it still resolves the expr as 1 (not 3) 🤔

peter-toth · 2024-05-07T18:37:19Z

I also filed #10413 to track the bug you found (🦅 👁️ ). However, this PR doesn't seem to fix it yet 🤔 . I pushed a test to show this and also tried it manually:

No, this PR doesn't fix that issue at all. That issue is a resolution issue (#10413) and has nothing to do with CSE. The example I gave in the description doesn't contain any subexpressions to eliminate and CommonSubexprEliminate has no effect on the query plan.

The reason I mentioned the resolution issue is because of that issue I couldn't add a test case to this PR which would illustrate the issue of colflicting identifiers in CommonSubexprEliminate after #9871.

Once #10413 is solved I can add a test case here.

I believe @MohamedAbdeen21 used #{expr} in #10333 to follow what is done by DuckDB -- perhaps we could do so too in this PR (I also think #{} is slightly easier to notice visually than {})

I fully aggree that the current alias is very hard to read and this is because the identifiers are used for aliases as well.
But there are 2 different things here:

The identifier must be unique for each unique expression when it is used as the key of the ExprStats map that stores the counts.
The aliases we gave to extracted common expressions shouldn't collide.

Currently for both 1. and 2. we use the identifier and I'm sure that in 1. we have touse the identifier. In 2. I'm not sure and @MohamedAbdeen21's PR can be a good follow-up improvement.

alamb · 2024-05-07T18:42:31Z

No, this PR doesn't fix that issue at all. That issue is a resolution issue (#10413) and has nothing to do with CSE. The example I gave in the description doesn't contain any subexpressions to eliminate and CommonSubexprEliminate has no effect on the query plan.

Sorry -- I missed that -- updated #10413 to match

alamb

I think this PR is an improvement:

My performance benchmarks show it appears to be slightly slower than main

logical_plan_tpcds_all                        1.00    157.7±1.80ms        ? ?/sec    1.03    162.6±2.85ms        ? ?/sec
logical_plan_tpch_all                         1.00     16.8±0.19ms        ? ?/sec    1.02     17.2±0.30ms        ? ?/sec

However I expect we'll make this up on subsequent PRs as we fix this rule to avoid the copies

Details

++ critcmp main refactor-commonsubexpreliminate
group                                         main                                   refactor-commonsubexpreliminate
-----                                         ----                                   -------------------------------
logical_aggregate_with_join                   1.00  1198.0±10.58µs        ? ?/sec    1.03  1231.8±65.81µs        ? ?/sec
logical_plan_tpcds_all                        1.00    157.7±1.80ms        ? ?/sec    1.03    162.6±2.85ms        ? ?/sec
logical_plan_tpch_all                         1.00     16.8±0.19ms        ? ?/sec    1.02     17.2±0.30ms        ? ?/sec
logical_select_all_from_1000                  1.05     18.8±0.10ms        ? ?/sec    1.00     18.0±0.10ms        ? ?/sec
logical_select_one_from_700                   1.01    826.5±7.83µs        ? ?/sec    1.00   817.5±21.62µs        ? ?/sec
logical_trivial_join_high_numbered_columns    1.00   763.6±11.20µs        ? ?/sec    1.00   761.5±10.77µs        ? ?/sec
logical_trivial_join_low_numbered_columns     1.00    745.5±8.27µs        ? ?/sec    1.00    745.3±7.96µs        ? ?/sec
physical_plan_tpcds_all                       1.00  1351.3±16.05ms        ? ?/sec    1.00  1345.2±13.72ms        ? ?/sec
physical_plan_tpch_all                        1.00     91.3±1.75ms        ? ?/sec    1.04     94.5±1.87ms        ? ?/sec
physical_plan_tpch_q1                         1.00      5.0±0.06ms        ? ?/sec    1.05      5.3±0.08ms        ? ?/sec
physical_plan_tpch_q10                        1.02      4.5±0.06ms        ? ?/sec    1.00      4.4±0.10ms        ? ?/sec
physical_plan_tpch_q11                        1.03      4.0±0.08ms        ? ?/sec    1.00      3.9±0.09ms        ? ?/sec
physical_plan_tpch_q12                        1.07      3.2±0.05ms        ? ?/sec    1.00      3.0±0.08ms        ? ?/sec
physical_plan_tpch_q13                        1.00      2.2±0.04ms        ? ?/sec    1.00      2.2±0.05ms        ? ?/sec
physical_plan_tpch_q14                        1.04      2.8±0.05ms        ? ?/sec    1.00      2.7±0.05ms        ? ?/sec
physical_plan_tpch_q16                        1.00      3.8±0.05ms        ? ?/sec    1.03      3.9±0.05ms        ? ?/sec
physical_plan_tpch_q17                        1.00      3.6±0.12ms        ? ?/sec    1.01      3.6±0.05ms        ? ?/sec
physical_plan_tpch_q18                        1.00      3.9±0.07ms        ? ?/sec    1.04      4.0±0.07ms        ? ?/sec
physical_plan_tpch_q19                        1.01      6.2±0.09ms        ? ?/sec    1.00      6.2±0.09ms        ? ?/sec
physical_plan_tpch_q2                         1.00      7.8±0.09ms        ? ?/sec    1.01      7.8±0.09ms        ? ?/sec
physical_plan_tpch_q20                        1.00      4.6±0.09ms        ? ?/sec    1.06      4.8±0.08ms        ? ?/sec
physical_plan_tpch_q21                        1.00      6.2±0.11ms        ? ?/sec    1.00      6.2±0.12ms        ? ?/sec
physical_plan_tpch_q22                        1.01      3.4±0.08ms        ? ?/sec    1.00      3.3±0.08ms        ? ?/sec
physical_plan_tpch_q3                         1.00      3.1±0.05ms        ? ?/sec    1.06      3.3±0.08ms        ? ?/sec
physical_plan_tpch_q4                         1.00      2.3±0.03ms        ? ?/sec    1.11      2.5±0.04ms        ? ?/sec
physical_plan_tpch_q5                         1.00      4.7±0.15ms        ? ?/sec    1.00      4.7±0.04ms        ? ?/sec
physical_plan_tpch_q6                         1.03  1629.0±90.38µs        ? ?/sec    1.00  1578.4±32.11µs        ? ?/sec
physical_plan_tpch_q7                         1.00      5.9±0.08ms        ? ?/sec    1.00      5.9±0.06ms        ? ?/sec
physical_plan_tpch_q8                         1.00      7.5±0.09ms        ? ?/sec    1.00      7.5±0.08ms        ? ?/sec
physical_plan_tpch_q9                         1.02      5.7±0.08ms        ? ?/sec    1.00      5.6±0.06ms        ? ?/sec
physical_select_all_from_1000                 1.04     61.4±0.28ms        ? ?/sec    1.00     59.1±0.37ms        ? ?/sec
physical_select_one_from_700                  1.07      3.8±0.03ms        ? ?/sec    1.00      3.6±0.04ms        ? ?/sec

peter-toth · 2024-05-07T18:56:30Z

Thanks for the benchmarks @alamb! Maybe the longer identifiers can explain that gap.

peter-toth · 2024-05-07T19:09:06Z

@alamb, IMO if this PR can be merged then the next steps should be:

Fix Incorrect results with expression resolution #10413 as that is correctness bug.
Continue Rewrite CommonSubexprEliminate to avoid copies using TreeNode #10067 efforts to refactor the CommonSubexprEliminate rule to rewrite as that could improve a lot on Stop copying Exprs and LogicalPlans so much during Common Subexpression Elimination #9873.
Rebase @MohamedAbdeen21's make common expression alias human-readable #10333 on the top of this PR as probably we don't need to use the current string identifiers in aliases and we could improve readablity.
Revisit the identifiers as using these string identifiers as the keys of ExprStats was not the best choice. (Please note that this was not my choice but this is how CSE has been working since the feature was added initially.) See my comment on this in the PR description.

I'm happy to take 4 as I already worked on it a bit, but unfortunately I have very little time to work on this project lately so I can't take 1. and 2.

MohamedAbdeen21 · 2024-05-07T19:14:57Z

I'll rebase my PR this weekend.

I do have other changes in mind regarding plan readability. If 1 is still available by the time I'm done, I'll be happy to take a look at it.

alamb · 2024-05-08T17:21:04Z

@alamb, IMO if this PR can be merged then the next steps should be:

Fix Incorrect results with expression resolution #10413 as that is correctness bug.

Agree -- this is now tracked as its own issue and we can deal with it separately

Continue Rewrite CommonSubexprEliminate to avoid copies using TreeNode #10067 efforts to refactor the CommonSubexprEliminate rule to rewrite as that could improve a lot on Stop copying Exprs and LogicalPlans so much during Common Subexpression Elimination #9873.

I will do this

Rebase @MohamedAbdeen21's make common expression alias human-readable #10333 on the top of this PR as probably we don't need to use the current string identifiers in aliases and we could improve readablity.

Sounds like @MohamedAbdeen21 is going to do this maybe this weekend

Revisit the identifiers as using these string identifiers as the keys of ExprStats was not the best choice. (Please note that this was not my choice but this is how CSE has been working since the feature was added initially.) See my comment on this in the PR description.

👍

I'm happy to take 4 as I already worked on it a bit, but unfortunately I have very little time to work on this project lately so I can't take 1. and 2.

That would be amazing -- thank you @peter-toth -- I filed #10426 to track

alamb · 2024-05-08T17:21:23Z

All right, I think we have our next steps outlined and tracked with tickets. 🚀 !

alamb · 2024-05-08T17:21:32Z

Thanks again @peter-toth and @MohamedAbdeen21

peter-toth · 2024-05-08T17:26:48Z

Thanks for the review!

* Revert "fix(9870): common expression elimination optimization, should always re-find the correct expression during re-write. (apache#9871)" This reverts commit cd7a00b. * expr id should always contain the full expr structure, cleaner expr ids, better JumpMark handling, better variable names, code cleanup, some new todos * move `Expr` from `expr_set`s to `affected_id`s * better naming, docs fixes * introduce `CommonExprs` type alias, minor todo fix * add test --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

peter-toth added 4 commits May 6, 2024 10:07

Revert "fix(9870): common expression elimination optimization, should…

9dd9a2f

… always re-find the correct expression during re-write. (apache#9871)" This reverts commit cd7a00b.

expr id should always contain the full expr structure, cleaner expr i…

3303096

…ds, better JumpMark handling, better variable names, code cleanup, some new todos

move Expr from expr_sets to affected_ids

b84835c

better naming, docs fixes

aad9768

github-actions bot added logical-expr Logical plan and expressions optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt) labels May 6, 2024

peter-toth mentioned this pull request May 6, 2024

Stop copying Exprs and LogicalPlans so much during Common Subexpression Elimination #9873

Closed

alamb mentioned this pull request May 6, 2024

DataFusion weekly project plan (Andrew Lamb) - May 6, 2024 #10395

Closed

7 tasks

peter-toth commented May 6, 2024

View reviewed changes

peter-toth mentioned this pull request May 6, 2024

make common expression alias human-readable #10333

Merged

introduce CommonExprs type alias, minor todo fix

ad1b6c1

peter-toth commented May 7, 2024

View reviewed changes

alamb mentioned this pull request May 7, 2024

Incorrect results with expression resolution #10413

Closed

alamb reviewed May 7, 2024

View reviewed changes

add test

688c38d

alamb reviewed May 7, 2024

View reviewed changes

alamb approved these changes May 7, 2024

View reviewed changes

alamb mentioned this pull request May 8, 2024

Make CommonSubexprEliminate faster by stop copying so many strings #10426

Closed

alamb merged commit 0d283a4 into apache:main May 8, 2024
25 checks passed

alamb mentioned this pull request May 13, 2024

DataFusion weekly project plan (Andrew Lamb) - May 13, 2024 #10482

Closed

8 tasks

peter-toth mentioned this pull request May 17, 2024

Improve CommonSubexprEliminate identifier management (10% faster planning) #10473

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix and improve `CommonSubexprEliminate` rule #10396

Fix and improve `CommonSubexprEliminate` rule #10396

peter-toth commented May 6, 2024 •

edited by alamb

Loading

peter-toth commented May 6, 2024

alamb commented May 6, 2024

peter-toth May 6, 2024

peter-toth May 6, 2024

peter-toth May 6, 2024

MohamedAbdeen21 commented May 6, 2024 •

edited

Loading

alamb commented May 6, 2024

alamb commented May 6, 2024

peter-toth commented May 6, 2024 •

edited

Loading

MohamedAbdeen21 commented May 6, 2024 •

edited

Loading

peter-toth commented May 6, 2024 •

edited

Loading

MohamedAbdeen21 commented May 6, 2024

peter-toth May 7, 2024 •

edited

Loading

alamb left a comment •

edited

Loading

alamb May 7, 2024

peter-toth May 7, 2024

alamb May 7, 2024

alamb May 7, 2024

alamb May 7, 2024

peter-toth May 7, 2024 •

edited

Loading

MohamedAbdeen21 May 7, 2024 •

edited

Loading

alamb May 7, 2024

alamb May 7, 2024

peter-toth commented May 7, 2024

alamb commented May 7, 2024

alamb left a comment

peter-toth commented May 7, 2024

peter-toth commented May 7, 2024 •

edited

Loading

MohamedAbdeen21 commented May 7, 2024

alamb commented May 8, 2024

alamb commented May 8, 2024

alamb commented May 8, 2024

peter-toth commented May 8, 2024

Fix and improve CommonSubexprEliminate rule #10396

Fix and improve CommonSubexprEliminate rule #10396

Conversation

peter-toth commented May 6, 2024 • edited by alamb Loading

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

peter-toth commented May 6, 2024

alamb commented May 6, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MohamedAbdeen21 commented May 6, 2024 • edited Loading

alamb commented May 6, 2024

alamb commented May 6, 2024

peter-toth commented May 6, 2024 • edited Loading

MohamedAbdeen21 commented May 6, 2024 • edited Loading

peter-toth commented May 6, 2024 • edited Loading

MohamedAbdeen21 commented May 6, 2024

peter-toth May 7, 2024 • edited Loading

Choose a reason for hiding this comment

alamb left a comment • edited Loading

Choose a reason for hiding this comment

Doesn't seem to fix reported bug 🤔

{} vs #{}

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

peter-toth May 7, 2024 • edited Loading

Choose a reason for hiding this comment

MohamedAbdeen21 May 7, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

peter-toth commented May 7, 2024

alamb commented May 7, 2024

alamb left a comment

Choose a reason for hiding this comment

peter-toth commented May 7, 2024

peter-toth commented May 7, 2024 • edited Loading

MohamedAbdeen21 commented May 7, 2024

alamb commented May 8, 2024

alamb commented May 8, 2024

alamb commented May 8, 2024

peter-toth commented May 8, 2024

Fix and improve `CommonSubexprEliminate` rule #10396

Fix and improve `CommonSubexprEliminate` rule #10396

peter-toth commented May 6, 2024 •

edited by alamb

Loading

MohamedAbdeen21 commented May 6, 2024 •

edited

Loading

peter-toth commented May 6, 2024 •

edited

Loading

MohamedAbdeen21 commented May 6, 2024 •

edited

Loading

peter-toth commented May 6, 2024 •

edited

Loading

peter-toth May 7, 2024 •

edited

Loading

alamb left a comment •

edited

Loading

`{}` vs `#{}`

peter-toth May 7, 2024 •

edited

Loading

MohamedAbdeen21 May 7, 2024 •

edited

Loading

peter-toth commented May 7, 2024 •

edited

Loading