-
Notifications
You must be signed in to change notification settings - Fork 468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
transform: plumb through OptimizerFeatures
#25628
transform: plumb through OptimizerFeatures
#25628
Conversation
MitigationsCompleting required mitigations increases Resilience Coverage.
Risk Summary:The pull request poses a high risk, with a score of 81, potentially due to the average age of the files involved, the cognitive complexity within these files, and the changes in executable lines of code. Notably, historical data indicates that pull requests with these characteristics are 107% more likely to introduce bugs compared to the repository's baseline. Additionally, the pull request modifies one file that has recently seen a high number of bug fixes, which could further contribute to its risk level. While the repository's observed bug trend is decreasing, this pull request's risk factors warrant careful review. Note: The risk score is not based on semantic analysis but on historical predictors of bug occurrence in the repository. The attributes above were deemed the strongest predictors based on that history. Predictors and the score may change as the PR evolves in code, time, and review activity. Bug Hotspots:
|
1685183
to
625420b
Compare
5f623b4
to
0c91c00
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote a few comments, but still reviewing.
@@ -398,26 +459,23 @@ pub struct Optimizer { | |||
impl Optimizer { | |||
/// Builds a logical optimizer that only performs logical transformations. | |||
#[deprecated = "Create an Optimize instance and call `optimize` instead."] | |||
pub fn logical_optimizer(ctx: &crate::typecheck::SharedContext) -> Self { | |||
// TODO: pass TransformCtx instead | |||
pub fn logical_optimizer(ctx: &mut TransformCtx) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO is actually resolved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, will remove this.
/// Transforms can use this field to communicate information outside the result plans. | ||
pub dataflow_metainfo: &'a mut DataflowMetainfo, | ||
pub df_meta: &'a mut DataflowMetainfo, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please update the doc comment in filter.rs
? Or you could maybe even just delete the TransformCtx
part of that comment, because it probably has limited usefulness but always annoyingly goes out of sync.
And the same in predicate_pushdown.rs
. (typecheck_ctx
is missing.)
/// stage. | ||
/// | ||
/// Used to call [`dataflow::optimize_dataflow`]. | ||
pub fn global( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be missing something, but why not just put this code inside optimize_dataflow
(and pass these same arguments to optimize_dataflow
instead)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you look at the sequence of commits there is an intermediate stage where actually things are as suggested by you, but then I am pulling the TransformCtx::global(...)
call outside of optimize_dataflow
into the Optimize
implementations in `mz_adapter.
Mostly this is done for consistency between the local and the lobal pass, and because I don't want us to touch the signature of optimize_dataflow
every time when the TransformCtx
contents change—the Optimize
implementations in mz_adapter
have everything needed to construct a TransformCtx
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
/// Used to call [`Optimizer::optimize`] on a | ||
/// [`Optimizer::logical_optimizer`] in order to transform a stand-alone | ||
/// [`MirRelationExpr`]. | ||
pub fn local( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here: Why not put this into optimize_mir_local
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same reason as above. Assume that in the next three months we want to touch the definition of TransformCtx
three times—I think it's less effort to only modify the five call-sites in src/adapter/src/optimize
or think about making the conversion part of some generic trait interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think about it like that:
- The
Optimzie
implementations insrc/adapter/src/optimize
are a high-level interface that abstracts over the end-to-end entire optimization pipeline for our various statement types. They each own a configuration as aconfig: OptimizerConfig
field which is passed fully formed into theOptimizer::new(...)
constructor calls. - The various parts of the pipeline sit behind a low-level API which
Optimzie
calls. Each primitive works by either calling a function or instantiating an object which contains a primitive-specific optimization. Examples are:HirRelationExpr::lower(...)
which is parameterized by amz_sql::plan::lowering::Config
.Optimizer::<optimizer_type_ctor>(...)
which is parameterized by aTransformCtx
.Optimizer::optimize(...)
which is parameterized by aTransformCtx
.Plan::lower_dataflow(...) which is parameterized by an
mz_compute_types::plan::lowering::Context`.
The patterns at the moment are not quite as uniform as I would like them to be, but the gist is that each API primitive wants to have it's own context type as a parameter, and the caller should be responsible in forming this paramter from it's own context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks for the reviews. @teskje / @ggevay I'm certainly open to suggestions how to make this more ergonomic and further reduce the amount of boilerplate code for context forming. I went for regularity at the expense of some code duplication here, hoping that such ideas might emerge naturally in the next 1-2 months. |
OptimizerFeatures
plumbingOptimizerFeatures
plumbing into mz_transform
OptimizerFeatures
plumbing into mz_transform
OptimizerFeatures
- Call `TransformCtx::global(...)` in `optimize_dataflow`. - Pass the result as a `optimize_dataflow_relations` parameter.
- Make the `Optimize` implementations in `mz_adapter` responsible for constructing a `TransformCtx` instance. - Pass that instance as a parameter to the `optimize_dataflow` function in `mz_transform`.
Pull the `TransformCtx::local(...)` call one level up from `Optimizer::logical_optimizer` to the caller (`optimize_mir_local`).
- Make the `Optimize` implementations in `mz_adapter` responsible for constructing a `TransformCtx` instance. - Pass that instance as a parameter to the `Optimizer::optimize` method in `mz_transform`.
Share the same `DataflowMetainfo` between the `TransformCtx::local(...)` and the `TransformCtx::global(...)` calls.
0c91c00
to
d33bb48
Compare
Make
OptimizerConfig
available in all parts of the MIR optimizations defined inmz_transform
(local and global).Motivation
Part of MaterializeInc/database-issues#7541.
Tips for reviewer
To follow what is happening it's best to review this one commit at a time.
Checklist
$T ⇔ Proto$T
mapping (possibly in a backwards-incompatible way), then it is tagged with aT-proto
label.