Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: PipelineMetrics common scheme, no-op version #760

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

steph-rs
Copy link
Contributor

Summary of Metrics Abstraction

In this PR, I've introduced a PipelineMetrics struct that contains trait objects for metrics (e.g., Arc<dyn DerivationPipelineMetrics + Send + Sync>). This allows us to pass a single metrics object through the pipeline without adding generics for each metric trait, simplifying type signatures and reducing complexity.

I implemented DerivationPipelineMetrics for PipelineMetrics by delegating the methods to the inner derivation_pipeline_metrics field. We also have a PipelineMetrics::no_op() method that provides standard no-op implementations.

Question

Could you please review this abstraction scheme? I initially tried using generics for each metric, but this made the code too complicated. What do you think of this model? Any suggestions for improvement are welcome.

@steph-rs
Copy link
Contributor Author

subtask from #721

Copy link

codecov bot commented Oct 30, 2024

Codecov Report

Attention: Patch coverage is 95.31250% with 3 lines in your changes missing coverage. Please review.

Project coverage is 91.8%. Comparing base (13d7e82) to head (04f79ba).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
crates/derive/src/metrics/mod.rs 75.0% 3 Missing ⚠️
Additional details and impacted files

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@steph-rs steph-rs marked this pull request as ready for review October 30, 2024 17:28

/// Composite metrics struct containing metrics for all stages.
pub struct PipelineMetrics {
pub(crate) derivation_pipeline_metrics: Arc<dyn DerivationPipelineMetrics + Send + Sync>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove the Arc here - it can be placed around the PipelineMetrics.

If we do this, we can then make a PipelineMetrics::NOOP const. We should use a generic here bound to DerivationPipelineMetrics + Send + Sync.

e.g.

impl<M> PipelineMetrics<M> where M: DerivationPipelineMetrics + Send + Sync {
    /// A no-op implementation for derivation `PipelineMetrics`.
    pub const NOOP: Self<M> = Self { 
        inner: NoopDerivationPipelineMetrics,
    };
}

Copy link
Contributor Author

@steph-rs steph-rs Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case we have to use generics for each stage like

impl<M, AM, ...> PipelineMetrics<M, AM, ...>
where
M: DerivationPipelineMetrics + Send + Sync,
AM: AttributesQueueMetrics + Send + Sync,
...

I was trying to avoid this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love to do that btw, but how should other metrics be implemented?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you're saying... Let me think about this a little bit and get back to you. What you're currently proposing is to place the various stage metrics inside PipelineMetrics and then each stage holds a PipelineMetrics which implements any metrics trait required through the internal field for that stage? Is that right?

e.g.

struct AttributesQueue {
    metrics: PipelineMetrics,
    ...
}
PipelineMetrics {
    attributes_queue_metrics: Arc<dyn AttributesQueueMetrics + Send + Sync>,
    ...
}

impl AttributesQueueMetrics for PipelineMetrics {
   fn some_metrics_method() {
      self.attributes_queue_metrics.some_metrics_method();
   }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, exactly
but I think there's a typo in first block:

pub struct AttributesQueue<P, AB, M>
where
    M: AttributesQueueMetrics + Send + Sync,
{
    ...
    metrics: M,
}

and then we could use self.metrics.some_metrics_method()

I'm not sure about this idea, but it can help us easily add other versions as no-op, or prometheus. I thought this was a problem we wanted to solve.

Does it make sense?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I thought you're intention of placing the various metric types inside PipelineMetrics was to be able to clone PipelineMetrics (doing Arc::clone for each internal Arc<dyn ..> field) and then passing it as a concrete type to each stage. This way, you could just avoid having another generic on each stage, and instead just import the trait that is implemented on the PipelineMetrics type.

This is why in my first block above I specific PipelineMetrics explicitly. And yes like you said - the problem to solve is having different metrics. since PipelineMetrics itself has dynamic fields, you can swap in a prometheus, or no-op, or whatever individual metrics easily, without any api changes.

Let me know if this makes sense. Happy to open a draft PR into this branch to show you a diff of what I'm getting at here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @refcell!
Could you please check the fixes I just pushed and the way I work with AttributesQueueMetrics. If it's okay with you, I'll add other no-op metrics.
But since I don't really see the whole picture of what we actually need in metrics, I will ask you to define the exact methods we want. It will be much appreciated.

@steph-rs
Copy link
Contributor Author

steph-rs commented Nov 7, 2024

Hi @refcell, it's been a long time for this PR...
I just pushed all the noop metrics for each stage, please check it out.
The use of the metrics still needs some work. Please don't be afraid: all metrics are record_*. I did it on purpose so that we can change it and don't forget anything.
Also, I'm not quite sure about the files (I created a few in src/traits for each stage).
Please let me know what you think about all this.

@steph-rs steph-rs requested a review from refcell November 7, 2024 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants