refactor(metrics): separate metrics recorder impl from the server #2561

Merged
merged 10 commits into from
Oct 19, 2024

kariy
Member

@kariy kariy commented Oct 19, 2024

Summary by CodeRabbit

  • New Features

    • Introduced a new Prometheus metrics exporter with enhanced metrics handling.
    • Added a Server struct to serve metrics and collect process metrics.
    • New functions for collecting and describing memory statistics based on the "jemalloc" feature.
  • Bug Fixes

    • Streamlined metrics initialization and error handling.
  • Refactor

    • Reorganized metrics handling by replacing the old exporter with a more structured approach.
    • Renamed and restructured modules for clarity and improved functionality.
  • Chores

    • Updated dependencies in the Cargo.toml file.

coderabbitai bot commented Oct 19, 2024

Walkthrough

Ohayo, sensei! This pull request introduces significant changes to the metrics handling within the Torii and Katana projects. The prometheus_exporter module has been replaced with the new PrometheusRecorder and MetricsServer, streamlining the recording and serving of metrics. The Cargo.toml file for the dojo-metrics package has been updated to include a new dependency, tokio-util. Additionally, new modules and traits have been introduced to enhance metrics reporting and server functionality, while outdated code related to the previous exporter has been removed.

Changes

| File | Change Summary |
| --- | --- |
| bin/torii/src/main.rs | Replaced prometheus_exporter with PrometheusRecorder and updated metrics handling logic. |
| crates/katana/node/src/lib.rs | Updated Node struct to remove prometheus_handle, added metrics_server, and modified launch logic. |
| crates/metrics/Cargo.toml | Added tokio-util dependency to the workspace. |
| crates/metrics/src/exporters/mod.rs | Introduced new prometheus module and Exporter trait for metrics recorders. |
| crates/metrics/src/exporters/prometheus.rs | Added PrometheusRecorder struct with methods for installation and retrieval of the Prometheus handle. |
| crates/metrics/src/lib.rs | Renamed prometheus_exporter to exporters, added process and server modules, and introduced error handling. |
| crates/metrics/src/process.rs | Added collect_memory_stats and describe_memory_stats functions for jemalloc memory statistics. |
| crates/metrics/src/prometheus_exporter.rs | Removed file containing previous Prometheus metrics exporting logic. |
| crates/metrics/src/server.rs | Introduced Server struct for serving metrics with hooks for metrics collection. |

Suggested labels

sozo

Suggested reviewers

  • glihm

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 10

🧹 Outside diff range and nitpick comments (6)
crates/metrics/src/lib.rs (2)

2-2: Consider making the process module public if needed, sensei.

Currently, process is declared as mod process;, which makes it private to the crate. If external modules or crates need to access items from process, consider changing it to pub mod process;.


22-32: Enhance error variant documentation for clarity, sensei.

Adding documentation comments to each variant of the Error enum can improve code readability and help other developers understand the error cases.

Apply this diff to add documentation comments:

 #[derive(Debug, thiserror::Error)]
 pub enum Error {
+    /// Global metrics recorder has already been installed.
     #[error("global metrics recorder already installed.")]
     GlobalRecorderAlreadyInstalled,

+    /// Could not bind to the specified address.
     #[error("could not bind to address: {addr}")]
     FailedToBindAddress { addr: SocketAddr },

+    /// An error occurred within the server.
     #[error(transparent)]
     Server(#[from] hyper::Error),
 }
crates/metrics/src/exporters/prometheus.rs (1)

15-17: Consider expanding the documentation for PrometheusRecorder.

Ohayo sensei! Providing more detailed documentation for the PrometheusRecorder struct can help others better understand its purpose and how to use it effectively.

crates/metrics/src/server.rs (1)

93-93: Enhance Debug implementation for hooks field

Ohayo, sensei! To improve debugging, consider displaying the number of hooks instead of a placeholder. This provides more insightful information about the state of the Server.

Apply the following change:

 fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
-    f.debug_struct("Server").field("hooks", &"...").field("exporter", &self.exporter).finish()
+    f.debug_struct("Server")
+        .field("hooks_count", &self.hooks.len())
+        .field("exporter", &self.exporter)
+        .finish()
 }
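
The suggested Debug output can be tried in isolation. The following is a minimal sketch, assuming a simplified `Server` with only a `hooks` vector and an `exporter` field; the `BoxedHook` alias definition and the `debug_output` helper are assumptions made for illustration, not the crate's actual code.

```rust
use std::fmt;

/// Boxed metrics hook. The name mirrors `BoxedHook` from the review;
/// the exact definition here is an assumption.
type BoxedHook = Box<dyn Fn() + Send + Sync>;

/// Simplified stand-in for the metrics `Server` (hypothetical layout).
pub struct Server<E> {
    hooks: Vec<BoxedHook>,
    exporter: E,
}

impl<E: fmt::Debug> fmt::Debug for Server<E> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Closures have no useful Debug output, so print how many there are.
        f.debug_struct("Server")
            .field("hooks_count", &self.hooks.len())
            .field("exporter", &self.exporter)
            .finish()
    }
}

/// Hypothetical helper: build a server with `n` no-op hooks and
/// return its Debug representation.
pub fn debug_output(n: usize) -> String {
    let hooks = (0..n).map(|_| Box::new(|| {}) as BoxedHook).collect();
    let server = Server { hooks, exporter: "prometheus" };
    format!("{server:?}")
}
```

With two hooks, `debug_output(2)` yields `Server { hooks_count: 2, exporter: "prometheus" }`.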
crates/metrics/src/process.rs (1)

83-116: Ohayo, sensei! Ensure consistency in metric descriptions

Some metric descriptions in describe_memory_stats are using backslashes for line continuation, which may not be necessary and can affect readability.

Consider removing the backslash and enclosing the description in a single string:

 describe_gauge!(
     "jemalloc.retained",
     metrics::Unit::Bytes,
-    "Total number of bytes in virtual memory mappings that were retained rather than being \
-     returned to the operating system via e.g. munmap(2)"
+    "Total number of bytes in virtual memory mappings that were retained rather than being returned to the operating system via e.g. munmap(2)"
 );
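
For context, a backslash at the end of a line inside a Rust string literal skips the newline and all leading whitespace on the next line, so both forms produce the same string and the choice is purely about source readability. A small sketch (the constant names are made up for illustration):

```rust
/// Description written with a backslash line continuation, as in the original code.
pub const WITH_CONTINUATION: &str = "Total number of bytes in virtual memory mappings that were retained rather than being \
     returned to the operating system via e.g. munmap(2)";

/// The same description written as a single-line literal, as in the suggestion.
pub const SINGLE_LINE: &str = "Total number of bytes in virtual memory mappings that were retained rather than being returned to the operating system via e.g. munmap(2)";

/// Returns true when both literals denote the same string.
pub fn descriptions_match() -> bool {
    WITH_CONTINUATION == SINGLE_LINE
}
```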
crates/katana/node/src/lib.rs (1)

100-100: Ohayo, sensei! Consider addressing the TODO comment

The comment suggests that this code might be better placed in the build stage. Refactoring it accordingly could enhance the code organization and initialization flow.

Would you like assistance in moving this code to the build stage?

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between a20b53c and fd3b89d.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (9)
  • bin/torii/src/main.rs (2 hunks)
  • crates/katana/node/src/lib.rs (3 hunks)
  • crates/metrics/Cargo.toml (1 hunks)
  • crates/metrics/src/exporters/mod.rs (1 hunks)
  • crates/metrics/src/exporters/prometheus.rs (1 hunks)
  • crates/metrics/src/lib.rs (2 hunks)
  • crates/metrics/src/process.rs (1 hunks)
  • crates/metrics/src/prometheus_exporter.rs (0 hunks)
  • crates/metrics/src/server.rs (1 hunks)
💤 Files with no reviewable changes (1)
  • crates/metrics/src/prometheus_exporter.rs
🧰 Additional context used
🔇 Additional comments (9)
crates/metrics/src/exporters/mod.rs (1)

1-1: Ohayo, sensei! LGTM for the module declaration.

The prometheus module is declared as public, which is appropriate if it contains types or functions that need to be accessible from outside this module. This follows Rust conventions nicely.

crates/metrics/Cargo.toml (1)

21-21: Ohayo, sensei! New dependency added. Let's verify its usage.

The addition of tokio-util as a workspace-level dependency looks good. It's likely related to the metrics handling refactoring mentioned in the PR objectives.

Let's make sure this new dependency is being used in the codebase:

If the search doesn't return any results, we might need to reconsider adding this dependency or ensure it's used in the upcoming changes.

✅ Verification successful

Ohayo, sensei! tokio-util is successfully being utilized in the codebase. Approved.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for the usage of tokio-util in the codebase

# Test: Search for tokio-util usage
rg --type rust 'use tokio_util::'

Length of output: 376

crates/metrics/src/lib.rs (4)

5-5: Ohayo, sensei! Validate the necessity of the SocketAddr import.

The addition of use std::net::SocketAddr; suggests network functionalities are being utilized. Ensure that this import is necessary and that all uses of SocketAddr are correctly implemented.


42-46: Implementation of Report trait is spot on, sensei!

The implementation of the Report trait for ::metrics_process::Collector by invoking self.collect() is correct and aligns with the expected behavior.


1-3: Ohayo, sensei! Ensure module renaming is consistent throughout the codebase.

The module prometheus_exporter has been renamed to exporters. Please verify that all references to prometheus_exporter have been updated to prevent potential import errors.

Run the following script to check for any remaining references:

✅ Verification successful

Ohayo, sensei! It seems there was a syntax hiccup in the previous script. Let's run a corrected version to ensure all instances of prometheus_exporter are addressed.


Ohayo, sensei! All references to prometheus_exporter have been successfully updated and are consistent throughout the codebase. 🎉

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find any occurrences of 'prometheus_exporter' in the codebase.

# Test: Search for 'prometheus_exporter' in Rust files. Expect: No matches.
rg --type rust 'prometheus_exporter'

Length of output: 191


Script:

#!/bin/bash
# Description: Find any occurrences of 'prometheus_exporter' in Rust files.

# Test: Search for 'prometheus_exporter' in Rust files. Expect: No matches.
rg --type rust 'prometheus_exporter'

Length of output: 36


15-15: Re-exporting server module: Verify exposed API, sensei.

By using pub use server::*;, all public items in the server module are re-exported. Please confirm that only intended items are made public to avoid exposing internal implementation details.
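
As a reminder of how a glob re-export behaves, here is a minimal sketch, assuming a toy `server` module; the function names are hypothetical. Only items marked `pub` inside the module become reachable through `pub use server::*;`, while private helpers stay hidden.

```rust
mod server {
    /// Public item: reachable through the glob re-export.
    pub fn public_api() -> &'static str {
        "served"
    }

    /// Private helper: not visible outside `server`, re-export or not.
    fn internal_helper() -> &'static str {
        "hidden"
    }

    /// Public wrapper that uses the private helper internally.
    pub fn uses_helper() -> &'static str {
        internal_helper()
    }
}

// Re-export every public item of `server` at this level.
pub use server::*;
```

Calling `internal_helper()` from outside the module would fail to compile, which is the property the comment asks to confirm for the real `server` module.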

Run the following script to list all publicly exported items from the server module:

crates/metrics/src/exporters/prometheus.rs (1)

1-52: LGTM!

Ohayo sensei! The implementation looks solid overall.

crates/metrics/src/process.rs (1)

118-122: Ohayo, sensei! Confirm that empty functions are necessary

The empty implementations of collect_memory_stats and describe_memory_stats for non-jemalloc or non-Unix configurations might lead to confusion.

Please ensure that these functions are required and that they won't cause issues when called in those configurations.

✅ Verification successful

Ohayo, sensei! Verified that empty functions are necessary

The empty implementations of collect_memory_stats and describe_memory_stats are required in configurations without jemalloc or on non-Unix systems to ensure that calls to these functions do not cause issues.
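
The pattern being verified can be sketched as a pair of `cfg`-gated functions with the same name, so callers never need their own `cfg` checks. This is an illustration only: the `jemalloc` feature name comes from the review, while the `bool` return value is an assumption added so the selected variant is observable (the functions discussed in the review return nothing).

```rust
/// Variant compiled only when the `jemalloc` feature is enabled on Unix.
#[cfg(all(feature = "jemalloc", unix))]
pub fn collect_memory_stats() -> bool {
    // The real implementation would read jemalloc statistics and record gauges here.
    true
}

/// Fallback compiled in every other configuration: a deliberate no-op.
#[cfg(not(all(feature = "jemalloc", unix)))]
pub fn collect_memory_stats() -> bool {
    false
}

/// Call site that works unchanged in both configurations.
pub fn run_collection_hook() -> bool {
    collect_memory_stats()
}
```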

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify if `collect_memory_stats` and `describe_memory_stats` are used in configurations without jemalloc or on non-Unix systems.

# Test: Search for usages of these functions.
rg --type rust 'collect_memory_stats|describe_memory_stats'

Length of output: 565

bin/torii/src/main.rs (1)

23-23: Ohayo sensei! Correctly imported PrometheusRecorder

The PrometheusRecorder is correctly imported from dojo_metrics::exporters::prometheus.

Comment on lines +3 to +7
/// Trait for metrics recorder whose metrics can be exported.
pub trait Exporter: Clone + Send + Sync {
    /// Export the metrics that have been recorded by the metrics thus far.
    fn export(&self) -> String;
}

🛠️ Refactor suggestion

Ohayo again, sensei! The Exporter trait looks solid, with a small suggestion.

The Exporter trait is well-defined and documented. The Clone + Send + Sync bounds ensure thread safety, which is crucial for metrics systems. The export method signature is clear and straightforward.

One small suggestion to consider:

You might want to make the export method more flexible by allowing parameters. For example:

fn export(&self, format: Option<ExportFormat>) -> String;

This would allow users to specify different export formats if needed in the future, while maintaining backwards compatibility with the current implementation.
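
To make the trait under discussion concrete, here is a minimal sketch of an implementor. The `Exporter` trait is copied from the snippet above; `InMemoryRecorder`, its `record` method, and the `name value` line format are hypothetical and exist only for this example.

```rust
use std::sync::{Arc, Mutex};

/// Trait for metrics recorders whose metrics can be exported (as quoted in the review).
pub trait Exporter: Clone + Send + Sync {
    /// Export the metrics that have been recorded thus far.
    fn export(&self) -> String;
}

/// Hypothetical in-memory recorder, used only for illustration.
#[derive(Clone, Default)]
pub struct InMemoryRecorder {
    // Shared state behind Arc<Mutex<..>> keeps the type Clone + Send + Sync.
    counters: Arc<Mutex<Vec<(String, u64)>>>,
}

impl InMemoryRecorder {
    /// Record a counter value under `name`.
    pub fn record(&self, name: &str, value: u64) {
        self.counters.lock().unwrap().push((name.to_string(), value));
    }
}

impl Exporter for InMemoryRecorder {
    fn export(&self) -> String {
        // Render one `name value` line per recorded counter.
        self.counters
            .lock()
            .unwrap()
            .iter()
            .map(|(name, value)| format!("{name} {value}\n"))
            .collect()
    }
}
```

Because clones share the same state, a clone handed to a server task exports whatever the original recorded, which is presumably why the trait requires `Clone`.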

if config.metrics.is_some() {
    // Metrics recorder must be initialized before calling any of the metrics macros, in order
    // for it to be registered.
    let _ = PrometheusRecorder::install("katana")?;

🛠️ Refactor suggestion

Ohayo, sensei! Remove unnecessary assignment to _

Assigning the result to _ is unnecessary when using the ? operator. You can directly call the function and let the ? operator handle any errors.

Apply this diff:

-let _ = PrometheusRecorder::install("katana")?;
+PrometheusRecorder::install("katana")?;
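
The two forms can be compared in a small sketch. Everything here is hypothetical (`install`, `Recorder`, `InstallError` are stand-ins, not the dojo_metrics API); the point is only that `?` propagates the error identically whether or not the success value is bound to `_`. One caveat: if the returned type were marked `#[must_use]`, dropping the binding would produce an unused-value warning.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub struct InstallError;

/// Hypothetical recorder handle returned on success.
#[derive(Debug)]
pub struct Recorder;

/// Stand-in for an install function: fails on an empty prefix.
pub fn install(prefix: &str) -> Result<Recorder, InstallError> {
    if prefix.is_empty() {
        Err(InstallError)
    } else {
        Ok(Recorder)
    }
}

/// Original form: bind the success value to `_`, which drops it immediately.
pub fn setup_with_binding(prefix: &str) -> Result<(), InstallError> {
    let _ = install(prefix)?;
    Ok(())
}

/// Suggested form: use the expression as a statement.
pub fn setup_without_binding(prefix: &str) -> Result<(), InstallError> {
    install(prefix)?;
    Ok(())
}
```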

codecov bot commented Oct 19, 2024

Codecov Report

Attention: Patch coverage is 1.49254% with 132 lines in your changes missing coverage. Please review.

Project coverage is 69.80%. Comparing base (1d428ba) to head (f8bdbe1).
Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| crates/metrics/src/process.rs | 0.00% | 50 Missing ⚠️ |
| crates/metrics/src/server.rs | 0.00% | 49 Missing ⚠️ |
| crates/metrics/src/exporters/prometheus.rs | 0.00% | 20 Missing ⚠️ |
| crates/katana/node/src/lib.rs | 25.00% | 6 Missing ⚠️ |
| crates/metrics/src/lib.rs | 0.00% | 4 Missing ⚠️ |
| bin/torii/src/main.rs | 0.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2561      +/-   ##
==========================================
+ Coverage   69.44%   69.80%   +0.36%     
==========================================
  Files         397      400       +3     
  Lines       51222    51171      -51     
==========================================
+ Hits        35571    35720     +149     
+ Misses      15651    15451     -200     

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (3)
crates/metrics/src/server.rs (3)

24-25: Ohayo, sensei! Don't forget about the TODO.

There's a TODO comment about allowing configuration of the server executor. This could be an important feature for cancelling individual connection tasks.

Consider creating a GitHub issue to track this TODO and ensure it's not forgotten. Would you like me to create an issue for this task?


47-56: Ohayo, sensei! The with_reports method looks good, but let's make it even better!

The implementation of with_reports is well done, providing flexibility in adding new metrics reporters. However, we can make a small improvement for better performance.

Consider using extend_from_slice instead of extend if you know the exact number of items beforehand. This can be more efficient for large collections. Here's a suggested change:

 pub fn with_reports<I>(mut self, reports: I) -> Self
 where
     I: IntoIterator<Item = Box<dyn Report>>,
 {
     // convert the report types into callable hooks
-    let hooks = reports.into_iter().map(|r| Box::new(move || r.report()) as BoxedHook);
-    self.hooks.extend(hooks);
+    let new_hooks: Vec<BoxedHook> = reports.into_iter().map(|r| Box::new(move || r.report()) as BoxedHook).collect();
+    self.hooks.extend_from_slice(&new_hooks);
     self
 }

This change can potentially improve performance when adding a large number of reports.
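
One point worth checking before applying this kind of change: `Vec::extend_from_slice` requires the element type to implement `Clone`, and a boxed closure such as `Box<dyn Fn() + Send + Sync>` does not. The sketch below therefore keeps the iterator-based `extend` form. It is a simplified stand-in, assuming a `Report` trait with a single `report` method and a `BoxedHook` alias as shown; `CountingReport`, `run_hooks`, and `hook_count` are hypothetical helpers.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Assumed shape of the `Report` trait: something that can emit its metrics.
pub trait Report: Send + Sync {
    fn report(&self);
}

/// Assumed definition of the boxed hook alias.
type BoxedHook = Box<dyn Fn() + Send + Sync>;

/// Simplified stand-in for the metrics server, holding only hooks.
#[derive(Default)]
pub struct Server {
    hooks: Vec<BoxedHook>,
}

impl Server {
    /// Convert each report into a callable hook and append it lazily via `extend`.
    pub fn with_reports<I>(mut self, reports: I) -> Self
    where
        I: IntoIterator<Item = Box<dyn Report>>,
    {
        let hooks = reports.into_iter().map(|r| Box::new(move || r.report()) as BoxedHook);
        self.hooks.extend(hooks);
        self
    }

    /// Hypothetical helper: invoke every registered hook once.
    pub fn run_hooks(&self) {
        for hook in &self.hooks {
            hook();
        }
    }

    /// Hypothetical helper: number of registered hooks.
    pub fn hook_count(&self) -> usize {
        self.hooks.len()
    }
}

/// Hypothetical report that counts how often it was invoked.
pub struct CountingReport(pub Arc<AtomicUsize>);

impl Report for CountingReport {
    fn report(&self) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}
```

`extend` consumes the mapped iterator directly, so no intermediate `Vec` is allocated.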


58-63: Ohayo, sensei! The with_process_metrics method is solid, but let's add a cherry on top!

The implementation of with_process_metrics is well done, adding a hook for collecting process metrics using the default Collector.

For improved clarity and to make the code more self-documenting, consider adding a brief comment explaining what the process.describe() call does. Here's a suggested change:

 pub fn with_process_metrics(mut self) -> Self {
     let process = metrics_process::Collector::default();
+    // Register metric descriptions with the global registry
     process.describe();
     self.hooks.push(Box::new(move || process.collect()) as BoxedHook);
     self
 }

This small addition will help future maintainers understand the purpose of the describe() call at a glance.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between fd3b89d and 06df0d0.

📒 Files selected for processing (2)
  • bin/torii/src/main.rs (2 hunks)
  • crates/metrics/src/server.rs (1 hunks)
🧰 Additional context used
🔇 Additional comments (5)
crates/metrics/src/server.rs (2)

65-87: Ohayo, sensei! The start method looks great!

The implementation of the start method is well-structured and efficient. It properly sets up the HTTP server using Hyper, handles incoming requests, and serves metrics. The use of Arc for sharing hooks across threads is a good practice for concurrent scenarios.

Great job on implementing this crucial part of the metrics server!


90-97: Ohayo, sensei! Nice Debug implementation!

The custom Debug implementation for the Server struct is well done. It provides a clean and concise debug output by omitting the details of the hooks field, which is a good practice to avoid cluttering the output.

Good job on providing a useful debug representation!

bin/torii/src/main.rs (3)

23-23: Ohayo, sensei! The import statement looks good!

The updated import for PrometheusRecorder aligns perfectly with the refactoring objectives. It's a clean change that reflects the new structure of the dojo_metrics crate.


300-302: Sugoi work on the metrics setup, sensei!

The new implementation looks great! You've successfully separated the metrics recorder from the server, which aligns perfectly with our refactoring goals. Here's what I love about it:

  1. Using PrometheusRecorder::install is a clean upgrade from the previous prometheus_exporter::install_recorder.
  2. The dojo_metrics::Server::new setup with process metrics is a nice touch for comprehensive monitoring.
  3. Spawning the server in a separate tokio task is brilliant - it addresses the previous review comment about potential blocking of the main thread.

Domo arigato for implementing these changes so thoughtfully!


23-23: Ohayo once more, sensei! Let's wrap up this review.

Your changes to the metrics handling are truly impressive. Here's a quick summary of the impact:

  1. The refactoring has been executed cleanly, focusing solely on the metrics-related code.
  2. You've improved the separation of concerns by decoupling the metrics recorder from the server.
  3. The rest of the file remains untouched, which shows great precision in implementing the intended changes.

Overall, these modifications enhance the modularity and maintainability of the metrics system without disrupting the existing functionality. Excellent work, sensei!

Also applies to: 300-302

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (2)
crates/metrics/src/server.rs (2)

23-24: Ohayo, sensei! Let's address the TODO comment.

There's a TODO comment about allowing the configuration of the server executor. This is an important feature that should be implemented to provide more control over the server's behavior.

Would you like assistance in implementing this feature? I can help you create a new method to configure the executor, which could look something like this:

pub fn with_executor<E>(mut self, executor: E) -> Self
where
    E: hyper::rt::Executor<BoxFuture<'static, ()>> + Send + Sync + 'static,
{
    // Implementation details
}

Let me know if you'd like me to elaborate on this implementation or if you prefer to track this as a separate issue.


1-101: Ohayo, sensei! Overall, this is a well-implemented metrics server.

The Server struct and its implementation provide a robust foundation for serving metrics. The code is well-structured, follows Rust best practices, and provides a clear API for users.

Great job on this implementation! With the minor adjustments suggested in the previous comments (simplifying the Hook trait, adding trait bounds to MetricsExporter, and addressing the TODO), this will be an excellent addition to the project.

If you need any help implementing the suggested changes or have any questions, please don't hesitate to ask. Keep up the great work, sensei!

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 06df0d0 and f8bdbe1.

📒 Files selected for processing (1)
  • crates/metrics/src/server.rs (1 hunks)
🧰 Additional context used
🔇 Additional comments (3)
crates/metrics/src/server.rs (3)

39-92: Ohayo, sensei! The implementation looks good overall.

The methods new, with_reports, with_process_metrics, and start are well-implemented and provide a good API for creating and configuring the server.

The implementation is clear, concise, and follows good Rust practices. Great job!


94-101: Ohayo, sensei! The Debug implementation looks great.

The custom Debug implementation for the Server struct is well-done. It provides a clear and concise representation of the struct, showing the exporter while omitting the detailed representation of the hooks.

This implementation follows best practices for custom Debug implementations. Well done!


35-38: ⚠️ Potential issue

Ohayo, sensei! Let's add some trait bounds.

As suggested in the previous review, we should add Clone + Send + Sync trait bounds to MetricsExporter. This ensures thread safety and proper functionality, especially when using self.exporter.clone() in async closures.

Please apply the following change:

 impl<MetricsExporter> Server<MetricsExporter>
 where
-    MetricsExporter: Exporter + 'static,
+    MetricsExporter: Exporter + Clone + Send + Sync + 'static,
 {

This change will make the code more robust and prevent potential runtime issues.

Likely invalid or redundant comment.
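
A short sketch of why the extra bounds are redundant when the trait already declares them as supertraits, as the `Exporter` trait quoted earlier in this review does. `StaticExporter`, `require_clone_send_sync`, and `export_on_thread` are hypothetical names for illustration.

```rust
use std::thread;

/// The trait as quoted in the review: Clone + Send + Sync are supertraits.
pub trait Exporter: Clone + Send + Sync {
    fn export(&self) -> String;
}

/// Helper that explicitly demands the bounds and returns a clone.
fn require_clone_send_sync<T: Clone + Send + Sync + 'static>(value: &T) -> T {
    value.clone()
}

/// Only `Exporter + 'static` is written here, yet the call below compiles
/// because the supertrait bounds are implied by `E: Exporter`.
pub fn export_on_thread<E: Exporter + 'static>(exporter: &E) -> String {
    let cloned = require_clone_send_sync(exporter);
    // Moving the clone into another thread requires `Send`, which `Exporter` guarantees.
    thread::spawn(move || cloned.export()).join().unwrap()
}

/// Hypothetical exporter returning a fixed payload.
#[derive(Clone)]
pub struct StaticExporter(pub &'static str);

impl Exporter for StaticExporter {
    fn export(&self) -> String {
        self.0.to_string()
    }
}
```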

@kariy kariy merged commit fac93b0 into main Oct 19, 2024
14 of 15 checks passed
@kariy kariy deleted the refactor/metrics branch October 19, 2024 20:58