Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(katana): move the rpc address log #2518

Merged
merged 2 commits into from
Oct 10, 2024
Merged

refactor(katana): move the rpc address log #2518

merged 2 commits into from
Oct 10, 2024

Conversation

kariy
Copy link
Member

@kariy kariy commented Oct 10, 2024

ref #2514

moving this exactly when the server is spawned, to better reflect the server initialization point. also because with the new node and config structures, when user request for a random port (ie --port 0) because we delegate the search for a free port to the jsonrpsee (ie we just pass the raw ip address with port 0 when we build the rpc server), we can only know the actual port number once the server has been spawned. we could technically find the free port first on the node level so that we can determined the full address beforehand. but i prefer this approach.

Summary by CodeRabbit

  • New Features

    • Enhanced JSON log handling for improved observability and parsing.
    • Introduced a new RPC address logging feature during server startup.
  • Bug Fixes

    • Improved reliability in parsing logs and handling errors.
  • Chores

    • Updated dependencies to include regex for enhanced functionality.
    • Streamlined CI workflow for better testing efficiency.

Copy link

coderabbitai bot commented Oct 10, 2024

Walkthrough

Ohayo, sensei! This pull request introduces several modifications across multiple files in the Katana project. Key changes include the simplification of the print_intro function by removing the address parameter, the addition of a new dependency in Cargo.toml, and the introduction of a more generic JSON logging structure. Enhancements also focus on improved parsing and error handling for RPC server addresses, alongside minor updates to logging and comments in the lib.rs file.

Changes

File Path Change Summary
bin/katana/src/cli/node.rs Updated print_intro method signature by removing address parameter.
crates/katana/node-bindings/Cargo.toml Added new dependency: regex.workspace = true.
crates/katana/node-bindings/src/json.rs Replaced JsonLogMessage with a generic JsonLog<T>, introduced Fields<T>, modified KatanaInfo, and added RpcAddr struct.
crates/katana/node-bindings/src/lib.rs Replaced JsonLogMessage with JsonLog<RpcAddr>, added functions for parsing logs and cleaning ANSI codes, and improved error handling.
crates/katana/node/src/lib.rs Added log statement for RPC server address in spawn function, with minor formatting adjustments.
.github/workflows/ci.yml Modified CI workflow to streamline testing steps, added ensure-wasm job, and adjusted environment variable settings.

Possibly related PRs


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (2)
bin/katana/src/cli/node.rs (1)

237-237: Ohayo, sensei! LGTM with a small suggestion.

The removal of the address parameter from the print_intro function aligns well with the PR objectives. It simplifies the function and ensures that the RPC address is not logged prematurely.

However, to improve code readability, consider using the method call syntax for print_intro:

- print_intro(&self, genesis);
+ self.print_intro(genesis);

This change would make the code more idiomatic and consistent with Rust's object-oriented style.

crates/katana/node-bindings/src/lib.rs (1)

698-708: Expand test cases for parse_rpc_addr_log function

Ohayo sensei! The test test_parse_rpc_addr_log is well-written. To ensure robustness, consider adding test cases for logs with missing addr= substrings or malformed addresses to handle edge cases gracefully.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 7cd7d08 and 00072e4.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • bin/katana/src/cli/node.rs (2 hunks)
  • crates/katana/node-bindings/Cargo.toml (1 hunks)
  • crates/katana/node-bindings/src/json.rs (1 hunks)
  • crates/katana/node-bindings/src/lib.rs (6 hunks)
  • crates/katana/node/src/lib.rs (1 hunks)
🧰 Additional context used
🔇 Additional comments (7)
crates/katana/node-bindings/Cargo.toml (1)

12-12: Ohayo, sensei! The regex dependency looks good to me.

The addition of the regex dependency aligns well with the PR objectives, particularly for parsing or validating RPC addresses. Using .workspace = true is a great practice for maintaining consistent versions across the project.

Could you enlighten this humble reviewer about the specific use case for the regex crate in the context of RPC address logging? It would be most helpful to understand how it ties into the refactoring mentioned in the PR objectives.

✅ Verification successful

Ohayo, sensei! The addition of the regex dependency aligns perfectly with the PR objectives and ensures consistent version management across the project.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of the newly added regex dependency

# Test: Search for regex usage in Rust files
rg --type rust 'use\s+regex|regex::'

Length of output: 627

crates/katana/node/src/lib.rs (1)

358-359: Ohayo, sensei! This log statement is a welcome addition.

The new log statement effectively addresses the PR objective by relocating the RPC address log to occur precisely when the server is spawned. This change offers several benefits:

  1. It provides clearer feedback on the exact moment the RPC server becomes available.
  2. It helps users quickly identify the address they need to connect to the RPC server.
  3. It aligns well with the new node and configuration structures, especially in scenarios where random ports are used.
bin/katana/src/cli/node.rs (2)

Line range hint 371-467: Ohayo again, sensei! The implementation looks good.

The print_intro function has been successfully updated to remove the address-related logging, which is consistent with the PR objectives. The function continues to log important information such as account details, seed, and genesis contracts.

This change ensures that the RPC address is not logged prematurely, allowing for the correct handling of random port allocation as mentioned in the PR summary.


Line range hint 1-624: Ohayo one last time, sensei! Here's a summary of the changes.

The modifications in this file successfully achieve the PR's main objective of relocating the RPC address logging. By removing the address parameter from the print_intro function and its call site, we ensure that the RPC address is not logged prematurely.

These changes support the new node and configuration structures, particularly in scenarios where a random port is requested (e.g., using --port 0). This allows the jsonrpsee library to handle port allocation, with the actual port number being determined after the server has been spawned.

The overall impact of these changes is positive:

  1. It provides a clearer representation of the server's initialization point.
  2. It allows for more flexible port allocation, especially in random port scenarios.
  3. It simplifies the print_intro function, making it more focused on its primary responsibilities.

Great work on implementing these changes, sensei! They align well with the PR objectives and improve the overall architecture of the Katana node.

crates/katana/node-bindings/src/json.rs (3)

12-16: Ohayo, sensei! Excellent use of generics in JsonLog struct

By introducing a generic type parameter T in the JsonLog struct, you've enhanced flexibility for parsing different log message formats. This change allows for more scalable and adaptable code. Well done!


49-54: Ohayo, sensei! Efficient parsing with TryFrom<String> for KatanaInfo

Implementing TryFrom<String> for KatanaInfo using serde_json::from_str provides a clean and straightforward way to parse the JSON string contained in the message field. This enhances readability and maintainability. Great job!


70-71: Verify the deserialization of RpcAddr, sensei

Ensure that the addr field in the JSON logs matches the expected format for SocketAddr. This is particularly important if the logs might contain IPv6 addresses or unusual port configurations. Verifying this will prevent potential runtime errors.

Run the following script to check the format of addr in the logs:

Comment on lines +20 to +23
pub struct Fields<T = serde_json::Value> {
pub message: String,
#[serde(flatten)]
pub other: T,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider adding a Deserialize bound to the generic type T, sensei

Currently, the generic type T in Fields<T> does not have any trait bounds. To ensure that serde can deserialize other: T properly, it would be prudent to add a T: Deserialize<'de> constraint.

Apply this diff to add the required constraint:

 #[derive(Deserialize, Debug)]
-pub struct Fields<T = serde_json::Value> {
+pub struct Fields<T = serde_json::Value>
+where
+    T: for<'de> Deserialize<'de>,
+{
     pub message: String,
     #[serde(flatten)]
     pub other: T,
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub struct Fields<T = serde_json::Value> {
pub message: String,
#[serde(flatten)]
pub other: T,
#[derive(Deserialize, Debug)]
pub struct Fields<T = serde_json::Value>
where
T: for<'de> Deserialize<'de>,
{
pub message: String,
#[serde(flatten)]
pub other: T,

Comment on lines +514 to +515
dbg!(&line);

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Remove dbg! macros from production code

Ohayo sensei! The dbg! macro is intended for debugging and should not be present in production code. Please remove the dbg!(&line); at line 514 and the dbg!(...) at line 519 to keep the code clean.

Apply this diff to remove the dbg! statements:

-    dbg!(&line);

...

-    if let Ok(log) = dbg!(serde_json::from_str::<JsonLog<RpcAddr>>(&line)) {
+    if let Ok(log) = serde_json::from_str::<JsonLog<RpcAddr>>(&line) {

Also applies to: 519-519

///
/// This is useful for removing the color codes from the katana output.
fn clean_ansi_escape_codes(input: &str) -> Result<Cow<'_, str>, Error> {
let re = regex::Regex::new(r"\x1b\[[0-9;]*[a-zA-Z]")?;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Initialize the regex once to improve performance

Ohayo sensei! Currently, the regex in clean_ansi_escape_codes is compiled each time the function is called, which can impact performance. Consider initializing the regex once and reusing it.

Apply this diff to use Lazy from once_cell to compile the regex only once:

+use once_cell::sync::Lazy;
+use regex::Regex;

 fn clean_ansi_escape_codes(input: &str) -> Result<Cow<'_, str>, Error> {
+    static ANSI_REGEX: Lazy<Regex> = Lazy::new(|| Regex::new(r"\x1b\[[0-9;]*[a-zA-Z]").unwrap());
-    let re = regex::Regex::new(r"\x1b\[[0-9;]*[a-zA-Z]")?;
-    Ok(re.replace_all(input, ""))
+    Ok(ANSI_REGEX.replace_all(input, ""))
 }

Remember to add once_cell = "1.18" to your Cargo.toml dependencies.

Committable suggestion was skipped due to low confidence.

Comment on lines +601 to +612
// Example RPC address log format (ansi color codes removed):
// 2024-10-10T14:20:53.563106Z INFO rpc: RPC server started. addr=127.0.0.1:60373
fn parse_rpc_addr_log(log: &str) -> Result<SocketAddr, Error> {
// remove any ANSI escape codes from the log.
let cleaned = clean_ansi_escape_codes(log)?;

// This will separate the log into two parts as separated by `addr=` str and we take
// only the second part which is the address.
let addr_part = cleaned.split("addr=").nth(1).ok_or(Error::MissingSocketAddr)?;
let addr = addr_part.trim();
Ok(SocketAddr::from_str(addr)?)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Handle potential parsing errors in parse_rpc_addr_log

Ohayo sensei! The parse_rpc_addr_log function assumes that the log contains addr=, which might not always be the case. Consider adding error handling for logs that do not contain addr= or have unexpected formats.

Apply this diff to improve error handling:

 fn parse_rpc_addr_log(log: &str) -> Result<SocketAddr, Error> {
     // remove any ANSI escape codes from the log.
     let cleaned = clean_ansi_escape_codes(log)?;

     // This will separate the log into two parts as separated by `addr=` str and we take
     // only the second part which is the address.
-    let addr_part = cleaned.split("addr=").nth(1).ok_or(Error::MissingSocketAddr)?;
+    let addr_part = cleaned.split("addr=").nth(1)
+        .ok_or_else(|| Error::UnexpectedFormat(format!("Missing 'addr=' in log: {}", cleaned)))?;
     let addr = addr_part.trim();
     Ok(SocketAddr::from_str(addr)?)
 }

This ensures that any unexpected formats are reported clearly.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Example RPC address log format (ansi color codes removed):
// 2024-10-10T14:20:53.563106Z INFO rpc: RPC server started. addr=127.0.0.1:60373
fn parse_rpc_addr_log(log: &str) -> Result<SocketAddr, Error> {
// remove any ANSI escape codes from the log.
let cleaned = clean_ansi_escape_codes(log)?;
// This will separate the log into two parts as separated by `addr=` str and we take
// only the second part which is the address.
let addr_part = cleaned.split("addr=").nth(1).ok_or(Error::MissingSocketAddr)?;
let addr = addr_part.trim();
Ok(SocketAddr::from_str(addr)?)
}
fn parse_rpc_addr_log(log: &str) -> Result<SocketAddr, Error> {
// remove any ANSI escape codes from the log.
let cleaned = clean_ansi_escape_codes(log)?;
// This will separate the log into two parts as separated by `addr=` str and we take
// only the second part which is the address.
let addr_part = cleaned.split("addr=").nth(1)
.ok_or_else(|| Error::UnexpectedFormat(format!("Missing 'addr=' in log: {}", cleaned)))?;
let addr = addr_part.trim();
Ok(SocketAddr::from_str(addr)?)
}

Comment on lines +593 to +599
/// Removes ANSI escape codes from a string.
///
/// This is useful for removing the color codes from the katana output.
fn clean_ansi_escape_codes(input: &str) -> Result<Cow<'_, str>, Error> {
let re = regex::Regex::new(r"\x1b\[[0-9;]*[a-zA-Z]")?;
Ok(re.replace_all(input, ""))
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider using a dedicated crate to handle ANSI escape codes

Ohayo sensei! Instead of manually handling ANSI escape codes, consider using a dedicated crate like ansi_strip or strip-ansi-escapes for more robust handling.

Apply this diff to use the strip-ansi-escapes crate:

+use strip_ansi_escapes::strip.as_bytes;

 fn clean_ansi_escape_codes(input: &str) -> Result<Cow<'_, str>, Error> {
-    let re = regex::Regex::new(r"\x1b\[[0-9;]*[a-zA-Z]")?;
-    Ok(re.replace_all(input, ""))
+    let stripped = strip_ansi_escapes::strip(input.as_bytes())?;
+    Ok(Cow::Owned(String::from_utf8_lossy(&stripped).into_owned()))
 }

Don't forget to add strip-ansi-escapes = "0.1.0" to your Cargo.toml.

Committable suggestion was skipped due to low confidence.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
.github/workflows/ci.yml (1)

43-49: Ohayo, sensei! These changes look great!

The modifications to the test job streamline the testing process and align well with the PR objectives. Here's a breakdown of the improvements:

  1. Adding /tmp/bins to PATH allows direct access to katana and sozo binaries.
  2. Setting execute permissions ensures the binaries can be run.
  3. Extracting tar files to /tmp/ provides necessary test data.
  4. Simplifying the cargo llvm-cov nextest command suggests a more straightforward test execution.

These changes should make the CI process more efficient and easier to maintain.

Consider adding a comment explaining the purpose of the tar file extractions for better clarity:

+ # Extract test databases
  tar -xzf spawn-and-move-db.tar.gz -C /tmp/
  tar -xzf types-test-db.tar.gz -C /tmp/
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 00072e4 and 3dbbf08.

📒 Files selected for processing (1)
  • .github/workflows/ci.yml (1 hunks)
🧰 Additional context used

Copy link

codecov bot commented Oct 10, 2024

Codecov Report

Attention: Patch coverage is 94.91525% with 3 lines in your changes missing coverage. Please review.

Project coverage is 68.55%. Comparing base (f56c4f9) to head (3dbbf08).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
bin/katana/src/cli/node.rs 0.00% 2 Missing ⚠️
crates/katana/node-bindings/src/lib.rs 97.91% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2518   +/-   ##
=======================================
  Coverage   68.55%   68.55%           
=======================================
  Files         387      387           
  Lines       49950    49968   +18     
=======================================
+ Hits        34243    34258   +15     
- Misses      15707    15710    +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant