feat(storage): impl Database for RocksDB #163

Merged: 2 commits merged into main on Dec 6, 2024
Conversation

@distractedm1nd (Contributor) commented Dec 4, 2024

Re-enables RocksDB, finishes the implementation, and integrates it into the system test.

Summary by CodeRabbit

  • New Features

    • Introduced RocksDBConnection for improved database handling.
    • Enhanced data retrieval and writing capabilities with new methods in the TreeReader and TreeWriter traits.
    • Added new dependencies for better functionality and workspace organization.
  • Bug Fixes

    • Resolved issues related to dependency management and organization within the workspace.
  • Documentation

    • Updated package metadata across various components for clarity and consistency.
  • Tests

    • Refactored tests to utilize the new RocksDBConnection, ensuring robust validation of database interactions.

@distractedm1nd distractedm1nd requested a review from jns-ps December 4, 2024 18:05
coderabbitai bot (Contributor) commented Dec 4, 2024

Walkthrough

The pull request introduces several modifications to the project's Cargo.toml files across multiple crates, including updates to dependencies, workspace members, and package metadata. Notably, the rocksdb dependency is reintroduced, and a new dependency, tempfile, is added. The RocksDBConnection struct is updated to use Arc<RocksDB> for thread-safe access, and new methods for reading and writing data are implemented. Additionally, test configurations are adjusted to utilize RocksDBConnection instead of an in-memory database.
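For orientation, a minimal sketch of what the connection type described above could look like; the field names and the use of anyhow::Result are assumptions drawn from the walkthrough and the review snippets below, not the exact source:

use std::sync::Arc;

use anyhow::Result;
use rocksdb::{DBWithThreadMode, MultiThreaded};

// Multi-threaded handle, matching the multi-threaded-cf feature enabled in the workspace.
type RocksDB = DBWithThreadMode<MultiThreaded>;

#[derive(Clone)]
pub struct RocksDBConnection {
    // Arc makes the connection cheap to clone and safe to share across threads.
    connection: Arc<RocksDB>,
}

impl RocksDBConnection {
    pub fn new(path: &str) -> Result<RocksDBConnection> {
        let db = RocksDB::open_default(path)?;
        Ok(RocksDBConnection {
            connection: Arc::new(db),
        })
    }
}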

Changes

File | Change Summary
Cargo.toml | Updated authors list, added crates/serde and crates/keys to workspace members, reintroduced rocksdb, and added tempfile.
crates/storage/Cargo.toml | Updated package metadata to use workspace specifications, uncommented rocksdb, and added tempfile as a dev-dependency.
crates/storage/src/rocksdb.rs | Modified RocksDBConnection to use Arc<RocksDB>, added write_node_batch, and enhanced TreeReader methods.
crates/tests/Cargo.toml | Changed dependency declaration syntax to dot notation and added tempfile.
crates/tests/src/lib.rs | Replaced InMemoryDatabase with RocksDBConnection in setup_db, using TempDir for temporary storage.

Possibly related PRs

  • storage: refactor and rocksdb implementation #123: The main PR reintroduced the rocksdb dependency, which is also a focus in this PR that implements RocksDB as the new storage solution, indicating a direct relationship in dependency management.
  • refactor(common): use underlying VK types instead of Vec<u8> #150: The main PR updates the Cargo.toml to enhance dependency management, which aligns with the changes in this PR that refactor the VerifyingKey to use underlying types instead of Vec<u8>, reflecting a broader effort to improve type safety and clarity in the codebase.
  • feat(common): secp256r1 support #152: The main PR's updates to dependencies and workspace structure may relate to this PR's introduction of secp256r1 support, as both involve enhancements to the cryptographic capabilities and dependency management within the project.

Suggested reviewers

  • sebasti810

🐰 In the code, I hop and play,
With rocksdb back, hip-hip-hooray!
New paths to explore, oh what a sight,
With tempfile here, everything feels right!
Dependencies dance in a workspace so neat,
Let’s write and read, oh what a treat! 🌟



coderabbitai bot (Contributor) left a comment

Actionable comments posted: 7

🧹 Outside diff range and nitpick comments (11)
crates/keys/src/signatures.rs (1)

8-8: Remove unnecessary import of std::self

The import statement use std::{self}; is redundant: the std crate is already in scope by default, so this line can be removed to clean up the code.

-use std::{self};
crates/keys/src/signing_keys.rs (2)

57-57: Correct the error message to match the context

The error message refers to VerifyingKey instead of SigningKey. Update it to accurately reflect that the issue is with the SigningKey.

-        _ => bail!("Unexpected algorithm for VerifyingKey"),
+        _ => bail!("Unexpected algorithm for SigningKey"),

22-24: Consider zeroizing secret keys for security

For improved security, consider zeroizing the secret keys when they go out of scope to prevent sensitive data from lingering in memory.
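One way to do this is with the zeroize crate (derive feature); the SecretKeyBytes wrapper below is a hypothetical stand-in, not the actual SigningKey type:

use zeroize::{Zeroize, ZeroizeOnDrop};

// Hypothetical wrapper around raw key material; the real SigningKey variants
// would need equivalent handling for their inner secret-key types.
#[derive(Zeroize, ZeroizeOnDrop)]
struct SecretKeyBytes([u8; 32]);

fn example() {
    let secret = SecretKeyBytes([0x42; 32]);
    // ZeroizeOnDrop overwrites the bytes when `secret` goes out of scope,
    // so the key material does not linger in memory.
    drop(secret);
}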

crates/common/src/tree/proofs.rs (1)

95-96: Fix grammatical error in comment

The comment has a minor grammatical error. "It's" should be "its" to indicate possession.

-// Append the new entry and verify it's validity
+// Append the new entry and verify its validity
crates/storage/Cargo.toml (1)

RocksDB dependency needs version and features specification in workspace

The RocksDB dependency is not properly configured at the workspace level. Found:

  • Direct version specification 0.21.0 with multi-threaded-cf feature in one Cargo.toml
  • Other modules using workspace = true reference, but the dependency is not defined in workspace.dependencies

To ensure consistent RocksDB usage across the project:

  • Add RocksDB configuration to [workspace.dependencies] section in root Cargo.toml:
    rocksdb = { version = "0.21.0", features = ["multi-threaded-cf"] }
  • Keep using rocksdb.workspace = true in module-level Cargo.toml files
🔗 Analysis chain

Line range hint 1-30: Verify RocksDB version and features

Since this PR implements RocksDB functionality, we should verify the RocksDB dependency configuration.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check workspace-level RocksDB version and features
rg -l 'rocksdb.*workspace.*=.*true' | xargs cat | grep -A 5 'rocksdb'

# Check for any direct version specifications of RocksDB
rg 'rocksdb.*=.*\"[0-9]' 

Length of output: 271


Script:

#!/bin/bash
# Check workspace-level RocksDB configuration in Cargo.toml files
fd Cargo.toml --exec grep -A 5 'rocksdb'

# Check workspace root Cargo.toml for dependencies
cat Cargo.toml | grep -A 10 '\[workspace.dependencies\]'

Length of output: 568

crates/serde/src/lib.rs (2)

1-1: Consider removing or justifying dead_code allowance

The #![allow(dead_code)] attribute suggests there might be unused code. While this might be temporary during the RocksDB implementation, consider documenting why it's needed or removing unused code.


Line range hint 94-124: Consider adding tests for RocksDB round-trip serialization

While the base64 implementation looks correct, it would be valuable to add tests specifically verifying round-trip serialization through RocksDB to ensure data integrity.

Would you like me to help create a test suite for RocksDB serialization scenarios?
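As a rough sketch of such a round-trip test, reusing the setup_db helper, NodeBatch, and KeyHash types that appear elsewhere in this PR (treat the exact signatures as assumptions rather than the final suite):

#[test]
fn test_value_round_trip_through_rocksdb() {
    let (_temp_dir, db) = setup_db();

    let key_hash = KeyHash([7; 32]);
    let value: OwnedValue = vec![1, 2, 3, 4];

    // Write the value at version 1, then read it back through the TreeReader API.
    let mut batch = NodeBatch::default();
    batch.insert_value(1, key_hash, value.clone());
    db.write_node_batch(&batch).unwrap();

    let read_back = db.get_value_option(1, key_hash).unwrap();
    assert_eq!(read_back, Some(value));
}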

justfile (1)

68-68: Consider adding RocksDB-specific commands

While the integration test command has been updated, consider adding development commands for:

  • RocksDB data inspection
  • Database cleanup
  • Performance testing

Example additions:

+rocks-inspect:
+    @echo "Inspecting RocksDB data..."
+    cargo run --bin db-tools -- inspect
+
+rocks-cleanup:
+    @echo "Cleaning up RocksDB data..."
+    rm -rf ./data/rocksdb/*
crates/tests/src/lib.rs (1)

Line range hint 31-134: Consider enhancing test robustness

While the test implementation is comprehensive, consider these improvements:

  1. Add explicit cleanup steps after the test
  2. Consider parameterizing the block height threshold (currently hardcoded to 100)
  3. Add timeout to prevent test from running indefinitely
 let mut rx = lc_da_layer.clone().subscribe_to_heights();
 let initial_height = rx.recv().await.unwrap();
+const MAX_TEST_BLOCKS: u64 = 100;
+const TEST_TIMEOUT: Duration = Duration::from_secs(300);
+
+tokio::time::timeout(TEST_TIMEOUT, async {
     while let Ok(height) = rx.recv().await {
         debug!("received height {}", height);
-        if height >= initial_height + 100 {
+        if height >= initial_height + MAX_TEST_BLOCKS {
             break;
         }
     }
+}).await.expect("Test exceeded timeout");
crates/common/src/test_utils.rs (1)

48-49: Consider consolidating key generation into a helper method.

Multiple instances of SigningKey::new_ed25519() are scattered throughout the file. Consider extracting this into a helper method to improve maintainability and make future key generation changes easier.

+ fn create_test_signing_key() -> SigningKey {
+     SigningKey::new_ed25519()
+ }

- let service_challenge_key = SigningKey::new_ed25519();
+ let service_challenge_key = create_test_signing_key();

Also applies to: 79-79, 144-144, 167-167, 206-206, 257-257

crates/common/src/transaction_builder.rs (1)

85-86: LGTM: Consistent key generation, but consider reducing duplication

The standardization to SigningKey::new_ed25519() is good. However, since this pattern is repeated multiple times, consider extracting it into a helper method.

+ impl TransactionBuilder {
+     fn generate_random_ed25519_key() -> SigningKey {
+         SigningKey::new_ed25519()
+     }
+ }

Then use it in the methods:

- let random_service_challenge_key = SigningKey::new_ed25519();
- let random_service_signing_key = SigningKey::new_ed25519();
+ let random_service_challenge_key = Self::generate_random_ed25519_key();
+ let random_service_signing_key = Self::generate_random_ed25519_key();

Also applies to: 119-119

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 976a354 and 8d02ea8.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (37)
  • .github/FUNDING.yml (0 hunks)
  • .github/workflows/ci.yml (1 hunks)
  • Cargo.toml (3 hunks)
  • crates/cli/Cargo.toml (1 hunks)
  • crates/cli/src/main.rs (1 hunks)
  • crates/common/Cargo.toml (1 hunks)
  • crates/common/src/digest.rs (1 hunks)
  • crates/common/src/hashchain.rs (3 hunks)
  • crates/common/src/keys.rs (0 hunks)
  • crates/common/src/lib.rs (0 hunks)
  • crates/common/src/operation.rs (1 hunks)
  • crates/common/src/test_utils.rs (7 hunks)
  • crates/common/src/transaction_builder.rs (4 hunks)
  • crates/common/src/tree.rs (0 hunks)
  • crates/common/src/tree/key_directory_tree.rs (1 hunks)
  • crates/common/src/tree/mod.rs (1 hunks)
  • crates/common/src/tree/proofs.rs (1 hunks)
  • crates/common/src/tree/snarkable_tree.rs (1 hunks)
  • crates/keys/Cargo.toml (1 hunks)
  • crates/keys/src/lib.rs (1 hunks)
  • crates/keys/src/signatures.rs (1 hunks)
  • crates/keys/src/signing_keys.rs (1 hunks)
  • crates/keys/src/verifying_keys.rs (1 hunks)
  • crates/node_types/lightclient/Cargo.toml (0 hunks)
  • crates/node_types/prover/Cargo.toml (1 hunks)
  • crates/node_types/prover/src/prover/tests.rs (3 hunks)
  • crates/node_types/prover/src/webserver.rs (2 hunks)
  • crates/serde/Cargo.toml (1 hunks)
  • crates/serde/src/lib.rs (1 hunks)
  • crates/storage/Cargo.toml (1 hunks)
  • crates/storage/src/redis.rs (0 hunks)
  • crates/storage/src/rocksdb.rs (4 hunks)
  • crates/tests/Cargo.toml (1 hunks)
  • crates/tests/src/lib.rs (1 hunks)
  • crates/zk/groth16/Cargo.toml (1 hunks)
  • crates/zk/nova/Cargo.toml (1 hunks)
  • justfile (1 hunks)
💤 Files with no reviewable changes (6)
  • crates/common/src/lib.rs
  • .github/FUNDING.yml
  • crates/common/src/keys.rs
  • crates/common/src/tree.rs
  • crates/node_types/lightclient/Cargo.toml
  • crates/storage/src/redis.rs
✅ Files skipped from review due to trivial changes (5)
  • crates/serde/Cargo.toml
  • crates/keys/Cargo.toml
  • crates/common/src/operation.rs
  • crates/zk/nova/Cargo.toml
  • crates/zk/groth16/Cargo.toml
🔇 Additional comments (32)
crates/common/src/tree/key_directory_tree.rs (1)

71-75: ⚠️ Potential issue

Handle errors instead of using unwrap() in write_batch() method

In the write_batch() method, using unwrap() can cause a panic if writing the batch fails. Consider handling the error properly by propagating it with the ? operator.

Suggested Change:

if let Some(batch) = self.pending_batch.take() {
-    self.db.write_node_batch(&batch).unwrap();
+    self.db.write_node_batch(&batch)?;
    self.epoch += 1;
}

Likely invalid or redundant comment.

crates/storage/src/rocksdb.rs (2)

109-110: Avoid unnecessary heap allocations

In the deserialization of the version number, there's an unnecessary heap allocation when converting the key slice to a Vec<u8>. Consider using a slice reference instead.

[performance]

-let version: Version =
-    bincode::deserialize(&hex::decode(&key[value_key.len() + 1..])?)?;
+let version_slice = &key[value_key.len() + 1..];
+let version_bytes = hex::decode(version_slice)?;
+let version: Version = bincode::deserialize(&version_bytes)?;

141-145: Check for empty keys when writing nodes

Ensure that the keys generated for storing nodes are not empty, which could lead to database corruption.

Please run the following script to check for any empty keys:

crates/keys/src/verifying_keys.rs (1)

164-167: ⚠️ Potential issue

Incorrect method to derive Secp256k1 public key from secret key

In the implementation of From<Secp256k1SigningKey> for VerifyingKey (lines 164-167), the code uses sk.public_key(SECP256K1) to derive the public key. However, SecretKey does not have a public_key method in the secp256k1 crate.

To obtain the public key from a SecretKey, use Secp256k1VerifyingKey::from_secret_key(&SECP256K1, &sk).

Apply this diff to fix the issue:

 fn from(sk: Secp256k1SigningKey) -> Self {
-    sk.public_key(SECP256K1).into()
+    let vk = Secp256k1VerifyingKey::from_secret_key(&SECP256K1, &sk);
+    VerifyingKey::Secp256k1(vk)
 }

Likely invalid or redundant comment.

crates/common/src/hashchain.rs (2)

106-109: Implementation of add_entry is correct

The add_entry method correctly validates and adds a new entry to the hashchain, ensuring the integrity of the chain.


Line range hint 178-206: Validation logic in validate_new_entry is robust

The validate_new_entry method thoroughly checks the new entry for consistency, signature validity, and adherence to protocol rules.

crates/common/src/tree/mod.rs (1)

12-19: Definition of HashchainResponse is clear and appropriate

The HashchainResponse enum effectively represents the possible outcomes when fetching tree values, enhancing code readability.

crates/common/Cargo.toml (1)

10-28: Dependency reorganization enhances maintainability

The grouping of dependencies with comments improves the clarity and maintainability of the Cargo.toml file.

crates/storage/Cargo.toml (1)

27-27: LGTM: Workspace alignment for tempfile

The change to use workspace-level version for tempfile is consistent with the project's dependency management approach.

crates/tests/Cargo.toml (1)

17-28: Dependencies properly configured for RocksDB testing

The dependencies are well-structured for RocksDB integration:

  • prism-storage is included for RocksDB implementation
  • tempfile added for handling temporary test databases
  • All dependencies consistently use workspace versions
crates/node_types/prover/Cargo.toml (1)

30-30: Verify prism-keys integration with RocksDB

The addition of prism-keys alongside prism-storage suggests cryptographic operations with RocksDB storage. Please ensure proper error handling for cryptographic operations when interacting with the database.

Consider documenting the interaction between cryptographic operations and RocksDB storage in the codebase, especially regarding error handling and recovery procedures.

crates/cli/Cargo.toml (1)

34-34: Verify RocksDB dependencies for CLI package

While the addition of prism-keys aligns with the codebase restructuring, I notice this PR is about RocksDB implementation but there's no direct RocksDB dependency here. This suggests RocksDB functionality might be encapsulated in prism-storage.

✅ Verification successful

RocksDB dependencies are correctly structured

The dependency structure is well-organized:

  • CLI depends on both prism-storage and prism-keys directly
  • prism-storage includes the RocksDB dependency
  • This layered approach properly encapsulates the RocksDB functionality
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if RocksDB is indeed handled by prism-storage
rg -p "rocksdb" "crates/storage/Cargo.toml"

Length of output: 75


Script:

#!/bin/bash
# Check if prism-keys depends on prism-storage
rg -p "storage" "crates/keys/Cargo.toml"

# Check if CLI uses prism-storage directly
rg -p "storage" "crates/cli/Cargo.toml"

Length of output: 116

crates/common/src/digest.rs (1)

5-6: LGTM: Serialization changes support RocksDB integration

The switch to prism_serde and retention of RocksDB-specific serialization implementations (particularly the From<[u8; N]> impl) properly support RocksDB's byte-based storage requirements.

crates/serde/src/lib.rs (2)

Line range hint 11-93: LGTM: Comprehensive serialization support for RocksDB

The implementation provides flexible serialization options crucial for RocksDB:

  • Raw binary format for efficient storage
  • Human-readable formats (hex/base64) for debugging
  • Fixed-size array support for known-length data

The error handling is robust, particularly in the fixed-size deserializers where size mismatches are properly reported.
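As a concrete illustration of that pattern, here is how a fixed-size field can opt into raw bytes for binary formats and hex for human-readable ones, modeled on the Digest annotation discussed later in this review (the module path and type name are assumptions):

use serde::{Deserialize, Serialize};
// Assumed path to the helper module provided by the prism-serde crate.
use prism_serde::raw_or_hex_fixed;

// Hypothetical 32-byte value: serialized as raw bytes under bincode (RocksDB)
// and as a hex string under JSON or other human-readable formats.
#[derive(Serialize, Deserialize)]
struct ExampleDigest(#[serde(with = "raw_or_hex_fixed")] [u8; 32]);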


Line range hint 4-9: Verify CryptoPayload usage in RocksDB context

The CryptoPayload struct uses base64 encoding for human-readable serialization. While this works well for JSON/text formats, verify if this is the intended format for RocksDB storage.

Cargo.toml (2)

61-62: Verify RocksDB feature selection

The multi-threaded-cf feature is enabled for RocksDB, which is appropriate for concurrent access to column families. However, consider if additional features like zstd for compression might be beneficial for your use case.

✅ Verification successful

Let me gather more information about the RocksDB usage to make a more informed decision.


The multi-threaded-cf feature is sufficient for the current implementation

After reviewing the RocksDB implementation, the current configuration is well-aligned with the codebase's needs. The implementation:

  • Uses MultiThreaded mode explicitly (DBWithThreadMode<MultiThreaded>)
  • Performs basic key-value operations without compression-heavy workloads
  • Focuses on storing small, fixed-size data (commitments, epochs) and serialized nodes
  • Has no evidence of large value storage that would benefit from compression

Additional features like zstd compression would add unnecessary complexity without clear benefits for the current usage pattern.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if RocksDB is used with compression in the codebase
rg -A 5 'rocksdb::Options|DB::open' 

Length of output: 363


Script:

#!/bin/bash
# Check for RocksDB configuration and usage patterns
rg -l "rocksdb" 

# Look for any compression-related configurations
rg -i "compress|zstd" crates/storage/src/

# Check the full RocksDB implementation
cat crates/storage/src/rocksdb.rs

Length of output: 7109


27-28: Verify integration of new crates with RocksDB

The addition of crates/serde and crates/keys suggests serialization and key management functionality. Ensure these new crates properly integrate with RocksDB for data serialization and key handling.

Also applies to: 41-42

.github/workflows/ci.yml (2)

Line range hint 93-114: Validate unused-deps job configuration

The new unused-deps check is valuable for maintaining clean dependencies. However, ensure it doesn't incorrectly flag RocksDB-related dependencies as unused, especially if they're only used in test configurations.

✅ Verification successful

RocksDB dependencies are correctly configured for the unused-deps check

The RocksDB dependency is used in the main code path (not just tests) as evidenced by its implementation in crates/storage/src/rocksdb.rs and its inclusion as a main dependency in crates/storage/Cargo.toml. The unused-deps check will correctly handle this dependency since it's actively used in the core functionality.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if RocksDB is used in test-only code
rg -l '#\[cfg\(test\)\].*rocksdb|#\[test\].*rocksdb'

Length of output: 54


Script:

#!/bin/bash
# Let's check for RocksDB dependencies in Cargo.toml files
rg -l "rocksdb" --type toml

# Also check for RocksDB usage in the codebase
rg -l "rocksdb" --type rust

# Check for test-specific dependencies in Cargo.toml
rg "dev-dependencies" -A 5 --type toml

Length of output: 336


Script:

#!/bin/bash
# Let's check how RocksDB is used in the main code vs test code
rg "rocksdb" -B 2 -A 2 crates/storage/src/lib.rs
rg "rocksdb" -B 2 -A 2 crates/storage/src/rocksdb.rs

# Check if RocksDB is a main dependency or dev-dependency
rg "rocksdb" -B 2 -A 2 crates/storage/Cargo.toml

Length of output: 996


91-91: Ensure RocksDB integration tests are included

The modified test command now targets the prism-tests package. Verify that RocksDB integration tests are properly included in this package.

✅ Verification successful

RocksDB test coverage is properly maintained

The test coverage for RocksDB is well-structured across the codebase:

  • RocksDB unit tests are maintained in crates/storage/src/rocksdb.rs
  • Integration tests in prism-tests package properly import and use RocksDB functionality through prism_storage::{rocksdb::RocksDBConnection, Database}
  • The --features mock_prover flag doesn't affect RocksDB testing as it's part of the storage layer
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for RocksDB-related tests in prism-tests
echo "Searching for RocksDB tests:"
rg -l 'rocksdb.*test' crates/tests/

Length of output: 104


Script:

#!/bin/bash
# Let's try a broader search for RocksDB tests across the codebase
echo "Searching for RocksDB tests in the entire codebase:"
rg -l 'rocksdb' --type rust

# Also check for test modules or functions related to RocksDB
echo -e "\nSearching for RocksDB test modules/functions:"
ast-grep --pattern 'mod tests {
  $$$
  rocksdb
  $$$
}'

# Check the contents of prism-tests package
echo -e "\nChecking prism-tests package structure:"
fd . crates/tests/ -t f -e rs

Length of output: 565


Script:

#!/bin/bash
# Let's examine the content of the relevant files to understand RocksDB test coverage
echo "Checking RocksDB-related content in prism-tests:"
rg -C 5 'rocksdb' crates/tests/src/lib.rs

echo -e "\nChecking RocksDB tests in storage crate:"
rg -C 5 '#\[test\]' crates/storage/src/rocksdb.rs

echo -e "\nChecking if prism-tests includes storage tests:"
rg -C 5 'storage' crates/tests/src/lib.rs

Length of output: 1821

justfile (1)

67-69: Verify test environment cleanup

Ensure that RocksDB files are properly cleaned up after integration tests, especially when tests fail. Consider adding explicit cleanup commands.

✅ Verification successful

RocksDB cleanup is properly handled through TempDir

The test setup is already using Rust's TempDir which automatically cleans up the directory and all its contents (including RocksDB files) when it goes out of scope, even in case of test failures. This is a safe and idiomatic approach in Rust for handling temporary test resources.

  • The temp_dir variable is created using TempDir::new() and is bound to the local scope
  • RocksDB is initialized with the temporary directory path
  • When tests complete or fail, temp_dir's Drop implementation will automatically clean up all contents
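For illustration, a minimal sketch of such a TempDir-backed setup_db helper, assuming the RocksDBConnection::new(path) constructor from this PR (the exact signature and module paths are assumptions):

use prism_storage::rocksdb::RocksDBConnection;
use tempfile::TempDir;

// Returning the TempDir keeps it alive for the duration of the test;
// dropping it removes the directory and the RocksDB files inside it.
fn setup_db() -> (TempDir, RocksDBConnection) {
    let temp_dir = TempDir::new().unwrap();
    let db = RocksDBConnection::new(temp_dir.path().to_str().unwrap()).unwrap();
    (temp_dir, db)
}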
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for RocksDB data cleanup in tests
rg -A 5 'Drop.*RocksDB|cleanup.*rocks|teardown.*test'

Length of output: 55


Script:

#!/bin/bash
# Let's search for RocksDB usage patterns and test cleanup mechanisms
rg -i "rocksdb" -A 5

# Also search for test setup/teardown patterns
ast-grep --pattern 'impl Drop for $_ {
  $$$
}'

# Look for temporary directories or test data cleanup
rg -i "temp.*dir|test.*dir|test.*data" -A 5

Length of output: 69179


Script:

#!/bin/bash
# Search for test setup and cleanup patterns in test files
rg -i "setup|teardown|cleanup|tempdir" crates/tests/src/lib.rs

# Look for RocksDB initialization and cleanup in test files
rg -i "rocksdb.*new|rocksdb.*drop" crates/tests/src/lib.rs

# Check for temporary directory usage in tests
rg "TempDir::new" crates/tests/src/lib.rs

Length of output: 426

crates/tests/src/lib.rs (2)

15-15: LGTM: Import changes align with RocksDB implementation

The imports correctly bring in the necessary components for RocksDB and temporary directory management.

Also applies to: 22-22


24-27: Verify RocksDB cleanup in tests

The implementation correctly uses TempDir for test isolation. However, we should verify that RocksDB resources are properly cleaned up after tests.

✅ Verification successful

Let me gather more information about RocksDB cleanup by searching for Drop implementations and cleanup patterns in the RocksDB connection code.


Let me check for the TempDir implementation since it's being used to manage the RocksDB directory.


RocksDB cleanup is properly handled

The implementation is safe because:

  1. The TempDir from the tempfile crate automatically cleans up the directory and its contents when dropped
  2. The temp_dir variable is kept in scope for the duration of the test, ensuring the RocksDB files are cleaned up when the test completes
  3. RocksDB's own cleanup is handled by its Drop implementation through the RocksDB type wrapped in an Arc
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for potential resource cleanup issues in tests
# Look for any cleanup or Drop implementations for RocksDB

# Search for Drop implementations
ast-grep --pattern 'impl Drop for $type {
  $$$
}'

# Search for explicit cleanup calls
rg -A 3 'drop|close|cleanup|dispose' 

Length of output: 102941


Script:

#!/bin/bash
# Search for RocksDB connection and Drop implementation
ast-grep --pattern 'impl Drop for RocksDBConnection {
  $$$
}'

# Search for RocksDB connection implementation
rg -A 5 "struct RocksDBConnection" 

# Search for RocksDB cleanup related code
rg -A 5 "RocksDBConnection::new|RocksDBConnection::close"

Length of output: 1146


Script:

#!/bin/bash
# Search for TempDir usage and implementation
rg -A 5 "use.*TempDir|struct TempDir"

# Search for temp directory cleanup patterns
rg -A 5 "TempDir::new|temp_dir\."

Length of output: 1721

crates/keys/src/lib.rs (2)

1-7: LGTM: Well-structured module organization

The module structure provides clear separation of concerns for signatures, signing keys, and verifying keys.


9-134: Comprehensive test coverage with robust error handling

The test implementation thoroughly covers:

  • Key serialization/deserialization
  • Multiple key types (Ed25519, Secp256k1, Secp256r1)
  • Error cases for invalid inputs
✅ Verification successful

Let me gather more information about the test coverage by searching for the key operations and error cases.


Based on the search results, I can now provide a final assessment of the test coverage:

Comprehensive test coverage confirmed with robust error handling

The test implementation thoroughly covers all key aspects:

  • Key serialization/deserialization for all supported algorithms (Ed25519, Secp256k1, Secp256r1)
  • Verifying key conversions and equality checks
  • Signing key operations and byte representation
  • Signature creation and verification
  • Error cases for invalid inputs (e.g., invalid length test)
  • Base64 encoding/decoding tests
  • Proper unwrap usage in tests with prior assertions
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify test coverage for cryptographic operations
# Look for any untested key operations or error cases

# Search for key-related operations
ast-grep --pattern 'fn $name($$$) -> Result<$$$> {
  $$$
}'

# Search for error handling
rg -A 3 'Error|unwrap|expect'

Length of output: 91356


Script:

#!/bin/bash
# Search for key operations and their test coverage
rg -A 3 "SigningKey|VerifyingKey|Signature" crates/keys/src/lib.rs

# Search for error handling cases
rg -A 3 "Error|unwrap|expect" crates/keys/src/lib.rs

# Look for any untested public functions
ast-grep --pattern 'pub fn $name($$$) {
  $$$
}'

Length of output: 4909

crates/cli/src/main.rs (1)

7-7: LGTM: Updated import path

The import path has been correctly updated to use the new key management module.

crates/node_types/prover/src/webserver.rs (2)

12-16: LGTM! Import changes align with the module restructuring.

The imports have been properly reorganized to use the new module structure, improving code organization.


64-64: Verify the Hash struct's serialization compatibility.

The Hash struct now wraps a Digest instead of TreeHash. While the change is logical, we should ensure this doesn't break any existing API consumers that might depend on the serialized format.

✅ Verification successful

Let me gather more information about the Hash struct and its usage.


Let me check one more thing to understand the Digest type and its serialization.


Based on the gathered information, I can now provide the final response:

The Hash struct's serialization compatibility is maintained

The Hash struct is properly annotated with #[derive(Serialize, Deserialize)] and wraps a Digest type that is also serialization-compatible with #[derive(Serialize, Deserialize)]. The Digest type uses a fixed-size byte array [u8; 32] with custom serialization via #[serde(with = "raw_or_hex_fixed")], ensuring consistent serialization behavior.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any direct usage of Hash struct serialization/deserialization
rg -A 3 "Hash.*(?:to_string|from_str|serialize|deserialize|to_json|from_json)"

Length of output: 448


Script:

#!/bin/bash
# Search for Hash struct definition and its implementations
ast-grep --pattern 'impl $_ for Hash'

# Search for any serialization traits or derives on Hash
rg -B 2 -A 2 "struct Hash"

# Search for any usage of Hash in serialization contexts
rg -l "use.*serde"

Length of output: 2610


Script:

#!/bin/bash
# Check Digest type definition and its serialization implementation
rg -B 2 -A 2 "struct Digest"

# Look for any direct usage of Hash serialization/deserialization in tests
rg -A 3 "Hash.*(?:to_string|from_str|serialize|deserialize)" --type rust --type test

Length of output: 703

crates/node_types/prover/src/prover/tests.rs (2)

2-3: LGTM! Import changes align with the new key management structure.

The imports have been properly updated to use the new prism_keys module.


66-66: LGTM! Improved key generation approach.

The change from mock signing keys to direct SigningKey::new_ed25519() usage improves code clarity and removes unnecessary abstraction.

Also applies to: 94-95

crates/common/src/test_utils.rs (1)

13-14: LGTM! Import changes align with the new key management structure.

The imports have been properly updated to use the new prism_keys module.

crates/common/src/transaction_builder.rs (2)

13-13: LGTM: Import change aligns with module restructuring

The migration from the old keys module to prism_keys is consistent with the codebase-wide restructuring mentioned in the summary.


170-170: LGTM: Consistent key generation with proper conversion

The change to SigningKey::new_ed25519() with immediate conversion to VerifyingKey is correct and consistent with the module's key generation pattern.

Review threads (outdated, resolved) on:
  • crates/common/src/tree/key_directory_tree.rs
  • crates/common/src/tree/proofs.rs
  • crates/storage/src/rocksdb.rs
  • crates/common/src/tree/snarkable_tree.rs (2 threads)
  • crates/keys/src/verifying_keys.rs (2 threads)
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (3)
crates/storage/src/rocksdb.rs (3)

16-31: Consider customizing RocksDB options for optimal performance.

The default RocksDB options might not provide optimal performance for your specific use case. Consider customizing options such as:

  • Write buffer size
  • Block cache size
  • Compression settings
  • Bloom filters for faster lookups

Example configuration:

-    pub fn new(path: &str) -> Result<RocksDBConnection> {
-        let db = DB::open_default(path)?;
+    pub fn new(path: &str) -> Result<RocksDBConnection> {
+        let mut opts = Options::default();
+        opts.set_write_buffer_size(64 * 1024 * 1024); // 64MB
+        opts.set_max_write_buffer_number(3);
+        opts.set_target_file_size_base(64 * 1024 * 1024); // 64MB
+        opts.create_if_missing(true);
+        
+        let db = DB::open(&opts, path)?;

137-166: Consider implementing batch size limits and consistent error handling.

While the batch writing implementation is solid, consider:

  1. Implementing batch size limits to prevent memory issues with large batches
  2. Using consistent error handling across the implementation
  3. Optimizing key encoding (as mentioned in TreeReader)

Example batch size limit implementation:

     fn write_node_batch(&self, node_batch: &NodeBatch) -> Result<()> {
+        const MAX_BATCH_SIZE: usize = 1000;
         let mut batch = rocksdb::WriteBatch::default();
+        let mut batch_size = 0;
 
         for (node_key, node) in node_batch.nodes() {
+            if batch_size >= MAX_BATCH_SIZE {
+                self.connection.write(batch)?;
+                batch = rocksdb::WriteBatch::default();
+                batch_size = 0;
+            }
             let key = format!("node:{}", hex::encode(bincode::serialize(node_key)?));
             let value = bincode::serialize(node)?;
             batch.put(key.as_bytes(), &value);
+            batch_size += 1;
         }

168-218: Enhance test coverage for edge cases and concurrent access.

Consider adding tests for:

  1. Error cases (invalid data, serialization failures)
  2. Concurrent access patterns
  3. Large batch operations
  4. Database size limits

Example concurrent access test:

#[test]
fn test_concurrent_access() {
    let (_temp_dir, db) = setup_db();
    let db_clone = db.clone();
    
    let handle = std::thread::spawn(move || {
        let batch = NodeBatch::default();
        batch.insert_value(1, KeyHash([1; 32]), vec![1, 2, 3]);
        db_clone.write_node_batch(&batch)
    });
    
    let batch = NodeBatch::default();
    batch.insert_value(2, KeyHash([2; 32]), vec![4, 5, 6]);
    db.write_node_batch(&batch).unwrap();
    
    handle.join().unwrap().unwrap();
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 8d02ea8 and fd06d4b.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • Cargo.toml (2 hunks)
  • crates/storage/Cargo.toml (1 hunks)
  • crates/storage/src/rocksdb.rs (1 hunks)
  • crates/tests/Cargo.toml (1 hunks)
  • crates/tests/src/lib.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • crates/storage/Cargo.toml
  • crates/tests/src/lib.rs
  • Cargo.toml
  • crates/tests/Cargo.toml
🔇 Additional comments (2)
crates/storage/src/rocksdb.rs (2)

1-14: LGTM! Thread-safe configuration is properly set up.

The imports and type definitions are well-structured, with appropriate use of Arc for thread-safety and MultiThreaded mode for RocksDB.


84-119: Handle deserialization errors instead of using unwrap()

Consider optimizing key encoding strategy.

Using hex encoding for keys might impact performance and storage size. Consider using binary encoding or a more efficient encoding scheme.

Example:

-        let key = format!("node:{}", hex::encode(bincode::serialize(node_key)?));
+        let mut key = Vec::with_capacity(5 + node_key_size);
+        key.extend_from_slice(b"node:");
+        key.extend_from_slice(&bincode::serialize(node_key)?);

Review thread (outdated, resolved) on crates/storage/src/rocksdb.rs
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (3)
crates/storage/src/rocksdb.rs (3)

20-35: Add documentation for thread-safety guarantees

Consider adding documentation to explain the thread-safety guarantees provided by Arc<RocksDB>. This will help other developers understand the concurrent usage patterns.

Example documentation:

/// Thread-safe connection to RocksDB.
/// 
/// This struct can be safely shared across multiple threads using Clone.
/// The underlying RocksDB connection is wrapped in an Arc to ensure thread-safe access.
#[derive(Clone)]
pub struct RocksDBConnection {
    // ...
}

106-129: Optimize version retrieval in get_value_option

The current implementation iterates through all versions to find the latest one. This can be optimized by using reverse iteration since we're looking for the latest version.

Consider this optimization:

     fn get_value_option(
         &self,
         max_version: Version,
         key_hash: KeyHash,
     ) -> Result<Option<OwnedValue>> {
         let value_key = format!("{KEY_PREFIX_VALUE_HISTORY}{}", hex::encode(key_hash.0));
-        let iter = self.connection.prefix_iterator(value_key.as_bytes());
+        let iter = self.connection.prefix_iterator_cf_opt(
+            self.connection.default_cf(),
+            value_key.as_bytes(),
+            rocksdb::IteratorMode::From(value_key.as_bytes(), rocksdb::Direction::Reverse)
+        );

-        let mut latest_value = None;
-        let mut latest_version = 0;
-
         for item in iter {
             let (key, value) = item?;
             let version: Version =
                 bincode::deserialize(&hex::decode(&key[value_key.len() + 1..])?)?;

-            if version <= max_version && version > latest_version {
-                latest_version = version;
-                latest_value = Some(value);
+            if version <= max_version {
+                return Ok(Some(bincode::deserialize(&value)?));
             }
         }

-        Ok(latest_value.map(|v| bincode::deserialize(&v)).transpose()?)
+        Ok(None)
     }

183-258: Enhance test coverage

While the current tests are good, consider adding tests for:

  1. Concurrent access to verify thread-safety
  2. Error cases (e.g., corrupted data, serialization failures)
  3. Batch size limits
  4. Database recovery after crashes

Example concurrent access test:

#[test]
fn test_concurrent_access() {
    let (_temp_dir, db) = setup_db();
    let db_clone = db.clone();
    
    let handle = std::thread::spawn(move || {
        let key_hash = KeyHash([1; 32]);
        let value: OwnedValue = vec![1, 2, 3];
        let mut batch = NodeBatch::default();
        batch.insert_value(1, key_hash, value.clone());
        db_clone.write_node_batch(&batch).unwrap();
    });
    
    let key_hash = KeyHash([2; 32]);
    let value: OwnedValue = vec![4, 5, 6];
    let mut batch = NodeBatch::default();
    batch.insert_value(1, key_hash, value.clone());
    db.write_node_batch(&batch).unwrap();
    
    handle.join().unwrap();
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between fd06d4b and 4fb50f1.

📒 Files selected for processing (1)
  • crates/storage/src/rocksdb.rs (1 hunks)
🔇 Additional comments (1)
crates/storage/src/rocksdb.rs (1)

1-18: Well-structured imports and constants!

Good use of constants for key prefixes and appropriate thread-safety consideration with Arc.
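For readers following the key-construction questions below, a hedged sketch of how the prefixed keys appear to be laid out, based only on the snippets quoted in this review (the literal prefix values are assumptions):

// Assumed values; only the "node:" prefix and the KEY_PREFIX_VALUE_HISTORY
// name are visible in the quoted snippets.
const KEY_PREFIX_NODE: &str = "node:";
const KEY_PREFIX_VALUE_HISTORY: &str = "value_history:";

// Node keys:            "node:" + hex(bincode(NodeKey))
// Versioned value keys: KEY_PREFIX_VALUE_HISTORY + hex(key_hash) + <1-byte separator> + hex(bincode(version))
// The single-byte separator is why get_value_option slices at `value_key.len() + 1`
// before hex-decoding the version.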

Review threads (resolved) on crates/storage/src/rocksdb.rs (2 threads)
@jns-ps jns-ps requested a review from sebasti810 December 5, 2024 20:35
jns-ps (Contributor) commented Dec 5, 2024

Basically fine with the PR. I don't understand some things regarding (node) key construction. We can talk about them next week, or you merge the PR without me.

@jns-ps jns-ps removed their request for review December 5, 2024 20:36
@distractedm1nd distractedm1nd merged commit 1854059 into main Dec 6, 2024
5 checks passed