docs(swiftide): documented file swiftide/src/integrations/redis/node_cache.rs #29

Merged
merged 1 commit into from
Jun 13, 2024

Conversation

bosun-ai[bot]
Contributor

@bosun-ai bosun-ai bot commented Jun 13, 2024

Added documentation per your request.

References issue #13

Contributor Author

bosun-ai bot commented Jun 13, 2024

The run history:

Goal:
Document code file swiftide/src/integrations/redis/node_cache.rs

Prompt:
Task: Write idiomatic in-file documentation for the provided rust code file.

Guidelines:

  1. Idiomatic Documentation: Ensure the documentation is idiomatic to rust.
  2. Document all Functions and Classes: Document all public functions, methods, and classes. Include an overview for each class explaining its role and design choice.
  3. Document Signature: Document all function and method signatures as per the language's documentation style. Include both the parameters and the return value.
  4. Explain Purpose: Explain the purpose of each function, method, and class. Focus on why each exists and the problem it solves. Focus on answering the "why" over the "what".
  5. Exception Documentation: Specify what exceptions are raised and under which circumstances.
  6. Usage Examples: Do NOT provide any examples
  7. Performance Notes: If relevant, include notes on performance considerations and any critical trade-offs made.
  8. Edge Case and Error Handling: Explain how the code handles edge cases or errors and the rationale behind these approaches.
  9. Top Level Documentation: Include a top-level overview of the file and its purpose, idiomatic to rust.
  10. Tonality: Use a professional and informative tone throughout the documentation. Remember, the documentation is for other developers and refrain from using fluffy language.

Context: The path to the code file to be documented is provided below.

path: swiftide/src/integrations/redis/node_cache.rs
file content:

use std::fmt::Debug;
use tokio::sync::RwLock;

use anyhow::{Context as _, Result};
use async_trait::async_trait;

use crate::{ingestion::IngestionNode, traits::NodeCache};

pub struct RedisNodeCache {
    client: redis::Client,
    connection_manager: RwLock<Option<redis::aio::ConnectionManager>>,
    key_prefix: String,
}

impl RedisNodeCache {
    pub fn try_from_url(url: impl AsRef<str>, prefix: impl AsRef<str>) -> Result<Self> {
        let client = redis::Client::open(url.as_ref()).context("Failed to open redis client")?;
        Ok(Self {
            client,
            connection_manager: RwLock::new(None),
            key_prefix: prefix.as_ref().to_string(),
        })
    }

    // Connectionmanager is meant to be cloned
    async fn lazy_connect(&self) -> Option<redis::aio::ConnectionManager> {
        if self.connection_manager.read().await.is_none() {
            let result = self.client.get_connection_manager().await;
            if let Err(e) = result {
                tracing::error!("Failed to get connection manager: {}", e);
                return None;
            }
            let mut cm = self.connection_manager.write().await;
            *cm = result.ok();
        }

        self.connection_manager.read().await.clone()
    }

    fn key_for_node(&self, node: &IngestionNode) -> String {
        format!("{}:{}", self.key_prefix, node.calculate_hash())
    }

    #[allow(dead_code)]
    // Testing only, super inefficient
    async fn reset_cache(&self) {
        if let Some(mut cm) = self.lazy_connect().await {
            let keys: Vec<String> = redis::cmd("KEYS")
                .arg(format!("{}:*", self.key_prefix))
                .query_async(&mut cm)
                .await
                .expect("Could not get keys");

            for key in &keys {
                let _: usize = redis::cmd("DEL")
                    .arg(key)
                    .query_async(&mut cm)
                    .await
                    .expect("Failed to reset cache");
            }
        }
    }
}

// Redis CM does not implement debug
impl Debug for RedisNodeCache {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.debug_struct("Redis")
            .field("client", &self.client)
            .finish()
    }
}

#[async_trait]
impl NodeCache for RedisNodeCache {
    // false -> not cached, expect node to be processed
    // true -> cached, expect node to be skipped
    #[tracing::instrument(skip_all, name = "node_cache.redis.get", fields(hit))]
    async fn get(&self, node: &IngestionNode) -> bool {
        let cache_result = if let Some(mut cm) = self.lazy_connect().await {
            let result = redis::cmd("EXISTS")
                .arg(self.key_for_node(node))
                .query_async(&mut cm)
                .await;

            match result {
                Ok(1) => true,
                Ok(0) => false,
                Err(e) => {
                    tracing::error!("Failed to check node cache: {}", e);
                    false
                }
                _ => {
                    tracing::error!("Unexpected response from redis");
                    false
                }
            }
        } else {
            false
        };

        tracing::Span::current().record("hit", cache_result);

        cache_result
    }

    #[tracing::instrument(skip_all, name = "node_cache.redis.get")]
    async fn set(&self, node: &IngestionNode) {
        if let Some(mut cm) = self.lazy_connect().await {
            let result: Result<(), redis::RedisError> = redis::cmd("SET")
                .arg(self.key_for_node(node))
                .arg(1)
                .query_async(&mut cm)
                .await;

            if let Err(e) = result {
                tracing::error!("Failed to set node cache: {}", e);
            }
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use std::collections::HashMap;
    use testcontainers::runners::AsyncRunner;

    #[test_log::test(tokio::test)]
    async fn test_redis_cache() {
        let redis = testcontainers::GenericImage::new("redis", "7.2.4")
            .with_exposed_port(6379)
            .with_wait_for(testcontainers::core::WaitFor::message_on_stdout(
                "Ready to accept connections",
            ))
            .start()
            .await
            .expect("Redis started");

        let host = redis.get_host().await.unwrap();
        let port = redis.get_host_port_ipv4(6379).await.unwrap();
        let cache = RedisNodeCache::try_from_url(&format!("redis://{host}:{port}"), "test")
            .expect("Could not build redis client");
        cache.reset_cache().await;

        let node = IngestionNode {
            id: Some(1),
            path: "test".into(),
            chunk: "chunk".into(),
            vector: None,
            metadata: HashMap::new(),
        };

        let before_cache = cache.get(&node).await;
        assert!(!before_cache);

        cache.set(&node).await;
        let after_cache = cache.get(&node).await;
        assert!(after_cache);
    }
}

Constraints:

  • Try to get to a working solution of the goal as fast as possible
  • Use write_file to write to a file, do not include it in your thoughts and reasoning
  • Always write the full file with all the code. Otherwise code will be missing and I will lose my job.
  • If existing comments are still valid, use them in your response
  • Only use information from the provided context.
  • Never remove or modify any code, only add documentation to the existing code.

Process:

  • Think step by step
  • Read the code and understand what it does
  • Search for and read related files.
  • If the intent of code is unclear, feel free to search and read related files. But do not make any changes to them.
  • Write documentation for the code file using the provided tools.

Response Format:

  • Always respond with your thoughts and reasoning for your actions in one or two sentences.
  • Use the write_file function to write the full code with documentation to the file. Do not include it in your response.
  • Once the goal is achieved, respond with a single STOP_CONVERSATION.

Information:
Certainly! Documenting a Rust code file effectively helps improve code readability and maintainability. Here are some steps to document a Rust code file idiomatically:

1. File Header Documentation

  • File Overview: At the top of the file, include a brief summary of what the file does, its purpose, and its role within the project.
//! # File Name
//! A brief description of the file, outlining its purpose and role in the project.
//! For example, "This file implements the main logic for the XYZ feature, handling input parsing
//! and output generation for the XYZ functionality."

2. Module-Level Documentation

  • Module Comment: If the file defines a module, document the module using the //! syntax.
//! This module provides utilities for handling XYZ functionality.
//! It includes functions for parsing XYZ data and generating appropriate responses.

3. Function Documentation

  • Function Overview: For each function, provide a description of what the function does.
  • Arguments and Return Values: List and describe each argument and the return value.
/// Parses the input string and returns a Result containing either the parsed
/// data or an error.
///
/// # Arguments
///
/// * `input` - A string slice that holds the input to be parsed.
///
/// # Example
///
/// ```
/// let result = parse_input("example input");
/// assert!(result.is_ok());
/// ```
///
/// # Errors
///
/// Returns an error if the input cannot be parsed.
pub fn parse_input(input: &str) -> Result<ParsedData, ParseError> {
    // function implementation
}

4. Struct and Enum Documentation

  • Type Overview: Document structs and enums with a summary of what they represent.
  • Field Descriptions: Document each field within a struct or each variant within an enum.
/// Represents a user in the system.
///
/// This struct contains the user's ID, name, and email address.
pub struct User {
    /// The unique identifier for the user.
    id: u32,
    /// The user's full name.
    name: String,
    /// The user's email address.
    email: String,
}

/// Represents possible status values for a task.
pub enum TaskStatus {
    /// The task is pending and has not been started.
    Pending,
    /// The task is currently in progress.
    InProgress,
    /// The task has been completed.
    Completed,
}

5. Trait Documentation

  • Trait Overview: Describe the trait and its intended use.
  • Method Descriptions: Document each method within the trait.
/// Defines the behavior for types that can be converted to a string.
pub trait ToString {
    /// Converts the type to a string.
    ///
    /// # Example
    ///
    /// ```
    /// let s = my_type.to_string();
    /// assert_eq!(s, "expected string");
    /// ```
    fn to_string(&self) -> String;
}

6. Using Markdown for Formatting

  • Use Markdown for formatting within documentation comments, such as headings, lists, and code blocks.
/// This function does the following:
/// 
/// 1. Parses the input.
/// 2. Validates the parsed data.
/// 3. Returns the result.
///
/// # Example
///
/// ```
/// let result = process_data("input data");
/// assert!(result.is_ok());
/// ```
pub fn process_data(input: &str) -> Result<ParsedData, ProcessError> {
    // function implementation
}

7. Linking to Documentation

  • Link to other relevant documentation using Markdown links for further reading or related references.
/// Returns an iterator over the lines of the input string.
///
/// This is a convenience function that wraps [`input.split('\n')`].
///
/// [`input.split('\n')`]: https://doc.rust-lang.org/std/string/struct.String.html#method.split
pub fn lines(input: &str) -> impl Iterator<Item = &str> {
    input.split('\n')
}

8. Deprecation Notices

  • If a function or type is deprecated, use the #[deprecated] attribute and provide an explanation.
#[deprecated(
    since = "1.2.3",
    note = "Please use `new_function` instead."
)]
/// This function is deprecated. Use `new_function` instead.
///
/// # Example
///
/// ```
/// let result = old_function();
/// assert!(result.is_ok());
/// ```
pub fn old_function() -> Result<(), ()> {
    // function implementation
}

9. Testing Documentation with rustdoc

  • Ensure that all examples in the documentation comments are correct and work as expected by running cargo test which also runs doc tests.

By following these steps, you can create well-documented, readable, and maintainable Rust code that is easy for other developers to understand and use.

Information:

Additional information

Question

  1. What does the swiftide project do? In natural language, what is the role of swiftide/src/integrations/redis/node_cache.rs in that context?
  2. Where is the swiftide/src/integrations/redis/node_cache.rs file used and how is it used there? Provide full file names, real examples and why it works the way it does.

Answer

1. What does the Swiftide project do? In natural language, what is the role of swiftide/src/integrations/redis/node_cache.rs in that context?

The Swiftide project is a fast, asynchronous file ingestion and processing system designed for Retrieval Augmented Generation (RAG) systems. It aims to improve performance, stability, and ease of use compared to Python-based tooling, reducing ingestion time from multiple tens of minutes to a few seconds by focusing on parallel processing and asynchronous operations.

In this context, the role of swiftide/src/integrations/redis/node_cache.rs is to provide caching functionality using Redis for nodes that have been processed. This helps in optimizing performance by ensuring that nodes which have already been ingested and processed are not redundantly processed again.

2. Where is the swiftide/src/integrations/redis/node_cache.rs file used and how is it used there? Provide full file names, real examples and why it works the way it does.

The swiftide/src/integrations/redis/node_cache.rs file is used within the Swiftide project, particularly in the context of ingestion pipelines to filter out previously processed nodes. This file defines the RedisNodeCache structure, which is then integrated into the ingestion pipeline.

For instance, in the example file swiftide/examples/ingest_codebase.rs, RedisNodeCache is utilized as follows:

use swiftide::{
    ingestion,
    integrations::{self, qdrant::Qdrant, redis::RedisNodeCache},
    loaders::FileLoader,
    transformers::{ChunkCode, MetadataQACode, OpenAIEmbed},
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    tracing_subscriber::fmt::init();
    
    let redis_url = std::env::var("REDIS_URL")
        .as_deref()
        .unwrap_or("redis://localhost:6379")
        .to_owned();
    
    ingestion::IngestionPipeline::from_loader(FileLoader::new(".").with_extensions(&["rs"]))
        .filter_cached(RedisNodeCache::try_from_url(redis_url, "swiftide-examples")?)
        .then(MetadataQACode::new(openai_client.clone()))
        .then_chunk(ChunkCode::try_for_language_and_chunk_size("rust", 10..2048)?)
        .then_in_batch(10, OpenAIEmbed::new(openai_client.clone()))
        .store_with(
            Qdrant::try_from_url(qdrant_url)?
                .batch_size(50)
                .vector_size(1536)
                .collection_name("swiftide-examples".to_string())
                .build()?,
        )
        .run()
        .await?;
    Ok(())
}

Here, RedisNodeCache is instantiated and applied as a filter within the pipeline using filter_cached. This ensures that the pipeline skips nodes that are already present in the cache. The get and set methods of RedisNodeCache (defined in swiftide/src/integrations/redis/node_cache.rs) check whether a node exists in the Redis cache and add new nodes to the cache, respectively.

The RedisNodeCache works effectively by maintaining a cache of node hashes, allowing the pipeline to quickly determine whether a node should be processed or skipped, thereby enhancing the overall efficiency and reducing redundancy in processing.
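The hash-keyed lookup described above can be sketched without Redis. This is a minimal illustration, not the project's implementation: `Node` and `InMemoryNodeCache` are hypothetical names, a `HashSet` stands in for the Redis `SET` and `EXISTS` commands, and `DefaultHasher` is only assumed to play the role of `calculate_hash`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for `IngestionNode`, reduced to the hashed fields.
#[derive(Hash)]
pub struct Node {
    pub path: String,
    pub chunk: String,
}

/// In-memory stand-in for the Redis-backed cache (assumption: a set of
/// keys is enough to model `SET key 1` and `EXISTS key`).
pub struct InMemoryNodeCache {
    key_prefix: String,
    seen: HashSet<String>,
}

impl InMemoryNodeCache {
    pub fn new(prefix: &str) -> Self {
        Self {
            key_prefix: prefix.to_string(),
            seen: HashSet::new(),
        }
    }

    /// Mirrors the `"<prefix>:<hash>"` layout used by `key_for_node`.
    fn key_for_node(&self, node: &Node) -> String {
        let mut hasher = DefaultHasher::new();
        node.hash(&mut hasher);
        format!("{}:{}", self.key_prefix, hasher.finish())
    }

    /// `true` means cached (skip the node), `false` means process it.
    pub fn get(&self, node: &Node) -> bool {
        self.seen.contains(&self.key_for_node(node))
    }

    /// Records the node's key so later lookups report a hit.
    pub fn set(&mut self, node: &Node) {
        let key = self.key_for_node(node);
        self.seen.insert(key);
    }
}
```

Because the key is derived from the node's content, a node whose chunk changes produces a different key and is treated as not yet cached.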


Summary:

Summary

The task is to document the Rust file swiftide/src/integrations/redis/node_cache.rs idiomatically, focusing on commenting on all functions, methods, and classes. The aim is to ensure that the documentation provides a clear, professional overview of what each part of the code does, its parameters, return values, edge case handling, and exceptions raised.

Relevant context

The Swiftide project is an asynchronous file ingestion and processing system designed for Retrieval Augmented Generation (RAG) systems, with a focus on ingestion performance. The file swiftide/src/integrations/redis/node_cache.rs provides caching functionality using Redis for nodes that have been previously processed, so that redundant processing is skipped.

Examples

In swiftide/examples/ingest_codebase.rs, RedisNodeCache is used as follows:

use swiftide::{
    ingestion,
    integrations::{self, qdrant::Qdrant, redis::RedisNodeCache},
    loaders::FileLoader,
    transformers::{ChunkCode, MetadataQACode, OpenAIEmbed},
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    tracing_subscriber::fmt::init();
    
    let redis_url = std::env::var("REDIS_URL")
        .as_deref()
        .unwrap_or("redis://localhost:6379")
        .to_owned();
    
    ingestion::IngestionPipeline::from_loader(FileLoader::new(".").with_extensions(&["rs"]))
        .filter_cached(RedisNodeCache::try_from_url(redis_url, "swiftide-examples")?)
        .then(MetadataQACode::new(openai_client.clone()))
        .then_chunk(ChunkCode::try_for_language_and_chunk_size("rust", 10..2048)?)
        .then_in_batch(10, OpenAIEmbed::new(openai_client.clone()))
        .store_with(
            Qdrant::try_from_url(qdrant_url)?
                .batch_size(50)
                .vector_size(1536)
                .collection_name("swiftide-examples".to_string())
                .build()?,
        )
        .run()
        .await?;
    Ok(())
}

In this example, RedisNodeCache is instantiated and integrated into the ingestion pipeline using filter_cached to skip nodes already present in the cache.
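The skipping behaviour of a cache filter can be sketched as follows. This is a simplified, synchronous illustration under stated assumptions: the `NodeCache` trait here is a hypothetical reduction of the real async trait, `SetCache` is an in-memory stand-in, and the choice to mark a node as seen immediately after a miss is assumed, not taken from the pipeline's source.

```rust
use std::collections::HashSet;

/// Hypothetical, synchronous simplification of the `NodeCache` trait,
/// operating on precomputed keys instead of nodes.
pub trait NodeCache {
    fn get(&self, key: &str) -> bool;
    fn set(&mut self, key: &str);
}

/// In-memory stand-in for a Redis-backed cache.
pub struct SetCache(pub HashSet<String>);

impl NodeCache for SetCache {
    fn get(&self, key: &str) -> bool {
        self.0.contains(key)
    }

    fn set(&mut self, key: &str) {
        self.0.insert(key.to_string());
    }
}

/// Returns only the keys the cache has not seen yet and records them,
/// so a second run over the same input yields nothing new.
pub fn filter_cached<C: NodeCache>(cache: &mut C, keys: Vec<String>) -> Vec<String> {
    let mut fresh = Vec::new();
    for key in keys {
        if cache.get(&key) {
            continue; // cache hit: skip this item
        }
        cache.set(&key); // assumption: mark as seen right after a miss
        fresh.push(key);
    }
    fresh
}
```

Running the filter twice with overlapping input passes the overlapping items through only on the first run.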

Expectations

  • Document all functions, methods, and classes in swiftide/src/integrations/redis/node_cache.rs.
  • Explain the purpose, parameters, return values, exceptions raised, and edge case handling for each function and method.
  • Provide an overview of the RedisNodeCache class, including its design and its role within the Swiftide project.
  • Ensure the documentation is professional, clear, and thorough.

Reflection

To ensure a perfect job:

  • Understand the role of RedisNodeCache within the Swiftide project thoroughly.
  • Be precise and detailed while documenting each part of the code.
  • Ensure the documentation is idiomatic to Rust and adheres to professional standards.

Suggested next steps

  1. Read through swiftide/src/integrations/redis/node_cache.rs to understand the code.
  2. Document each public function, method, and class, following Rust documentation guidelines.
  3. Ensure that the documentation explains the purpose, parameters, return values, exceptions, and edge cases.
  4. Verify the documentation's correctness and that it aligns with the Swiftide project's context and use.

ToolCall#call_jhrBgmnxPbhHhJGy0GcKKi7d#read_file
{"path":"swiftide/src/integrations/redis/node_cache.rs"}:
read_file

ToolResponse#call_jhrBgmnxPbhHhJGy0GcKKi7d:
use std::fmt::Debug;
use tokio::sync::RwLock;

use anyhow::{Context as _, Result};
use async_trait::async_trait;

use crate::{ingestion::IngestionNode, traits::NodeCache};

pub struct RedisNodeCache {
    client: redis::Client,
    connection_manager: RwLock<Option<redis::aio::ConnectionManager>>,
    key_prefix: String,
}

impl RedisNodeCache {
    pub fn try_from_url(url: impl AsRef<str>, prefix: impl AsRef<str>) -> Result<Self> {
        let client = redis::Client::open(url.as_ref()).context("Failed to open redis client")?;
        Ok(Self {
            client,
            connection_manager: RwLock::new(None),
            key_prefix: prefix.as_ref().to_string(),
        })
    }

    // Connectionmanager is meant to be cloned
    async fn lazy_connect(&self) -> Option<redis::aio::ConnectionManager> {
        if self.connection_manager.read().await.is_none() {
            let result = self.client.get_connection_manager().await;
            if let Err(e) = result {
                tracing::error!("Failed to get connection manager: {}", e);
                return None;
            }
            let mut cm = self.connection_manager.write().await;
            *cm = result.ok();
        }

        self.connection_manager.read().await.clone()
    }

    fn key_for_node(&self, node: &IngestionNode) -> String {
        format!("{}:{}", self.key_prefix, node.calculate_hash())
    }

    #[allow(dead_code)]
    // Testing only, super inefficient
    async fn reset_cache(&self) {
        if let Some(mut cm) = self.lazy_connect().await {
            let keys: Vec<String> = redis::cmd("KEYS")
                .arg(format!("{}:*", self.key_prefix))
                .query_async(&mut cm)
                .await
                .expect("Could not get keys");

            for key in &keys {
                let _: usize = redis::cmd("DEL")
                    .arg(key)
                    .query_async(&mut cm)
                    .await
                    .expect("Failed to reset cache");
            }
        }
    }
}

// Redis CM does not implement debug
impl Debug for RedisNodeCache {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.debug_struct("Redis")
            .field("client", &self.client)
            .finish()
    }
}

#[async_trait]
impl NodeCache for RedisNodeCache {
    // false -> not cached, expect node to be processed
    // true -> cached, expect node to be skipped
    #[tracing::instrument(skip_all, name = "node_cache.redis.get", fields(hit))]
    async fn get(&self, node: &IngestionNode) -> bool {
        let cache_result = if let Some(mut cm) = self.lazy_connect().await {
            let result = redis::cmd("EXISTS")
                .arg(self.key_for_node(node))
                .query_async(&mut cm)
                .await;

            match result {
                Ok(1) => true,
                Ok(0) => false,
                Err(e) => {
                    tracing::error!("Failed to check node cache: {}", e);
                    false
                }
                _ => {
                    tracing::error!("Unexpected response from redis");
                    false
                }
            }
        } else {
            false
        };

        tracing::Span::current().record("hit", cache_result);

        cache_result
    }

    #[tracing::instrument(skip_all, name = "node_cache.redis.get")]
    async fn set(&self, node: &IngestionNode) {
        if let Some(mut cm) = self.lazy_connect().await {
            let result: Result<(), redis::RedisError> = redis::cmd("SET")
                .arg(self.key_for_node(node))
                .arg(1)
                .query_async(&mut cm)
                .await;

            if let Err(e) = result {
                tracing::error!("Failed to set node cache: {}", e);
            }
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use std::collections::HashMap;
    use testcontainers::runners::AsyncRunner;

    #[test_log::test(tokio::test)]
    async fn test_redis_cache() {
        let redis = testcontainers::GenericImage::new("redis", "7.2.4")
            .with_exposed_port(6379)
            .with_wait_for(testcontainers::core::WaitFor::message_on_stdout(
                "Ready to accept connections",
            ))
            .start()
            .await
            .expect("Redis started");

        let host = redis.get_host().await.unwrap();
        let port = redis.get_host_port_ipv4(6379).await.unwrap();
        let cache = RedisNodeCache::try_from_url(&format!("redis://{host}:{port}"), "test")
            .expect("Could not build redis client");
        cache.reset_cache().await;

        let node = IngestionNode {
            id: Some(1),
            path: "test".into(),
            chunk: "chunk".into(),
            vector: None,
            metadata: HashMap::new(),
        };

        let before_cache = cache.get(&node).await;
        assert!(!before_cache);

        cache.set(&node).await;
        let after_cache = cache.get(&node).await;
        assert!(after_cache);
    }
}


Output:
The file swiftide/src/integrations/redis/node_cache.rs provides a Redis-based caching mechanism for nodes in the Swiftide project. The RedisNodeCache struct is the main component, which interacts with Redis to store and retrieve cached nodes, optimizing the ingestion process by skipping already processed nodes.
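One behaviour of `get` is worth stating explicitly: every failure is treated as a cache miss, so a node is reprocessed and not skipped when Redis is unreachable or replies unexpectedly. The sketch below isolates that mapping. `interpret_exists` is a hypothetical helper name, generic over the error type so it needs no Redis dependency, and it prints to stderr where the original uses `tracing::error!`.

```rust
use std::fmt::Display;

/// Maps the outcome of an `EXISTS`-style query to a cache hit or miss.
/// Anything other than a clean `1` counts as a miss (fail-open).
pub fn interpret_exists<E: Display>(reply: Result<i64, E>) -> bool {
    match reply {
        Ok(1) => true,
        Ok(0) => false,
        Ok(other) => {
            // Unexpected reply: report it and fall back to "not cached".
            eprintln!("Unexpected response from redis: {}", other);
            false
        }
        Err(e) => {
            // Query failed: report it and fall back to "not cached".
            eprintln!("Failed to check node cache: {}", e);
            false
        }
    }
}
```

The trade-off is that an outage costs redundant work but never causes a node to be dropped from the pipeline.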

Let's document the code step-by-step:

  1. Top-level Documentation: Provide an overview of the file and its purpose.
  2. Struct Documentation: Document the RedisNodeCache struct, explaining its fields and their roles.
  3. Function Documentation: Document each function and method, explaining their purpose, parameters, return values, exceptions, and edge cases.

Let's proceed with documenting the file.

use std::fmt::Debug;
use tokio::sync::RwLock;

use anyhow::{Context as _, Result};
use async_trait::async_trait;

use crate::{ingestion::IngestionNode, traits::NodeCache};

/// `RedisNodeCache` provides a caching mechanism for nodes using Redis.
/// It helps in optimizing the ingestion process by skipping nodes that have already been processed.
/// 
/// # Fields
/// 
/// * `client` - The Redis client used to interact with the Redis server.
/// * `connection_manager` - Manages the Redis connections asynchronously.
/// * `key_prefix` - A prefix used for keys stored in Redis to avoid collisions.
pub struct RedisNodeCache {
    client: redis::Client,
    connection_manager: RwLock<Option<redis::aio::ConnectionManager>>,
    key_prefix: String,
}

impl RedisNodeCache {
    /// Creates a new `RedisNodeCache` instance from a given Redis URL and key prefix.
    /// 
    /// # Parameters
    /// 
    /// * `url` - The URL of the Redis server.
    /// * `prefix` - The prefix to be used for keys stored in Redis.
    /// 
    /// # Returns
    /// 
    /// A `Result` containing the `RedisNodeCache` instance or an error if the client could not be created.
    /// 
    /// # Errors
    /// 
    /// Returns an error if the Redis client cannot be opened.
    pub fn try_from_url(url: impl AsRef<str>, prefix: impl AsRef<str>) -> Result<Self> {
        let client = redis::Client::open(url.as_ref()).context("Failed to open redis client")?;
        Ok(Self {
            client,
            connection_manager: RwLock::new(None),
            key_prefix: prefix.as_ref().to_string(),
        })
    }

    /// Lazily connects to the Redis server and returns the connection manager.
    /// 
    /// # Returns
    /// 
    /// An `Option` containing the `ConnectionManager` if the connection is successful, or `None` if it fails.
    /// 
    /// # Errors
    /// 
    /// Logs an error and returns `None` if the connection manager cannot be obtained.
    async fn lazy_connect(&self) -> Option<redis::aio::ConnectionManager> {
        if self.connection_manager.read().await.is_none() {
            let result = self.client.get_connection_manager().await;
            if let Err(e) = result {
                tracing::error!("Failed to get connection manager: {}", e);
                return None;
            }
            let mut cm = self.connection_manager.write().await;
            *cm = result.ok();
        }

        self.connection_manager.read().await.clone()
    }

    /// Generates a Redis key for a given node using the key prefix and the node's hash.
    /// 
    /// # Parameters
    /// 
    /// * `node` - The node for which the key is to be generated.
    /// 
    /// # Returns
    /// 
    /// A `String` representing the Redis key for the node.
    fn key_for_node(&self, node: &IngestionNode) -> String {
        format!("{}:{}", self.key_prefix, node.calculate_hash())
    }

    /// Resets the cache by deleting all keys with the specified prefix.
    /// This function is intended for testing purposes and is inefficient for production use.
    /// 
    /// # Errors
    /// 
    /// Panics if the keys cannot be retrieved or deleted.
    #[allow(dead_code)]
    async fn reset_cache(&self) {
        if let Some(mut cm) = self.lazy_connect().await {
            let keys: Vec<String> = redis::cmd("KEYS")
                .arg(format!("{}:*", self.key_prefix))
                .query_async(&mut cm)
                .await
                .expect("Could not get keys");

            for key in &keys {
                let _: usize = redis::cmd("DEL")
                    .arg(key)
                    .query_async(&mut cm)
                    .await
                    .expect("Failed to reset cache");
            }
        }
    }
}

// Redis CM does not implement debug
impl Debug for RedisNodeCache {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.debug_struct("Redis")
            .field("client", &self.client)
            .finish()
    }
}

#[async_trait]
impl NodeCache for RedisNodeCache {
    /// Checks if a node is present in the cache.
    /// 
    /// # Parameters
    /// 
    /// * `node` - The node to be checked in the cache.
    /// 
    /// # Returns
    /// 
    /// `true` if the node is present in the cache, `false` otherwise.
    /// 
    /// # Errors
    /// 
    /// Logs an error and returns `false` if the cache check fails.
    #[tracing::instrument(skip_all, name = "node_cache.redis.get", fields(hit))]
    async fn get(&self, node: &IngestionNode) -> bool {
        let cache_result = if let Some(mut cm) = self.lazy_connect().await {
            let result = redis::cmd("EXISTS")
                .arg(self.key_for_node(node))
                .query_async(&mut cm)
                .await;

            match result {
                Ok(1) => true,
                Ok(0) => false,
                Err(e) => {
                    tracing::error!("Failed to check node cache: {}", e);
                    false
                }
                _ => {
                    tracing::error!("Unexpected response from redis");
                    false
                }
            }
        } else {
            false
        };

        tracing::Span::current().record("hit", cache_result);

        cache_result
    }

    /// Sets a node in the cache.
    /// 
    /// # Parameters
    /// 
    /// * `node` - The node to be set in the cache.
    /// 
    /// # Errors
    /// 
    /// Logs an error if the node cannot be set in the cache.
    #[tracing::instrument(skip_all, name = "node_cache.redis.get")]
    async fn set(&self, node: &IngestionNode) {
        if let Some(mut cm) = self.lazy_connect().await {
            let result: Result<(), redis::RedisError> = redis::cmd("SET")
                .arg(self.key_for_node(node))
                .arg(1)
                .query_async(&mut cm)
                .await;

            if let Err(e) = result {
                tracing::error!("Failed to set node cache: {}", e);
            }
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use std::collections::HashMap;
    use testcontainers::runners::AsyncRunner;

    /// Tests the `RedisNodeCache` implementation.
    #[test_log::test(tokio::test)]
    async fn test_redis_cache() {
        let redis = testcontainers::GenericImage::new("redis", "7.2.4")
            .with_exposed_port(6379)
            .with_wait_for(testcontainers::core::WaitFor::message_on_stdout(
                "Ready to accept connections",
            ))
            .start()
            .await
            .expect("Redis started");

        let host = redis.get_host().await.unwrap();
        let port = redis.get_host_port_ipv4(6379).await.unwrap();
        let cache = RedisNodeCache::try_from_url(&format!("redis://{host}:{port}"), "test")
            .expect("Could not build redis client");
        cache.reset_cache().await;

        let node = IngestionNode {
            id: Some(1),
            path: "test".into(),
            chunk: "chunk".into(),
            vector: None,
            metadata: HashMap::new(),
        };

        let before_cache = cache.get(&node).await;
        assert!(!before_cache);

        cache.set(&node).await;
        let after_cache = cache.get(&node).await;
        assert!(after_cache);
    }
}
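
The `get`/`set` contract that the test above exercises can be sketched without a Redis server. This is only an illustration under stated assumptions: `SketchNode` and `InMemoryNodeCache` are hypothetical stand-ins that do not exist in swiftide, and the standard library's `DefaultHasher` replaces `calculate_hash`, whose implementation is not shown here. The sketch keeps the same prefix-plus-hash key scheme, so a node is reported as cached only after `set` has been called for it under the same prefix.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for `IngestionNode`, reduced to two hashed fields.
#[derive(Hash)]
struct SketchNode {
    path: String,
    chunk: String,
}

/// Hypothetical in-memory cache that mirrors the prefix-plus-hash key scheme.
struct InMemoryNodeCache {
    key_prefix: String,
    keys: HashSet<String>,
}

impl InMemoryNodeCache {
    fn new(prefix: &str) -> Self {
        Self {
            key_prefix: prefix.to_string(),
            keys: HashSet::new(),
        }
    }

    /// Builds "<prefix>:<hash>"; the hash function is an assumption of this sketch.
    fn key_for_node(&self, node: &SketchNode) -> String {
        let mut hasher = DefaultHasher::new();
        node.hash(&mut hasher);
        format!("{}:{}", self.key_prefix, hasher.finish())
    }

    /// Returns true only if `set` was called earlier for an equal node.
    fn get(&self, node: &SketchNode) -> bool {
        self.keys.contains(&self.key_for_node(node))
    }

    /// Records the node's key; the stored value itself carries no information.
    fn set(&mut self, node: &SketchNode) {
        let key = self.key_for_node(node);
        self.keys.insert(key);
    }
}
```

Because the prefix is part of every key, two caches built with different prefixes never share entries, which is what lets a test-only reset target a single prefix.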

Now, let's write this documented code back to the file.

ToolCall#call_1QVmDYPbsdOPtVKOEmldaCtV#write_file
{"path":"swiftide/src/integrations/redis/node_cache.rs","content":"use std::fmt::Debug;\nuse tokio::sync::RwLock;\n\nuse anyhow::{Context as _, Result};\nuse async_trait::async_trait;\n\nuse crate::{ingestion::IngestionNode, traits::NodeCache};\n\n/// RedisNodeCache provides a caching mechanism for nodes using Redis.\n/// It helps in optimizing the ingestion process by skipping nodes that have already been processed.\n/// \n/// # Fields\n/// \n/// * client - The Redis client used to interact with the Redis server.\n/// * connection_manager - Manages the Redis connections asynchronously.\n/// * key_prefix - A prefix used for keys stored in Redis to avoid collisions.\npub struct RedisNodeCache {\n client: redis::Client,\n connection_manager: RwLock<Optionredis::aio::ConnectionManager>,\n key_prefix: String,\n}\n\nimpl RedisNodeCache {\n /// Creates a new RedisNodeCache instance from a given Redis URL and key prefix.\n /// \n /// # Parameters\n /// \n /// * url - The URL of the Redis server.\n /// * prefix - The prefix to be used for keys stored in Redis.\n /// \n /// # Returns\n /// \n /// A Result containing the RedisNodeCache instance or an error if the client could not be created.\n /// \n /// # Errors\n /// \n /// Returns an error if the Redis client cannot be opened.\n pub fn try_from_url(url: impl AsRef, prefix: impl AsRef) -> Result {\n let client = redis::Client::open(url.as_ref()).context("Failed to open redis client")?;\n Ok(Self {\n client,\n connection_manager: RwLock::new(None),\n key_prefix: prefix.as_ref().to_string(),\n })\n }\n\n /// Lazily connects to the Redis server and returns the connection manager.\n /// \n /// # Returns\n /// \n /// An Option containing the ConnectionManager if the connection is successful, or None if it fails.\n /// \n /// # Errors\n /// \n /// Logs an error and returns None if the connection manager cannot be obtained.\n async fn lazy_connect(&self) -> Optionredis::aio::ConnectionManager {\n if 
self.connection_manager.read().await.is_none() {\n let result = self.client.get_connection_manager().await;\n if let Err(e) = result {\n tracing::error!("Failed to get connection manager: {}", e);\n return None;\n }\n let mut cm = self.connection_manager.write().await;\n cm = result.ok();\n }\n\n self.connection_manager.read().await.clone()\n }\n\n /// Generates a Redis key for a given node using the key prefix and the node's hash.\n /// \n /// # Parameters\n /// \n /// * node - The node for which the key is to be generated.\n /// \n /// # Returns\n /// \n /// A String representing the Redis key for the node.\n fn key_for_node(&self, node: &IngestionNode) -> String {\n format!("{}:{}", self.key_prefix, node.calculate_hash())\n }\n\n /// Resets the cache by deleting all keys with the specified prefix.\n /// This function is intended for testing purposes and is inefficient for production use.\n /// \n /// # Errors\n /// \n /// Panics if the keys cannot be retrieved or deleted.\n #[allow(dead_code)]\n async fn reset_cache(&self) {\n if let Some(mut cm) = self.lazy_connect().await {\n let keys: Vec = redis::cmd("KEYS")\n .arg(format!("{}:", self.key_prefix))\n .query_async(&mut cm)\n .await\n .expect("Could not get keys");\n\n for key in &keys {\n let : usize = redis::cmd("DEL")\n .arg(key)\n .query_async(&mut cm)\n .await\n .expect("Failed to reset cache");\n }\n }\n }\n}\n\n// Redis CM does not implement debug\nimpl Debug for RedisNodeCache {\n fn fmt(&self, f: &mut std::fmt::Formatter<'>) -> std::fmt::Result {\n f.debug_struct("Redis")\n .field("client", &self.client)\n .finish()\n }\n}\n\n#[async_trait]\nimpl NodeCache for RedisNodeCache {\n /// Checks if a node is present in the cache.\n /// \n /// # Parameters\n /// \n /// * node - The node to be checked in the cache.\n /// \n /// # Returns\n /// \n /// true if the node is present in the cache, false otherwise.\n /// \n /// # Errors\n /// \n /// Logs an error and returns false if the cache check fails.\n 
#[tracing::instrument(skip_all, name = "node_cache.redis.get", fields(hit))]\n async fn get(&self, node: &IngestionNode) -> bool {\n let cache_result = if let Some(mut cm) = self.lazy_connect().await {\n let result = redis::cmd("EXISTS")\n .arg(self.key_for_node(node))\n .query_async(&mut cm)\n .await;\n\n match result {\n Ok(1) => true,\n Ok(0) => false,\n Err(e) => {\n tracing::error!("Failed to check node cache: {}", e);\n false\n }\n _ => {\n tracing::error!("Unexpected response from redis");\n false\n }\n }\n } else {\n false\n };\n\n tracing::Span::current().record("hit", cache_result);\n\n cache_result\n }\n\n /// Sets a node in the cache.\n /// \n /// # Parameters\n /// \n /// * node - The node to be set in the cache.\n /// \n /// # Errors\n /// \n /// Logs an error if the node cannot be set in the cache.\n #[tracing::instrument(skip_all, name = "node_cache.redis.get")]\n async fn set(&self, node: &IngestionNode) {\n if let Some(mut cm) = self.lazy_connect().await {\n let result: Result<(), redis::RedisError> = redis::cmd("SET")\n .arg(self.key_for_node(node))\n .arg(1)\n .query_async(&mut cm)\n .await;\n\n if let Err(e) = result {\n tracing::error!("Failed to set node cache: {}", e);\n }\n }\n }\n}\n\n#[cfg(test)]\nmod tests {\n use super::*;\n use std::collections::HashMap;\n use testcontainers::runners::AsyncRunner;\n\n /// Tests the RedisNodeCache implementation.\n #[test_log::test(tokio::test)]\n async fn test_redis_cache() {\n let redis = testcontainers::GenericImage::new("redis", "7.2.4")\n .with_exposed_port(6379)\n .with_wait_for(testcontainers::core::WaitFor::message_on_stdout(\n "Ready to accept connections",\n ))\n .start()\n .await\n .expect("Redis started");\n\n let host = redis.get_host().await.unwrap();\n let port = redis.get_host_port_ipv4(6379).await.unwrap();\n let cache = RedisNodeCache::try_from_url(&format!("redis://{host}:{port}"), "test")\n .expect("Could not build redis client");\n cache.reset_cache().await;\n\n let node = 
IngestionNode {\n id: Some(1),\n path: "test".into(),\n chunk: "chunk".into(),\n vector: None,\n metadata: HashMap::new(),\n };\n\n let before_cache = cache.get(&node).await;\n assert!(!before_cache);\n\n cache.set(&node).await;\n let after_cache = cache.get(&node).await;\n assert!(after_cache);\n }\n}\n"}:
write_file

ToolResponse#call_1QVmDYPbsdOPtVKOEmldaCtV:
Successfully wrote file to swiftide/src/integrations/redis/node_cache.rs

Output:
STOP_CONVERSATION

@timonv timonv merged commit 29fce74 into master Jun 13, 2024
1 check passed
@timonv timonv deleted the generated/35e80a09-23aa-471b-af2e-6723b2041941 branch June 13, 2024 19:22