Speed Up the Leaderboard #60

Merged (2 commits) on Aug 26, 2024


neilnaveen
Member

@neilnaveen neilnaveen commented Aug 26, 2024

  • Changed the leaderboard to push results onto a heap instead of collecting all the data and sorting it afterward
  • Batch-queried all data used by ParseAndExecute, which speeds up the leaderboard considerably

Summary by CodeRabbit

  • New Features

    • Introduced batch retrieval methods for nodes and caches to enhance data handling efficiency.
    • Added new methods for managing cache retrieval in both MockStorage and RedisStorage, improving performance.
    • Enhanced ParseAndExecute function for improved handling of complex node relationships.
  • Bug Fixes

    • Improved error handling for cache retrieval processes to ensure failures are reported accurately.
  • Refactor

    • Updated the control flow in the Run method to utilize a heap structure for sorting queries based on output lengths, improving performance.
    • Refactored querying logic to support batch processing for better performance and simplified error handling.

Contributor

coderabbitai bot commented Aug 26, 2024

Walkthrough

The changes involve significant enhancements to data retrieval and query processing across the codebase. Key modifications include the introduction of batch retrieval methods for nodes and caches, optimizations of the Run methods, and improvements in cache management strategies. New data structures, such as queryHeap, were added for efficient sorting, while error handling was enhanced in multiple components. Overall, the updates streamline operations and improve performance.

Changes

| Files | Change Summary |
|-------|----------------|
| cmd/leaderboard/custom/custom.go | Modified Run method for batch node retrieval, introduced GetCaches, and implemented a heap for sorting query results. |
| cmd/query/query.go | Enhanced Run method to retrieve all keys, nodes, and caches with improved error handling during cache retrieval. |
| pkg/graph/graph.go | Added BatchQueryDependents and BatchQueryDependencies for efficient batch processing of dependents and dependencies. |
| pkg/graph/mockGraph.go | Introduced GetCaches method in MockStorage for batch retrieval of caches. |
| pkg/graph/parser.go | Modified ParseAndExecute to accept additional parameters for nodes and caches, and refactored querying logic for dependents and dependencies. |
| pkg/graph/storage.go | Added GetCaches method to the Storage interface for batch cache retrieval. |
| pkg/storages/redis_storage.go | Changed cache management methods to utilize lists instead of sets; added GetCaches method for efficient batch retrieval using Redis pipelines. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Command
    participant Storage
    participant Cache

    User->>Command: Execute Run()
    Command->>Storage: GetNodes()
    Storage-->>Command: Return Nodes
    Command->>Cache: GetCaches()
    Cache-->>Command: Return Caches
    Command->>Command: Process Queries
    Command->>User: Return Results

🐇 In the world of code, so bright and new,
Changes abound, like morning dew.
Nodes and caches, now gathered in packs,
Efficiency blooms, no looking back!
With each little tweak, our code hops along,
A dance of improvement, a sweet, joyful song!
🌼✨


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between c8dbd67 and e059c03.

Files selected for processing (8)
  • cmd/leaderboard/custom/custom.go (3 hunks)
  • cmd/query/query.go (2 hunks)
  • pkg/graph/graph.go (3 hunks)
  • pkg/graph/mockGraph.go (1 hunks)
  • pkg/graph/parser.go (3 hunks)
  • pkg/graph/parser_test.go (3 hunks)
  • pkg/graph/storage.go (1 hunks)
  • pkg/storages/redis_storage.go (3 hunks)
Additional context used
golangci-lint
cmd/query/query.go

37-37: ineffectual assignment to err

(ineffassign)


39-39: ineffectual assignment to err

(ineffassign)
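
For context, ineffassign reports an `err` value that is overwritten before anything reads it. The sketch below shows the usual remedy, checking each error right after the call that produced it. The functions `loadKeys`, `loadNodes`, and `run` are hypothetical stand-ins and are not taken from cmd/query/query.go.

```go
package main

import (
	"errors"
	"fmt"
)

// loadKeys and loadNodes are hypothetical stand-ins for storage calls.
func loadKeys() ([]uint32, error) { return []uint32{1, 2}, nil }

func loadNodes(keys []uint32) (map[uint32]string, error) {
	if len(keys) == 0 {
		return nil, errors.New("no keys")
	}
	nodes := make(map[uint32]string, len(keys))
	for _, k := range keys {
		nodes[k] = fmt.Sprintf("node-%d", k)
	}
	return nodes, nil
}

// run checks every error immediately, so no assignment to err is
// overwritten before it is read (the pattern ineffassign reports).
func run() (int, error) {
	keys, err := loadKeys()
	if err != nil {
		return 0, fmt.Errorf("failed to get keys: %w", err)
	}
	nodes, err := loadNodes(keys)
	if err != nil {
		return 0, fmt.Errorf("failed to get nodes: %w", err)
	}
	return len(nodes), nil
}

func main() {
	n, err := run()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("nodes loaded:", n)
}
```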

Additional comments not posted (25)
pkg/graph/storage.go (1)

15-15: LGTM!

The addition of the GetCaches method to the Storage interface is well-implemented and aligns with the existing methods. This will enhance the functionality by allowing batch retrieval of cache entries.

The code changes are approved.

pkg/graph/parser_test.go (3)

42-44: Verify the impact of reordering the Cache function call.

The Cache function call is moved to a later point in the test sequence. Ensure that this reordering does not affect the state of the storage variable before the main execution logic is tested.

Verify the test results to ensure that the reordering does not introduce any issues.


87-98: LGTM!

The changes to the ParseAndExecute function signature and the retrieval of keys, nodes, and caches from the storage object enhance the function's ability to handle more complex scenarios during execution.

The code changes are approved.


99-99: LGTM!

The update to the ParseAndExecute function call ensures that the function has more context about the data it operates on, improving its robustness and error handling.

The code changes are approved.

cmd/query/query.go (1)

41-49: LGTM!

The changes retrieve the caches associated with the keys, add error handling for the cache retrieval, and update the ParseAndExecute call. Passing the preloaded data gives ParseAndExecute more context about what it operates on, which improves its robustness and error handling.

The code changes are approved.

cmd/leaderboard/custom/custom.go (8)

4-4: LGTM!

The import statement for container/heap is necessary for the heap operations.

The code changes are approved.


47-50: LGTM!

Batch retrieval of nodes using GetNodes enhances performance by reducing the number of calls to the storage layer.

The code changes are approved.


52-55: LGTM!

Batch retrieval of caches using GetCaches enhances performance by reducing the number of calls to the storage layer.

The code changes are approved.


62-63: LGTM!

Initialization of queryHeap using heap.Init is necessary for managing heap operations.

The code changes are approved.


70-70: LGTM!

The additional parameters in the ParseAndExecute call allow for more informed decision-making during query execution.

The code changes are approved.


75-77: LGTM!

Using a heap structure for sorting queries based on the length of their outputs is efficient.

The code changes are approved.


80-83: LGTM!

Popping elements from the heap ensures that the queries are ordered correctly based on their output lengths.

The code changes are approved.


125-142: LGTM!

Encapsulating heap operations within the queryHeap type enhances code readability and maintainability.

The code changes are approved.
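
As an illustration of the heap-based ordering discussed for custom.go, here is a minimal sketch built on the standard `container/heap` package. The `queryResult` fields and the `topN` helper are assumptions made for this example; only the type name `queryHeap` appears in the review, and its real definition may differ.

```go
package main

import (
	"container/heap"
	"fmt"
)

// queryResult is a hypothetical record: a node name plus the size of
// the output its query produced.
type queryResult struct {
	name      string
	outputLen int
}

// queryHeap is a max-heap ordered by outputLen (assumed ordering).
type queryHeap []queryResult

func (h queryHeap) Len() int           { return len(h) }
func (h queryHeap) Less(i, j int) bool { return h[i].outputLen > h[j].outputLen }
func (h queryHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *queryHeap) Push(x any)        { *h = append(*h, x.(queryResult)) }
func (h *queryHeap) Pop() any {
	old := *h
	n := len(old)
	item := old[n-1]
	*h = old[:n-1]
	return item
}

// topN pushes every result onto the heap and pops at most n of them,
// largest output first.
func topN(results []queryResult, n int) []queryResult {
	h := &queryHeap{}
	heap.Init(h)
	for _, r := range results {
		heap.Push(h, r)
	}
	out := make([]queryResult, 0, n)
	for h.Len() > 0 && len(out) < n {
		out = append(out, heap.Pop(h).(queryResult))
	}
	return out
}

func main() {
	results := []queryResult{{"a", 3}, {"b", 10}, {"c", 7}}
	for _, r := range topN(results, 2) {
		fmt.Println(r.name, r.outputLen)
	}
}
```

Popping only the first n entries costs O(n log m) for m results, which is where a heap can beat a full sort when the leaderboard shows only a small prefix.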

pkg/graph/mockGraph.go (1)

145-159: LGTM!

The GetCaches method is correctly implemented and enhances the functionality of MockStorage by allowing retrieval of multiple caches based on their IDs.

The code changes are approved.
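
A batch cache lookup on an in-memory store could look roughly like the sketch below. The `NodeCache` fields, the `BatchCacheStorage` interface, the `GetCaches` signature, and the choice to return an error for a missing ID are all assumptions for illustration; the review does not show the actual Storage interface or MockStorage code.

```go
package main

import (
	"fmt"
	"sync"
)

// NodeCache is a hypothetical cache record; the real fields are not
// shown in this review.
type NodeCache struct {
	NodeID     uint32
	Dependents []uint32
}

// BatchCacheStorage is an assumed slice of the Storage interface.
type BatchCacheStorage interface {
	GetCaches(ids []uint32) (map[uint32]*NodeCache, error)
}

// memoryStorage is an in-memory stand-in, similar in spirit to a mock.
type memoryStorage struct {
	mu     sync.RWMutex
	caches map[uint32]*NodeCache
}

// Compile-time check that memoryStorage satisfies the interface.
var _ BatchCacheStorage = (*memoryStorage)(nil)

// GetCaches returns the caches for all ids in one call and reports
// the first missing ID as an error (an assumed policy).
func (m *memoryStorage) GetCaches(ids []uint32) (map[uint32]*NodeCache, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	out := make(map[uint32]*NodeCache, len(ids))
	for _, id := range ids {
		c, ok := m.caches[id]
		if !ok {
			return nil, fmt.Errorf("cache for node %d not found", id)
		}
		out[id] = c
	}
	return out, nil
}

func main() {
	s := &memoryStorage{caches: map[uint32]*NodeCache{
		1: {NodeID: 1, Dependents: []uint32{2}},
		2: {NodeID: 2},
	}}
	caches, err := s.GetCaches([]uint32{1, 2})
	fmt.Println(len(caches), err)
}
```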

pkg/graph/parser.go (6)

10-11: LGTM!

The additional parameters in the ParseAndExecute function signature allow for more complex operations involving node management and caching.

The code changes are approved.


15-18: LGTM!

The nameToIDs mapping facilitates the retrieval of node IDs based on their names, streamlining the process of querying dependents and dependencies.

The code changes are approved.
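
A name-to-ID index of the kind described here can be built once from the preloaded nodes and then consulted without further storage calls. This is a sketch under assumptions: the `Node` fields, the `map[string]uint32` shape, and the helper names `buildNameToIDs` and `lookup` are hypothetical, and the example names are made up.

```go
package main

import "fmt"

// Node is a hypothetical, trimmed-down node record.
type Node struct {
	ID   uint32
	Name string
}

// buildNameToIDs indexes preloaded nodes by name so that later lookups
// need no storage call.
func buildNameToIDs(nodes map[uint32]*Node) map[string]uint32 {
	nameToIDs := make(map[string]uint32, len(nodes))
	for id, n := range nodes {
		nameToIDs[n.Name] = id
	}
	return nameToIDs
}

// lookup resolves a name and returns a descriptive error when the
// name is unknown.
func lookup(nameToIDs map[string]uint32, name string) (uint32, error) {
	id, ok := nameToIDs[name]
	if !ok {
		return 0, fmt.Errorf("no node named %q", name)
	}
	return id, nil
}

func main() {
	nodes := map[uint32]*Node{
		1: {ID: 1, Name: "lib-a"},
		2: {ID: 2, Name: "lib-b"},
	}
	idx := buildNameToIDs(nodes)
	id, err := lookup(idx, "lib-b")
	fmt.Println(id, err)
}
```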


52-73: LGTM!

The handling of tokens prefixed with "dependents" or "dependencies" introduces a more structured approach to managing node relationships.

The code changes are approved.


76-83: LGTM!

Utilizing batch queries for dependents and dependencies reduces the number of individual queries made to the storage backend, improving performance.

The code changes are approved.
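
To make the round-trip argument concrete, the sketch below counts calls against a hypothetical in-memory backend. `countingStore`, `Get`, `GetMany`, `perKey`, and `batched` are invented for this illustration and do not correspond to functions in the PR.

```go
package main

import "fmt"

// countingStore is a hypothetical backend that counts round trips.
type countingStore struct {
	data  map[uint32][]uint32 // node ID -> dependent IDs
	calls int
}

// Get fetches one entry per call.
func (s *countingStore) Get(id uint32) []uint32 {
	s.calls++
	return s.data[id]
}

// GetMany fetches every requested entry in a single call.
func (s *countingStore) GetMany(ids []uint32) map[uint32][]uint32 {
	s.calls++
	out := make(map[uint32][]uint32, len(ids))
	for _, id := range ids {
		out[id] = s.data[id]
	}
	return out
}

// perKey totals dependents using one call per node.
func perKey(s *countingStore, ids []uint32) int {
	total := 0
	for _, id := range ids {
		total += len(s.Get(id))
	}
	return total
}

// batched totals dependents using a single batch call.
func batched(s *countingStore, ids []uint32) int {
	total := 0
	for _, deps := range s.GetMany(ids) {
		total += len(deps)
	}
	return total
}

func main() {
	s := &countingStore{data: map[uint32][]uint32{1: {2, 3}, 2: {3}, 3: {}}}
	ids := []uint32{1, 2, 3}

	perKey(s, ids)
	fmt.Println("per-key calls:", s.calls)

	s.calls = 0
	batched(s, ids)
	fmt.Println("batched calls:", s.calls)
}
```

Both paths compute the same total; the difference is that the per-key path pays one backend call per node, which dominates when the backend is a network service such as Redis.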


112-113: LGTM!

Filtering nodes based on their type ensures that only relevant nodes are included in the query results.

The code changes are approved.
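
Type-based filtering of this kind can be sketched as a single pass over the candidate nodes. The `Node` fields, the type strings, and the `filterByType` helper are assumptions for illustration; the review does not show how parser.go represents node types.

```go
package main

import "fmt"

// Node is a hypothetical node record with an assumed Type field.
type Node struct {
	ID   uint32
	Type string
}

// filterByType keeps only the nodes whose Type equals want,
// preserving the input order.
func filterByType(nodes []Node, want string) []Node {
	out := make([]Node, 0, len(nodes))
	for _, n := range nodes {
		if n.Type == want {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	nodes := []Node{
		{ID: 1, Type: "library"},
		{ID: 2, Type: "vuln"},
		{ID: 3, Type: "library"},
	}
	for _, n := range filterByType(nodes, "library") {
		fmt.Println(n.ID, n.Type)
	}
}
```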


100-100: LGTM!

Error handling and operator application are crucial for the correct execution of the script.

The code changes are approved.

Also applies to: 103-105

pkg/storages/redis_storage.go (3)

Line range hint 102-113: LGTM!

The function correctly retrieves and parses data from a list using LRANGE. The error handling is appropriate.

The code changes are approved.


121-123: LGTM!

The function correctly adds data to a list using RPUSH. The error handling is appropriate.

The code changes are approved.


202-233: LGTM!

The function efficiently retrieves multiple cache entries using a Redis pipeline. The error handling and data parsing are appropriate.

The code changes are approved.

pkg/graph/graph.go (2)

268-285: LGTM!

The function correctly handles batch queries for dependents, with appropriate error handling and result collection.

The code changes are approved.


305-322: LGTM!

The function correctly handles batch queries for dependencies, with appropriate error handling and result collection.

The code changes are approved.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between e059c03 and 4783aba.

Files selected for processing (1)
  • cmd/query/query.go (2 hunks)
Additional comments not posted (6)
cmd/query/query.go (6)

14-14: Import statement for github.com/goccy/go-json added.

The import statement for github.com/goccy/go-json has been added. Ensure that this library is necessary and used appropriately in the code.

Verify that the github.com/goccy/go-json library is necessary and used appropriately in the code.


37-40: Retrieve all keys from storage and handle errors.

The code retrieves all keys from the storage and handles errors appropriately. This is a good practice to ensure that the subsequent operations have the necessary data.

The code changes are approved.


42-45: Retrieve nodes from storage and handle errors.

The code retrieves nodes from the storage using the keys and handles errors appropriately. This ensures that the necessary data is available for the subsequent operations.

The code changes are approved.


47-50: Retrieve caches from storage and handle errors with detailed messages.

The code retrieves caches from the storage using the keys and handles errors with detailed messages. This improves the robustness and debuggability of the code.

The code changes are approved.


51-54: Retrieve cache stack from storage and handle errors.

The code retrieves the cache stack from the storage and handles errors appropriately. This ensures that the necessary data is available for the subsequent operations.

The code changes are approved.


55-57: Update ParseAndExecute function call with additional context.

The ParseAndExecute function call has been updated to include nodes, caches, and a boolean indicating whether the cache stack is empty. This enhances the method's capability to handle more complex scenarios by providing it with more context about the data it operates on.

The code changes are approved.

@neilnaveen neilnaveen merged commit 28e5730 into main Aug 26, 2024
1 check passed
@neilnaveen neilnaveen deleted the neil/speedUpLeaderboard branch August 26, 2024 21:33