feat: Add data visualization for Anthropic #432

Open
wants to merge 12 commits into
base: dev

Conversation

@Vasilije1990 Vasilije1990 commented Jan 10, 2025

Summary by CodeRabbit

  • New Features

    • Added visualization tool for knowledge graph.
    • Introduced functions for named entity recognition and part-of-speech tagging.
  • Improvements

    • Enhanced server's image processing capabilities.
    • Added functionality to retrieve the most recent PNG file.
    • Integrated image handling with PIL library.
    • Enabled saving graph visualizations as PNG images with timestamps.
    • Updated documentation with debugging and installation reset instructions.
  • Dependency Updates

    • Updated various dependencies to newer versions.
    • Specified minimum Python version requirement.


coderabbitai bot commented Jan 10, 2025

Walkthrough

The pull request introduces enhanced image processing and visualization capabilities for the Cognee knowledge graph system. The changes span several files, primarily cognee_mcp/server.py and cognee/shared/utils.py. Modifications include the addition of a new "visualize" tool for graph visualization, the ability to save visualizations as timestamped PNG files, and a function to retrieve the most recently created PNG file. Additionally, updates to the README and pyproject.toml ensure proper documentation and dependency management.

Changes

| File | Change Summary |
|---|---|
| cognee_mcp/server.py | Added get_freshest_png() function to retrieve the latest PNG file; introduced "visualize" tool in handle_list_tools; updated handle_call_tool to support visualization |
| cognee/shared/utils.py | Added get_entities() and extract_pos_tags() functions; updated create_cognee_style_network_with_logo() for PNG export |
| cognee-mcp/README.md | Added debugging and installation reset instructions |
| cognee-mcp/pyproject.toml | Updated Python version requirement and several package dependencies |
| cognee/__init__.py | Replaced import of visualize with visualize_graph |
| pyproject.toml | Specified version for fastapi-users and removed nltk dependency |
| .github/workflows/dockerhub.yml | Updated workflow name and steps for building and pushing Docker images |

Possibly related PRs

  • Code graph pipeline improvements and fixes #414: Modifications to extract_chunks_from_documents function include a max_tokens parameter, similar to the new max_tokens parameter added in the get_freshest_png function in the main PR.
  • Feature/cog 971 preparing swe bench run #424: This PR also modifies the extract_chunks_from_documents function to include a max_tokens parameter, aligning with the changes made in the main PR regarding token management in the get_freshest_png function.

Suggested labels

run-checks

Suggested reviewers

  • hajdul88

Poem

🐰 A rabbit's tale of graphs so bright,
Visualizing knowledge with pure delight,
PNG images, fresh and new,
Timestamps dancing, a graphing breakthrough!
Cognee's magic, now clear to see! 🖼️



@Vasilije1990 Vasilije1990 changed the title Add data visualization for Anthropic feat: Add data visualization for Anthropic Jan 10, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e983c21 and daf2d54.

📒 Files selected for processing (2)
  • cognee-mcp/cognee_mcp/server.py (3 hunks)
  • cognee/shared/utils.py (3 hunks)
🧰 Additional context used
🪛 GitHub Actions: ruff format
cognee-mcp/cognee_mcp/server.py

[warning] File requires formatting. Code does not conform to Ruff formatting standards.

🔇 Additional comments (1)
cognee-mcp/cognee_mcp/server.py (1)

92-101: LGTM: Tool definition looks good

The "visualize" tool is properly defined with a clear description and appropriate schema.

🧰 Tools
🪛 GitHub Actions: ruff format

[warning] File requires formatting. Code does not conform to Ruff formatting standards.

Comment on lines 13 to 14
from PIL import Image
from PIL import Image as PILImage

🛠️ Refactor suggestion

Remove duplicate PIL Image import

The Image module from PIL is imported twice with different aliases. This is redundant and could cause confusion.

-from PIL import Image
-from PIL import Image as PILImage
+from PIL import Image

cognee-mcp/cognee_mcp/server.py Outdated Show resolved Hide resolved
cognee-mcp/cognee_mcp/server.py Outdated Show resolved Hide resolved
Comment on lines 447 to 454
# Get the latest Unix timestamp as an integer
timestamp = int(time.time())

# Construct your filename
filename = f"{timestamp}.png"

export_png(p, filename=filename)


🛠️ Refactor suggestion

Add error handling for PNG export

The PNG export functionality lacks error handling and cleanup of old files.

     # Get the latest Unix timestamp as an integer
     timestamp = int(time.time())

     # Construct your filename
     filename = f"{timestamp}.png"

-    export_png(p, filename=filename)
+    try:
+        # Cleanup old PNG files to prevent disk space issues
+        cleanup_old_pngs(directory=".", keep_latest=5)
+        
+        # Export the new PNG
+        export_png(p, filename=filename)
+    except Exception as e:
+        logging.error(f"Failed to export PNG: {str(e)}")
+        raise

Consider adding a helper function to cleanup old PNG files:

import logging
import os


def cleanup_old_pngs(directory: str, keep_latest: int = 5):
    """Cleanup old PNG files, keeping only the N latest files."""
    # Only consider timestamp-named files (e.g. '1673185134.png') so int() below cannot fail
    png_files = [
        f for f in os.listdir(directory)
        if f.endswith(".png") and f[: -len(".png")].isdigit()
    ]
    if len(png_files) <= keep_latest:
        return

    # Sort by timestamp in filename
    sorted_files = sorted(png_files, key=lambda x: int(x[: -len(".png")]))

    # Remove older files (explicit length so keep_latest=0 removes everything)
    for f in sorted_files[: len(sorted_files) - keep_latest]:
        try:
            os.remove(os.path.join(directory, f))
        except OSError as e:
            logging.warning(f"Failed to remove old PNG file {f}: {str(e)}")

Vasilije1990 and others added 2 commits January 11, 2025 16:57
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🔭 Outside diff range comments (1)
cognee-mcp/cognee_mcp/server.py (1)

Fix import statements organization

The file has the following formatting issues:

  • Remove duplicate PIL import: from PIL import Image as PILImage is redundant since Image is already imported
  • Organize imports according to PEP 8 style guide:
    1. Standard library imports
    2. Third-party imports
    3. Local application imports
🔗 Analysis chain

Line range hint 1-230: Fix formatting issues

The file requires formatting according to Ruff's standards. Please run the formatter:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

ruff format cognee-mcp/cognee_mcp/server.py

Length of output: 140


Script:

#!/bin/bash
# Check for basic formatting patterns
echo "=== Checking import statements ==="
rg "^import|^from.*import" cognee-mcp/cognee_mcp/server.py

echo -e "\n=== Checking blank lines between functions ==="
rg "^def|^async def" -A 1 cognee-mcp/cognee_mcp/server.py

echo -e "\n=== Checking indentation ==="
rg "^[[:space:]]+[^[:space:]]" cognee-mcp/cognee_mcp/server.py

Length of output: 8158


🧹 Nitpick comments (1)
cognee-mcp/cognee_mcp/server.py (1)

105-130: Add docstring to document timestamp format requirement

The function assumes PNG files are named with timestamps but this requirement isn't documented.

 def get_freshest_png(directory: str) -> Image.Image:
+    """
+    Retrieve the most recently created PNG file from a directory.
+    
+    Args:
+        directory (str): Path to directory containing PNG files
+        
+    Returns:
+        Image.Image: PIL Image object of the most recent PNG
+        
+    Raises:
+        FileNotFoundError: If directory doesn't exist or no PNG files found
+        ValueError: If PNG filenames don't follow timestamp format (e.g., '1673185134.png')
+        IOError: If PNG file can't be opened
+        
+    Note:
+        PNG files must be named with Unix timestamps (e.g., '1673185134.png')
+    """
     if not os.path.exists(directory):
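
The timestamp-naming assumption described in the docstring can be sketched as follows. This is a minimal illustration, not the PR's implementation: the name `find_freshest_png` is hypothetical, it returns a path rather than a PIL image so that it runs without Pillow, and it skips non-timestamp filenames instead of raising `ValueError`.

```python
from pathlib import Path


def find_freshest_png(directory: str) -> Path:
    """Return the PNG whose filename stem is the largest Unix timestamp."""
    root = Path(directory)
    if not root.is_dir():
        raise FileNotFoundError(f"Directory not found: {directory}")

    # Keep only files named like '<digits>.png'; other PNGs are ignored
    # (an assumption of this sketch, the reviewed code may raise instead).
    candidates = [p for p in root.glob("*.png") if p.stem.isdecimal()]
    if not candidates:
        raise FileNotFoundError(f"No timestamp-named PNG files in: {directory}")

    # Compare numerically so that '999.png' sorts before '1000.png'.
    return max(candidates, key=lambda p: int(p.stem))
```

Comparing the stems as integers rather than strings matters once timestamps differ in digit count; a caller could then open the returned path with `PIL.Image.open`.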

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between daf2d54 and 7b0bfe9.

📒 Files selected for processing (1)
  • cognee-mcp/cognee_mcp/server.py (3 hunks)
🧰 Additional context used
🪛 GitHub Actions: ruff format
cognee-mcp/cognee_mcp/server.py

[warning] File requires formatting. Code does not conform to Ruff's formatting standards.

⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: run_notebook_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: docker-compose-test
🔇 Additional comments (1)
cognee-mcp/cognee_mcp/server.py (1)

13-14: Remove duplicate PIL Image import

The Image module from PIL is imported twice with different aliases. This is redundant and could cause confusion.

-from PIL import Image
-from PIL import Image as PILImage
+from PIL import Image

cognee-mcp/cognee_mcp/server.py Outdated Show resolved Hide resolved
with redirect_stdout(fnull), redirect_stderr(fnull):
    try:
        await cognee.visualize
        img = get_freshest_png(".")

🛠️ Refactor suggestion

Use configured output directory instead of current directory

Using "." as the directory path is brittle and depends on the current working directory when the server starts. Consider using a configured output directory.

-                    img = get_freshest_png(".")
+                    img = get_freshest_png(cognee.config.visualization_output_dir)

You'll need to ensure the output directory is properly configured in the Cognee configuration.
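
One way to avoid depending on the working directory is to resolve the output location once, from configuration. The sketch below is an illustration only: it reads an environment variable, and both the variable name `COGNEE_VISUALIZATION_DIR` and the function name are hypothetical, not documented Cognee settings.

```python
import os
from pathlib import Path


def resolve_visualization_dir(default: str = "visualizations") -> Path:
    """Return an absolute, existing directory for visualization output."""
    # COGNEE_VISUALIZATION_DIR is a hypothetical variable name used for
    # illustration; it is not a documented Cognee setting.
    configured = os.environ.get("COGNEE_VISUALIZATION_DIR", default)

    # Resolve to an absolute path so later lookups do not depend on
    # whichever directory the server process was started from.
    path = Path(configured).expanduser().resolve()
    path.mkdir(parents=True, exist_ok=True)
    return path
```

The same resolved path would be passed both to the code that writes the PNG and to the code that reads it back, so the two cannot drift apart.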

Committable suggestion skipped: line range outside the PR's diff.


Comment on lines +92 to +101
types.Tool(
    name="visualize",
    description="Visualize the knowledge graph.",
    inputSchema={
        "type": "object",
        "properties": {
            "query": {"type": "string"},
        },
    },
),
@coderabbitai coderabbitai bot Jan 11, 2025

🛠️ Refactor suggestion

Clarify query parameter requirement in visualization tool schema

The tool's schema defines a "query" property but doesn't mark it as required. Either:

  1. Mark it as required if visualization needs a query parameter
  2. Document that it's optional if visualization can work without a query
         types.Tool(
             name="visualize",
             description="Visualize the knowledge graph.",
             inputSchema={
                 "type": "object",
                 "properties": {
                     "query": {"type": "string"},
                 },
+                "required": ["query"],  # Add this if query is required
             },
         ),
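
To make the difference between an optional and a required property concrete, here is a small sketch of how a handler might check incoming arguments against the schema's "required" list. The helper name `missing_required_arguments` is hypothetical, and this is only a presence check, not full JSON Schema validation.

```python
from typing import Any, List, Mapping, Optional


def missing_required_arguments(
    schema: Mapping[str, Any], arguments: Optional[Mapping[str, Any]]
) -> List[str]:
    """List the names in schema['required'] that are absent from arguments."""
    provided = arguments or {}
    # A schema without a "required" key treats every property as optional.
    return [name for name in schema.get("required", []) if name not in provided]


# Same shape as the tool's inputSchema, with "query" marked as required.
visualize_schema = {
    "type": "object",
    "properties": {"query": {"type": "string"}},
    "required": ["query"],
}
```

With this schema, `missing_required_arguments(visualize_schema, {})` reports `["query"]`; without the "required" entry the same call reports nothing, which is the behaviour the current schema implies.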


This is a good suggestion from coderabbit


@borisarzentar Thank you for confirming! I'm glad the suggestion was helpful.


Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@@ -10,6 +10,8 @@
from cognee.shared.data_models import KnowledgeGraph
from mcp.server import NotificationOptions, Server
from mcp.server.models import InitializationOptions
from PIL import Image
from PIL import Image as PILImage

Double import here

Comment on lines +92 to +101
types.Tool(
    name="visualize",
    description="Visualize the knowledge graph.",
    inputSchema={
        "type": "object",
        "properties": {
            "query": {"type": "string"},
        },
    },
),

This is a good suggestion from coderabbit

with open(os.devnull, "w") as fnull:
    with redirect_stdout(fnull), redirect_stderr(fnull):
        try:
            await cognee.visualize()

Should visualize receive the parameter where to put the image it generates? Then we can send that path to get_freshest_png to retrieve the image.
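
A rough sketch of that idea: the export step takes the output directory and returns the path it wrote, so the caller would not need to scan for the newest file at all. The function name `export_timestamped_png` and the injected `write_png` callable are hypothetical; in the real code the callable would wrap whatever renders the graph.

```python
import time
from pathlib import Path
from typing import Callable


def export_timestamped_png(output_dir: str, write_png: Callable[[str], None]) -> Path:
    """Write a PNG named after the current Unix timestamp and return its path."""
    target_dir = Path(output_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # Same naming scheme as the PR: integer Unix timestamp plus '.png'.
    target = target_dir / f"{int(time.time())}.png"

    # write_png is injected so this sketch stays independent of the
    # plotting library; it receives the full destination path.
    write_png(str(target))
    return target
```

Returning the path also sidesteps the case where two exports land in the same second and a directory scan cannot tell them apart.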

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (4)
cognee-mcp/README.md (2)

88-91: Minor grammar fix needed in documentation.

Add "the" before "debugger" for better readability.

-To use debugger, run:
+To use the debugger, run:
🧰 Tools
🪛 LanguageTool

[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ...y Restart your Claude desktop. To use debugger, run: ```bash npx @modelcontextprotocol...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


93-94: Add command description for clarity.

The reset installation command would benefit from a brief description of when to use it.

-To reset the installation
-uv sync --dev --all-extras --reinstall 
+To reset the installation (useful when dependencies are corrupted):
+```bash
+uv sync --dev --all-extras --reinstall
+```
cognee-mcp/cognee_mcp/server.py (1)

6-7: Remove unused logging import.

The logging import from sqlalchemy.testing.plugin.plugin_base is not used in the code.

-from sqlalchemy.testing.plugin.plugin_base import logging
cognee/shared/utils.py (1)

Line range hint 377-494: Consider removing duplicate logo embedding.

The logo is embedded twice in the visualization, at both bottom-right and top-left positions. This might be unnecessary and could affect performance. Consider keeping only one logo instance unless there's a specific requirement for two logos.

    logging.info("Embedding logo into visualization...")
    embed_logo(p, layout_scale, logo_alpha, "bottom_right")
-    embed_logo(p, layout_scale, logo_alpha, "top_left")
🧰 Tools
🪛 Ruff (0.8.2)

429-429: Local variable filename is assigned to but never used

Remove assignment to unused variable filename

(F841)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cf4737b and 047948a.

⛔ Files ignored due to path filters (2)
  • cognee-mcp/uv.lock is excluded by !**/*.lock
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • cognee-mcp/README.md (1 hunks)
  • cognee-mcp/cognee_mcp/server.py (3 hunks)
  • cognee-mcp/pyproject.toml (4 hunks)
  • cognee/__init__.py (1 hunks)
  • cognee/shared/utils.py (4 hunks)
  • pyproject.toml (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
cognee/shared/utils.py

429-429: Local variable filename is assigned to but never used

Remove assignment to unused variable filename

(F841)

🔇 Additional comments (6)
cognee/__init__.py (1)

7-7: LGTM! Verify function usage across codebase.

The renamed import to visualize_graph is more descriptive of its purpose.

Let's verify all usages of the old and new function names:

✅ Verification successful

✅ Function rename verification complete

The visualize to visualize_graph rename is consistently applied across the codebase. The new function is properly used in cognee-mcp/cognee_mcp/server.py and no instances of the old function name remain.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining usages of old 'visualize' function
rg -l "\.visualize\(" 

# Search for usages of new 'visualize_graph' function
rg -l "\.visualize_graph\("

Length of output: 85


Script:

#!/bin/bash
# Check the implementation in server.py
rg "visualize_graph" cognee-mcp/cognee_mcp/server.py -A 2 -B 2

Length of output: 273

cognee-mcp/pyproject.toml (2)

94-94: LGTM! Script entry point is properly configured.

The cognee script entry point is correctly defined.


6-6: Verify compatibility with major version updates.

Several dependencies have significant version jumps:

  • pandas (2.0.3 → 2.2.3)
  • instructor (1.5.2 → 1.7.2)
  • litellm (1.49.1 → 1.57.2)

Let's check for any breaking changes in the major updates:

Also applies to: 9-9, 24-25, 28-28, 35-35, 47-48, 54-54, 59-59

✅ Verification successful

Dependencies are compatible with current usage

The codebase uses basic, stable features from all updated libraries:

  • pandas: Simple DataFrame creation and iteration
  • instructor: Basic litellm integration
  • litellm: Standard completion API calls
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for pandas DataFrame operations that might be affected
rg "pd\.DataFrame" -A 3

# Search for instructor decorators and usage
rg "@instruction" -A 3

# Search for litellm configurations
rg "litellm\.completion" -A 3

Length of output: 1927

pyproject.toml (1)

66-66: LGTM! Version alignment with cognee-mcp.

The fastapi-users version is properly pinned and aligned with cognee-mcp's requirements.

cognee-mcp/cognee_mcp/server.py (1)

94-103: Clarify query parameter requirement in visualization tool schema.

The tool's schema defines a "query" property but doesn't mark it as required. Either mark it as required if visualization needs a query parameter, or document that it's optional.

cognee/shared/utils.py (1)

425-432: 🛠️ Refactor suggestion

Remove unused variable and add PNG export.

The filename variable is created but never used. Additionally, the PNG export functionality is incomplete.

    # Get the latest Unix timestamp as an integer
    timestamp = int(time.time())

    # Construct your filename
    filename = f"{timestamp}.png"
+
+    # Export the visualization as PNG
+    try:
+        export_png(p, filename=filename)
+    except Exception as e:
+        logging.error(f"Failed to export PNG: {str(e)}")
+        raise

Likely invalid or redundant comment.

🧰 Tools
🪛 Ruff (0.8.2)

429-429: Local variable filename is assigned to but never used

Remove assignment to unused variable filename

(F841)

Comment on lines +198 to +212
elif name == "visualize":
    with open(os.devnull, "w") as fnull:
        with redirect_stdout(fnull), redirect_stderr(fnull):
            try:
                result = await cognee.visualize_graph()
                results = retrieved_edges_to_string(result)

                return [
                    types.TextContent(
                        type="text",
                        text=results,
                    )
                ]
            except (FileNotFoundError, IOError, ValueError) as e:
                raise ValueError(f"Failed to create visualization: {str(e)}")

🛠️ Refactor suggestion

Return visualization output instead of text content.

The visualization tool returns text content (types.TextContent) despite being described as a tool to "Visualize the knowledge graph." Consider returning the visualization as an image using types.ImageContent or as an embedded resource using types.EmbeddedResource.

                    return [
-                       types.TextContent(
-                           type="text",
-                           text=results,
-                       )
+                       types.ImageContent(
+                           type="image",
+                           data=await cognee.visualize_graph(),
+                           mimeType="image/png"
+                       )
                    ]

Committable suggestion skipped: line range outside the PR's diff.
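
For reference, MCP image content carries base64-encoded bytes together with a MIME type, so a PNG on disk has to be encoded before it is returned. The sketch below builds the equivalent plain dictionary so that it runs without the `mcp` package installed; the helper name `png_file_to_image_payload` is hypothetical, and whether `cognee.visualize_graph()` returns bytes, a path, or something else is not established by this thread.

```python
import base64
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_file_to_image_payload(path: str) -> dict:
    """Read a PNG file and return an image-content style payload."""
    raw = Path(path).read_bytes()

    # Cheap sanity check: every PNG file starts with this 8-byte signature.
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError(f"Not a PNG file: {path}")

    return {
        "type": "image",
        "data": base64.b64encode(raw).decode("ascii"),
        "mimeType": "image/png",
    }
```

In the server the same three fields would presumably be passed to `types.ImageContent` instead of being returned as a dictionary.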

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🔭 Outside diff range comments (1)
cognee/shared/utils.py (1)

Line range hint 1-526: Format code using Ruff

The pipeline indicates that the file requires formatting using the Ruff formatter.

Run the following command to format the code:

ruff format cognee/shared/utils.py

♻️ Duplicate comments (1)
cognee/shared/utils.py (1)

453-460: ⚠️ Potential issue

Add error handling for PNG export

The PNG export functionality lacks error handling and cleanup of old files.

A past review has already suggested adding error handling and cleanup functionality for PNG exports. The same issues still apply here.

🧰 Tools
🪛 Ruff (0.8.2)

457-457: Local variable filename is assigned to but never used

Remove assignment to unused variable filename

(F841)

🪛 GitHub Actions: ruff lint

[error] 457-457: Local variable filename is assigned to but never used

🪛 GitHub Actions: ruff format

[warning] File requires formatting using Ruff formatter

🧹 Nitpick comments (2)
cognee/shared/utils.py (2)

44-47: Optimize NLTK resource downloads

Downloading NLTK resources on every function call is inefficient. Consider moving these downloads to a module initialization or a dedicated setup function.

+def setup_nltk_resources():
+    """Download required NLTK resources if not already present."""
+    try:
+        nltk.download("words", quiet=True)
+        nltk.download("punkt", quiet=True)
+        nltk.download("averaged_perceptron_tagger", quiet=True)
+        nltk.download("maxent_ne_chunker", quiet=True)
+    except Exception as e:
+        logging.error(f"Failed to download NLTK resources: {str(e)}")
+        raise
+
 def extract_pos_tags(sentence):
     """Extract Part-of-Speech (POS) tags for words in a sentence."""
-    # Ensure that the necessary NLTK resources are downloaded
-    nltk.download("words", quiet=True)
-    nltk.download("punkt", quiet=True)
-    nltk.download("averaged_perceptron_tagger", quiet=True)
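
An alternative to a setup function that must be called by hand is to make the download step idempotent, so that it can stay inside the tagging functions but only does work on the first call. The sketch below shows that pattern with an injected downloader so it runs without NLTK installed; `make_resource_setup` is a hypothetical helper, and the resource names are the ones listed in the suggestion above.

```python
from functools import lru_cache
from typing import Callable, Tuple

REQUIRED_RESOURCES: Tuple[str, ...] = (
    "words",
    "punkt",
    "averaged_perceptron_tagger",
    "maxent_ne_chunker",
)


def make_resource_setup(download: Callable[[str], object]) -> Callable[[], None]:
    """Wrap a downloader so the full resource list is fetched at most once."""

    @lru_cache(maxsize=None)
    def setup() -> None:
        # Runs on the first call only; later calls hit the cache and return.
        for name in REQUIRED_RESOURCES:
            download(name)

    return setup
```

With NLTK available, this could be wired up as `setup_nltk_resources = make_resource_setup(lambda name: nltk.download(name, quiet=True))` and called at the top of `extract_pos_tags`. If a download raises, nothing is cached, so the next call retries.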


Line range hint 405-526: Consider splitting visualization code into smaller components

The create_cognee_style_network_with_logo function is handling too many responsibilities, making it hard to maintain and test.

Consider:

  1. Extracting logo embedding into a separate class
  2. Creating a dedicated NetworkVisualizer class
  3. Separating layout and styling logic
  4. Moving PNG export functionality to a dedicated service

This will improve maintainability, testability, and reusability of the visualization components.


📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3ba98b2 and ad07bae.

📒 Files selected for processing (1)
  • cognee/shared/utils.py (5 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
cognee/shared/utils.py

457-457: Local variable filename is assigned to but never used

Remove assignment to unused variable filename

(F841)

🪛 GitHub Actions: ruff lint
cognee/shared/utils.py

[error] 457-457: Local variable filename is assigned to but never used

🪛 GitHub Actions: ruff format
cognee/shared/utils.py

[warning] File requires formatting using Ruff formatter

🪛 GitHub Actions: test | weaviate
cognee/shared/utils.py

[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the required package using 'poetry add nltk'.

🪛 GitHub Actions: test | milvus
cognee/shared/utils.py

[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the required package using 'pip install nltk' or add it to poetry dependencies.

🪛 GitHub Actions: test | neo4j
cognee/shared/utils.py

[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the package using poetry or pip.

🪛 GitHub Actions: test | qdrant
cognee/shared/utils.py

[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the nltk package.

🪛 GitHub Actions: test | deduplication
cognee/shared/utils.py

[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the package using poetry add nltk or add it to pyproject.toml.

🪛 GitHub Actions: test | pgvector
cognee/shared/utils.py

[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the nltk package.

⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: run_multimedia_example_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: run_dynamic_steps_example_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: profiler
  • GitHub Check: docker-compose-test

@@ -13,7 +13,7 @@
import tiktoken
import nltk

⚠️ Potential issue

Add NLTK to project dependencies

Multiple pipeline failures indicate that NLTK is not properly declared as a project dependency.

Add NLTK to your project dependencies by:

  1. Adding it to pyproject.toml:
     [tool.poetry.dependencies]
     nltk = "^3.8.1"
  2. Or installing via poetry:
     poetry add nltk
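
A missing import like this can also be caught by a quick check before the test suites run. A minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the module names passed to it are examples rather than the project's actual dependency list:

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be imported."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if find_spec(name) is None]
```

Calling `missing_modules(["nltk", "tiktoken"])` in a CI step and failing when the result is non-empty would surface the problem once, instead of once per test job.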

Comment on lines +34 to +38
def get_entities(tagged_tokens):
    nltk.download("maxent_ne_chunker", quiet=True)
    from nltk.chunk import ne_chunk

    return ne_chunk(tagged_tokens)

🛠️ Refactor suggestion

Add docstring and error handling for get_entities()

The function lacks a docstring explaining its purpose and parameters. Also, NLTK resource downloads should be handled with error checking.

 def get_entities(tagged_tokens):
+    """Extract named entities from POS-tagged tokens using NLTK's ne_chunk.
+
+    Args:
+        tagged_tokens: A list of POS-tagged tokens from nltk.pos_tag()
+
+    Returns:
+        A tree containing chunks of named entities
+    """
+    try:
-    nltk.download("maxent_ne_chunker", quiet=True)
-    from nltk.chunk import ne_chunk
+        nltk.download("maxent_ne_chunker", quiet=True)
+        from nltk.chunk import ne_chunk
+    except Exception as e:
+        logging.error(f"Failed to download NLTK resources: {str(e)}")
+        raise
 
     return ne_chunk(tagged_tokens)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def get_entities(tagged_tokens):
    nltk.download("maxent_ne_chunker", quiet=True)
    from nltk.chunk import ne_chunk
    return ne_chunk(tagged_tokens)

def get_entities(tagged_tokens):
    """Extract named entities from POS-tagged tokens using NLTK's ne_chunk.
    Args:
        tagged_tokens: A list of POS-tagged tokens from nltk.pos_tag()
    Returns:
        A tree containing chunks of named entities
    """
    try:
        nltk.download("maxent_ne_chunker", quiet=True)
        from nltk.chunk import ne_chunk
    except Exception as e:
        logging.error(f"Failed to download NLTK resources: {str(e)}")
        raise
    return ne_chunk(tagged_tokens)
🧰 Tools
🪛 GitHub Actions: ruff format

[warning] File requires formatting using Ruff formatter
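
One point worth checking with the error handling above: by default `nltk.download` appears to signal a failed download through a `False` return value rather than an exception, in which case a `try`/`except` around it would not catch the failure. A minimal sketch of an explicit check, under that assumption. The helper `ensure_resource` is hypothetical, and `find` and `download` are injected stand-ins for `nltk.data.find` and `nltk.download` so the logic can be run without NLTK installed:

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def ensure_resource(
    resource_path: str,
    package: str,
    find: Callable[[str], object],
    download: Callable[..., bool],
) -> None:
    """Make sure a data package is available, downloading it only if needed."""
    try:
        find(resource_path)  # expected to raise LookupError when missing
        return
    except LookupError:
        logger.info("Resource %s not found locally, downloading", package)

    # Assumption: the download call reports failure via a False return value,
    # so the result is checked explicitly instead of relying on an exception.
    if not download(package, quiet=True):
        raise RuntimeError(f"Failed to download NLTK resource: {package}")
```

With NLTK installed, this could be called as `ensure_resource("chunkers/maxent_ne_chunker", "maxent_ne_chunker", nltk.data.find, nltk.download)` before `ne_chunk`. The resource path shown is an assumption and may differ between NLTK versions. Checking locally first also avoids a network round trip on every call to `get_entities`.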

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
cognee-mcp/README.md (3)

40-42: Add language specifications to code blocks.

The code blocks should specify their language for proper syntax highlighting.

Apply these changes:

-```
+```bash
cd ~/Library/Application\ Support/Claude/

-```
+```bash
nano claude_desktop_config.json



Also applies to: 46-48

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 Markdownlint (0.37.0)</summary>

40-40: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</details>

</details>

---

`44-45`: **Improve configuration file instructions.**

The instructions about adding paths and API key could be more specific and include a warning about securing sensitive information.

Consider adding:
- A note about keeping the API key secure
- Examples of correct path formatting
- Warning about not committing the config file with API keys

---

`101-105`: **Enhance development workflow documentation.**

The development workflow steps could be more detailed to prevent confusion.

Consider:
1. Explaining why each step is necessary
2. Adding expected output or success indicators
3. Mentioning potential troubleshooting steps

</blockquote></details>

</blockquote></details>

<details>
<summary>📜 Review details</summary>

**Configuration used: CodeRabbit UI**
**Review profile: CHILL**
**Plan: Pro**

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between ad07bae9a781c909c67658261d579de5834ea3b1 and 61118dda0353158ed949d04acf95867d30011d47.

</details>

<details>
<summary>📒 Files selected for processing (1)</summary>

* `cognee-mcp/README.md` (2 hunks)

</details>

<details>
<summary>🧰 Additional context used</summary>

<details>
<summary>🪛 Markdownlint (0.37.0)</summary>

<details>
<summary>cognee-mcp/README.md</summary>

40-40: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

46-46: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</details>

</details>

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->


To use debugger, run:
```bash
npx @modelcontextprotocol/inspector uv --directory /Users/name/folder run cognee
```

⚠️ Potential issue

Replace hardcoded paths with placeholders.

The instructions contain specific user paths that won't work for other users.

Apply these changes:

-npx @modelcontextprotocol/inspector uv --directory /Users/name/folder run cognee
+npx @modelcontextprotocol/inspector uv --directory /Users/{username}/path/to/folder run cognee

-npx @modelcontextprotocol/inspector uv --directory /Users/vasilije/cognee/cognee-mcp run cognee
+npx @modelcontextprotocol/inspector uv --directory /Users/{username}/cognee/cognee-mcp run cognee

Also applies to: 105-105

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
.github/workflows/dockerhub.yml (2)

44-45: Consider using dynamic cache tags.

The current cache configuration uses a static buildcache tag which could lead to race conditions in parallel builds. Consider using a branch-specific cache tag.

-          cache-from: type=registry,ref=cognee/cognee:buildcache
-          cache-to: type=registry,ref=cognee/cognee:buildcache,mode=max
+          cache-from: type=registry,ref=cognee/cognee:buildcache-${{ github.ref_name }}
+          cache-to: type=registry,ref=cognee/cognee:buildcache-${{ github.ref_name }},mode=max
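
One caveat with branch-specific tags: branch names such as `feature/foo` contain characters that are not valid in a Docker tag, so the ref name would need sanitizing before it is used this way. A minimal sketch of that sanitization, written in Python purely for illustration; `to_docker_tag` is a hypothetical helper and not part of the workflow:

```python
import re


def to_docker_tag(ref_name: str, prefix: str = "buildcache-") -> str:
    """Turn a Git ref name into a string that is valid as a Docker tag."""
    # Docker tags allow letters, digits, '_', '.', and '-'; replace the rest.
    cleaned = re.sub(r"[^A-Za-z0-9_.-]", "-", ref_name)
    # Tags are limited to 128 characters; the default prefix also guarantees
    # that the tag does not start with '.' or '-'.
    return (prefix + cleaned)[:128]
```

For example, `to_docker_tag("feature/foo")` yields `buildcache-feature-foo`. Inside the workflow itself, the same effect would have to come from a shell step or from the tags that `docker/metadata-action` already generates.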

26-35: Enhance image metadata.

Consider adding more metadata to improve image discoverability and traceability:

       - name: Extract metadata
         id: meta
         uses: docker/metadata-action@v5
         with:
           images: cognee/cognee
           tags: |
             type=ref,event=branch
             type=sha,prefix={{branch}}-
             type=raw,value=latest,enable={{is_default_branch}}
+          labels: |
+            org.opencontainers.image.title=Cognee
+            org.opencontainers.image.description=Knowledge graph system with visualization capabilities
+            maintainer=Cognee Team
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 61118dd and 933d21a.

📒 Files selected for processing (1)
  • .github/workflows/dockerhub.yml (1 hunks)
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/dockerhub.yml

48-48: property "build" is not defined in object type {meta: {conclusion: string; outcome: string; outputs: {annotations: string; bake-file: string; bake-file-annotations: string; bake-file-labels: string; bake-file-tags: string; json: string; labels: string; tags: string; version: string}}}

(expression)

🪛 yamllint (1.35.1)
.github/workflows/dockerhub.yml

[error] 48-48: no new line character at the end of file

(new-line-at-end-of-file)

⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: run_simple_example_test / test
  • GitHub Check: run_multimedia_example_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: run_dynamic_steps_example_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: profiler
  • GitHub Check: docker-compose-test
🔇 Additional comments (1)
.github/workflows/dockerhub.yml (1)

1-8: LGTM! Good addition of the dev branch.

The workflow trigger configuration is well-structured, allowing for testing in the dev branch before changes reach main.

Comment on lines +47 to +48
      - name: Image digest
        run: echo ${{ steps.build.outputs.digest }}

⚠️ Potential issue

Fix the image digest step.

There are two issues to address:

  1. The digest step references an undefined step ID 'build'
  2. The file is missing a newline at the end

Apply this diff to fix both issues:

       - name: Image digest
-        run: echo ${{ steps.build.outputs.digest }}
+        run: echo ${{ steps.meta.outputs.json }}
+
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
      - name: Image digest
        run: echo ${{ steps.build.outputs.digest }}

      - name: Image digest
        run: echo ${{ steps.meta.outputs.json }}

Comment on lines 9 to 13
jobs:
  docker-build-and-push:
    runs-on: ubuntu-latest

    steps:

🛠️ Refactor suggestion

Add explicit permissions configuration.

Consider adding explicit permissions to follow the principle of least privilege:

   docker-build-and-push:
     runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      packages: write
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
jobs:
  docker-build-and-push:
    runs-on: ubuntu-latest
    steps:

jobs:
  docker-build-and-push:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
