feat: Add data visualization for Anthropic #432

Open
wants to merge 12 commits into base: dev
37 changes: 37 additions & 0 deletions cognee-mcp/cognee_mcp/server.py
@@ -10,6 +10,8 @@
from cognee.shared.data_models import KnowledgeGraph
from mcp.server import NotificationOptions, Server
from mcp.server.models import InitializationOptions
from PIL import Image
from PIL import Image as PILImage
🛠️ Refactor suggestion

Remove duplicate PIL Image import

The Image module from PIL is imported twice with different aliases. This is redundant and could cause confusion.

-from PIL import Image
-from PIL import Image as PILImage
+from PIL import Image
🧰 Tools
🪛 GitHub Actions: ruff format

[warning] File requires formatting. Code does not conform to Ruff formatting standards.

Double import here


server = Server("cognee-mcp")

@@ -87,9 +89,36 @@ async def handle_list_tools() -> list[types.Tool]:
                },
            },
        ),
        types.Tool(
            name="visualize",
            description="Visualize the knowledge graph.",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                },
            },
        ),
Comment on lines +94 to +103
@coderabbitai coderabbitai bot Jan 11, 2025
🛠️ Refactor suggestion

Clarify query parameter requirement in visualization tool schema

The tool's schema defines a "query" property but doesn't mark it as required. Either:

  1. Mark it as required if visualization needs a query parameter
  2. Document that it's optional if visualization can work without a query
         types.Tool(
             name="visualize",
             description="Visualize the knowledge graph.",
             inputSchema={
                 "type": "object",
                 "properties": {
                     "query": {"type": "string"},
                 },
+                "required": ["query"],  # Add this if query is required
             },
         ),
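
If `query` does end up in `required`, the handler can reject a call that omits it before doing any work. A minimal sketch, assuming a hypothetical helper named `missing_required` (not part of cognee or the MCP SDK); it only inspects the `required` list and is not a full JSON Schema validator:

```python
from __future__ import annotations

# Schema mirroring the suggestion above, with "query" marked as required.
VISUALIZE_SCHEMA = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
    },
    "required": ["query"],
}


def missing_required(schema: dict, arguments: dict | None) -> list[str]:
    """Return the names listed under "required" that are absent from arguments."""
    provided = arguments or {}
    return [key for key in schema.get("required", []) if key not in provided]
```

The call handler could raise `ValueError` when the returned list is non-empty, which also covers the case where `arguments` is `None`.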
🧰 Tools
🪛 GitHub Actions: ruff format

[warning] File requires formatting. Code does not conform to Ruff's formatting standards.

This is a good suggestion from coderabbit

@borisarzentar Thank you for confirming! I'm glad the suggestion was helpful.

    ]


def get_freshest_png(directory: str) -> Image.Image:
    # List all files in 'directory' that end with .png
    files = [f for f in os.listdir(directory) if f.endswith(".png")]
    if not files:
        raise FileNotFoundError("No PNG files found in the given directory.")

    # Sort by integer value of the filename (minus the '.png')
    # Example filename: 1673185134.png -> integer 1673185134
    files_sorted = sorted(files, key=lambda x: int(x.replace(".png", "")))

    # The "freshest" file has the largest timestamp
    freshest_filename = files_sorted[-1]
    freshest_path = os.path.join(directory, freshest_filename)

    # Open the image with PIL and return the PIL Image object
    return Image.open(freshest_path)
Vasilije1990 marked this conversation as resolved.
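
One thing to watch in `get_freshest_png`: `int(x.replace(".png", ""))` raises `ValueError` as soon as the directory holds a PNG whose name is not a bare integer (a `logo.png`, say). A minimal sketch that skips such files, assuming a hypothetical function name `freshest_png_path`; it returns the path rather than an opened image, so the caller decides when the file is opened and closed:

```python
from __future__ import annotations

from pathlib import Path


def freshest_png_path(directory: str) -> Path:
    """Return the PNG in `directory` whose stem is the largest integer timestamp."""
    candidates = [
        path
        for path in Path(directory).glob("*.png")
        if path.stem.isdecimal()  # ignore PNGs not named <timestamp>.png
    ]
    if not candidates:
        raise FileNotFoundError(f"No timestamped PNG files found in {directory!r}.")
    return max(candidates, key=lambda path: int(path.stem))
```

Using `max` with an integer key avoids sorting the whole list when only the newest entry is needed.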

@server.call_tool()
async def handle_call_tool(
    name: str, arguments: dict | None
@@ -154,6 +183,14 @@ async def handle_call_tool(
                text="Pruned",
            )
        ]

    elif name == "visualize":
        with open(os.devnull, "w") as fnull:
            with redirect_stdout(fnull), redirect_stderr(fnull):
                """Create a thumbnail from an image"""
                await cognee.visualize
                img = get_freshest_png(".")
                return types.Image(data=img.tobytes(), format="png")
Vasilije1990 marked this conversation as resolved.
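
Two details in the `visualize` branch are worth a second look. `await cognee.visualize` awaits the function object itself rather than the result of calling it, and Pillow's `Image.tobytes()` returns raw pixel data, not PNG-encoded bytes. Since the exporter has already written a PNG file, one option is to send the file's bytes as they are. A minimal sketch, assuming a hypothetical helper `png_file_to_base64`; whether the MCP response type expects base64 text is an assumption to verify against the SDK version in use:

```python
import base64
from pathlib import Path

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_file_to_base64(path: str) -> str:
    """Read an already-encoded PNG file and return its contents as base64 text."""
    raw = Path(path).read_bytes()
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError(f"{path} does not look like a PNG file.")
    return base64.b64encode(raw).decode("ascii")
```

Reading the file directly also removes the need to hold an open PIL image in the handler.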
    else:
        raise ValueError(f"Unknown tool: {name}")

11 changes: 10 additions & 1 deletion cognee/shared/utils.py
@@ -13,7 +13,7 @@
import tiktoken
import nltk
⚠️ Potential issue

Add NLTK to project dependencies

Multiple pipeline failures indicate that NLTK is not properly declared as a project dependency.

Add NLTK to your project dependencies by either:

  1. Adding it to pyproject.toml:
     [tool.poetry.dependencies]
     nltk = "^3.8.1"
  2. Or installing it via Poetry:
     poetry add nltk
🧰 Tools
🪛 GitHub Actions: ruff format

[warning] File requires formatting using Ruff formatter

🪛 GitHub Actions: test | weaviate

[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the required package using 'poetry add nltk'.

🪛 GitHub Actions: test | milvus

[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the required package using 'pip install nltk' or add it to poetry dependencies.

🪛 GitHub Actions: test | neo4j

[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the package using poetry or pip.

🪛 GitHub Actions: test | qdrant

[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the nltk package.

🪛 GitHub Actions: test | deduplication

[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the package using poetry add nltk or add it to pyproject.toml.

🪛 GitHub Actions: test | pgvector

[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the nltk package.
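
All of the failures listed above are import errors for the same module. A small sketch for checking locally whether a module resolves in the active environment before pushing; `is_installed` is a hypothetical helper that uses only the standard library and is meant for top-level module names:

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return find_spec(module_name) is not None
```

For example, `is_installed("nltk")` run inside the Poetry environment shows whether the dependency declared in pyproject.toml was actually installed.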

import base64

import time

import logging
import sys
@@ -396,6 +396,7 @@ async def create_cognee_style_network_with_logo(

    from bokeh.embed import file_html
    from bokeh.resources import CDN
    from bokeh.io import export_png

    logging.info("Converting graph to serializable format...")
    G = await convert_to_serializable_graph(G)
@@ -443,6 +444,14 @@ async def create_cognee_style_network_with_logo(
    )
    p.add_tools(hover_tool)

    # Get the latest Unix timestamp as an integer
    timestamp = int(time.time())

    # Construct your filename
    filename = f"{timestamp}.png"

    export_png(p, filename=filename)

🛠️ Refactor suggestion

Add error handling for PNG export

The PNG export functionality lacks error handling and cleanup of old files.

     # Get the latest Unix timestamp as an integer
     timestamp = int(time.time())

     # Construct your filename
     filename = f"{timestamp}.png"

-    export_png(p, filename=filename)
+    try:
+        # Cleanup old PNG files to prevent disk space issues
+        cleanup_old_pngs(directory=".", keep_latest=5)
+        
+        # Export the new PNG
+        export_png(p, filename=filename)
+    except Exception as e:
+        logging.error(f"Failed to export PNG: {str(e)}")
+        raise

Consider adding a helper function to cleanup old PNG files:

def cleanup_old_pngs(directory: str, keep_latest: int = 5):
    """Cleanup old PNG files, keeping only the N latest files."""
    png_files = [f for f in os.listdir(directory) if f.endswith('.png')]
    if len(png_files) <= keep_latest:
        return
        
    # Sort by timestamp in filename
    sorted_files = sorted(png_files, key=lambda x: int(x.replace(".png", "")))
    
    # Remove older files
    for f in sorted_files[:-keep_latest]:
        try:
            os.remove(os.path.join(directory, f))
        except OSError as e:
            logging.warning(f"Failed to remove old PNG file {f}: {str(e)}")
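
Two edge cases in the helper above may be worth covering: a PNG whose name is not an integer makes `int(...)` raise, and `keep_latest=0` removes nothing because `sorted_files[:-0]` is an empty slice. A minimal variant under those assumptions, with the hypothetical name `prune_timestamped_pngs`; it returns the removed paths so the caller can log or test them:

```python
from __future__ import annotations

import logging
from pathlib import Path


def prune_timestamped_pngs(directory: str, keep_latest: int = 5) -> list[Path]:
    """Delete all but the `keep_latest` newest <timestamp>.png files."""
    if keep_latest < 0:
        raise ValueError("keep_latest must be non-negative")

    # Only consider files named <integer>.png; leave other PNGs untouched.
    timestamped = sorted(
        (p for p in Path(directory).glob("*.png") if p.stem.isdecimal()),
        key=lambda p: int(p.stem),
    )

    # Slice by count so that keep_latest=0 selects every file.
    stale = timestamped[: max(len(timestamped) - keep_latest, 0)]

    removed: list[Path] = []
    for path in stale:
        try:
            path.unlink()
            removed.append(path)
        except OSError as error:
            logging.warning("Failed to remove old PNG file %s: %s", path, error)
    return removed
```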

    logging.info(f"Saving visualization to {output_filename}...")
    html_content = file_html(p, CDN, title)
    with open(output_filename, "w") as f: