feat: metrics in neo4j adapter [COG-1082] #487
…-tokens-to-metric-table
…add-num-tokens-to-metric-table
Walkthrough
This pull request updates task instantiation, graph management, and timestamp handling across different modules. In the API layer, the default task for …
Sequence Diagram(s)
sequenceDiagram
participant C as Client
participant N as Neo4jAdapter
participant G as Graph Database Service
C->>N: Call graph_exists(graph_name)
N->>G: Query available graph names
G-->>N: Return list of graphs
N-->>C: Return existence status
C->>N: Call project_entire_graph(graph_name)
N->>G: Request projection of all nodes & relationships
G-->>N: Return in-memory projected graph
N-->>C: Provide projected graph
C->>N: Call drop_graph(graph_name)
N->>G: Execute graph drop command
G-->>N: Confirm deletion
N-->>C: Return drop confirmation
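The three calls in the diagram can be sketched roughly as follows. This is an illustration under stated assumptions, not the adapter's actual code: `GraphProjectionClient` and `FakeGds` are hypothetical names, the injected `query` callable stands in for the adapter's query method, and the Cypher strings assume the Neo4j Graph Data Science catalog procedures `gds.graph.list`, `gds.graph.project`, and `gds.graph.drop`. The fake backend exists only so the sketch runs without a database.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical type: an async function that runs a Cypher string and returns rows.
QueryFn = Callable[[str], Awaitable[list]]


class GraphProjectionClient:
    """Illustrative wrapper; the real adapter's internals may differ."""

    def __init__(self, query: QueryFn):
        self.query = query

    async def graph_exists(self, graph_name: str = "myGraph") -> bool:
        rows = await self.query("CALL gds.graph.list() YIELD graphName RETURN graphName;")
        return any(row.get("graphName") == graph_name for row in rows)

    async def project_entire_graph(self, graph_name: str = "myGraph") -> None:
        # Assumption: an existing projection is reused rather than rebuilt.
        if await self.graph_exists(graph_name):
            return
        # '*' projects every node label and every relationship type.
        # Note: graph_name is interpolated directly, so it must be trusted input.
        await self.query(f"CALL gds.graph.project('{graph_name}', '*', '*');")

    async def drop_graph(self, graph_name: str = "myGraph") -> bool:
        # Returns True if a projection was dropped, False if none existed.
        if not await self.graph_exists(graph_name):
            return False
        await self.query(f"CALL gds.graph.drop('{graph_name}');")
        return True


class FakeGds:
    """In-memory stand-in for the database (hypothetical, for the demo only)."""

    def __init__(self):
        self.graphs = set()

    async def __call__(self, cypher: str) -> list:
        if "gds.graph.list" in cypher:
            return [{"graphName": name} for name in sorted(self.graphs)]
        name = cypher.split("'")[1]
        if "gds.graph.project" in cypher:
            self.graphs.add(name)
        elif "gds.graph.drop" in cypher:
            self.graphs.discard(name)
        return []


async def demo() -> tuple:
    client = GraphProjectionClient(FakeGds())
    before = await client.graph_exists("myGraph")
    await client.project_entire_graph("myGraph")
    after = await client.graph_exists("myGraph")
    dropped = await client.drop_graph("myGraph")
    return before, after, dropped
```

Calling `asyncio.run(demo())` walks through the same order as the diagram: existence check, projection, then drop.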
…og-1082-metrics-in-networkx-adapter
…g-1082-metrics-in-neo4j-adapter
e89c9b9 to 27feae8
27feae8 to af8e798
…g-1082-metrics-in-neo4j-adapter
Caution
Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments.
Actionable comments posted: 2
🧹 Nitpick comments (8)
cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (4)
573-577: Add logging after dropping the graph.
It might be helpful to log whether the graph was successfully dropped or was absent, for better traceability in production.

     async def drop_graph(self, graph_name="myGraph"):
         if await self.graph_exists(graph_name):
             drop_query = f"CALL gds.graph.drop('{graph_name}');"
             await self.query(drop_query)
    +        logger.debug(f"Dropped graph '{graph_name}' successfully.")
633-636: Diameter not yet implemented.
If diameter is critical, consider GDS Shortest Path or BFS expansions. Let us know if you’d like assistance with a workable approach.
637-642: Average shortest path not yet implemented.
Likewise, GDS offers built-in algorithms for average path length. Let us know if you’d like to integrate it.
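As a rough illustration of the BFS approach mentioned for diameter and average shortest path length, here is a minimal client-side sketch. It assumes the graph has already been loaded into a plain adjacency dictionary (a hypothetical representation, not what the adapter uses) and it does not call GDS. It averages only over reachable ordered pairs, which is one possible convention for disconnected graphs.

```python
from collections import deque


def bfs_distances(adjacency: dict, source) -> dict:
    """Hop counts from source to every node reachable from it."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def diameter_and_avg_path_length(adjacency: dict):
    """Return (diameter, average shortest path length) over reachable ordered pairs.

    Returns (None, None) when no pair of distinct nodes is connected.
    """
    longest = 0
    total = 0
    pairs = 0
    for source in adjacency:
        for target, hops in bfs_distances(adjacency, source).items():
            if target == source:
                continue
            longest = max(longest, hops)
            total += hops
            pairs += 1
    if pairs == 0:
        return None, None
    return longest, total / pairs


# Undirected path graph a - b - c - d, stored as symmetric adjacency lists.
path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
diameter, avg_length = diameter_and_avg_path_length(path_graph)
```

One BFS per node costs O(V · (V + E)) overall, so this is only practical for small or medium graphs; for large graphs a database-side algorithm is likely the better fit.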
643-645: Average clustering not yet implemented.
For completeness, you may explore GDS or external libraries to compute clustering.
cognee/infrastructure/databases/graph/graph_db_interface.py (1)
59-59: Document the new parameter include_optional.
Adding a short docstring describing its usage will help future maintainers understand which metrics are impacted by this flag.
cognee/modules/data/methods/store_descriptive_metrics.py (1)
26-29: Add validation for the include_optional parameter.
Consider adding validation for the include_optional parameter to ensure it's a boolean value.

     async def store_descriptive_metrics(data_points: list[DataPoint], include_optional: bool):
    +    if not isinstance(include_optional, bool):
    +        raise ValueError("include_optional must be a boolean value")
         db_engine = get_relational_engine()
         graph_engine = await get_graph_engine()
         graph_metrics = await graph_engine.get_graph_metrics(include_optional)

cognee/api/v1/cognify/cognify_v2.py (1)
168-168: Consider making include_optional configurable.
The include_optional parameter is hardcoded to True. Consider making this configurable through the cognify config to allow flexibility in whether optional metrics are computed.

    -        Task(store_descriptive_metrics, include_optional=True),
    +        Task(store_descriptive_metrics, include_optional=cognee_config.include_optional_metrics),

cognee/infrastructure/databases/graph/networkx/adapter.py (1)
416-422: Improve error handling in clustering coefficient calculation.
The current implementation swallows exception details. Consider logging the full exception traceback for better debugging.

     def _get_avg_clustering(graph):
         try:
             return nx.average_clustering(nx.DiGraph(graph))
         except Exception as e:
    -        logger.warning("Failed to calculate clustering coefficient: %s", e)
    +        logger.warning("Failed to calculate clustering coefficient", exc_info=True)
             return None
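To show what `exc_info=True` adds to the log record, here is a small self-contained sketch. It uses a stand-in function (`safe_ratio`, hypothetical) instead of the clustering calculation so that it runs with the standard library alone; the logger name is also made up for the example.

```python
import io
import logging

logger = logging.getLogger("clustering_demo")  # hypothetical logger name


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None if the division fails."""
    try:
        return numerator / denominator
    except Exception:
        # exc_info=True attaches the active exception, so the formatter
        # appends the full traceback after the message.
        logger.warning("Failed to calculate ratio", exc_info=True)
        return None


# Capture log output in memory so the effect can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
logger.addHandler(handler)
logger.setLevel(logging.WARNING)
logger.propagate = False

result = safe_ratio(1, 0)
log_text = stream.getvalue()
```

After the call, `log_text` holds the warning message followed by the traceback, whereas formatting the exception with `%s` would record only its message string.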
🛑 Comments failed to post (2)
cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (1)
647-649: ⚠️ Potential issue
Potential index/key error in node/edge data extraction.
nodes[0]["nodes"] or edges[0]["elements"] might raise an exception if the query returns an empty list or no matching keys. Consider validating non-empty results.

    num_nodes = len(nodes[0].get("nodes", [])) if nodes and "nodes" in nodes[0] else 0
    num_edges = len(edges[0].get("elements", [])) if edges and "elements" in edges[0] else 0

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
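The same guard can be factored into a small helper so both counts share one code path. This is a sketch under the assumption that query results arrive as a list of dict-like rows; `count_items` is a hypothetical name and not part of the adapter.

```python
def count_items(rows, key):
    """Return len(rows[0][key]), or 0 if rows is empty or the key is missing or None."""
    if not rows:
        return 0
    first = rows[0]
    items = first.get(key) if isinstance(first, dict) else None
    return len(items) if items is not None else 0


# Example inputs covering the failure modes named in the review comment.
num_nodes = count_items([{"nodes": ["n1", "n2", "n3"]}], "nodes")
num_edges = count_items([], "elements")
missing_key = count_items([{"other": []}], "elements")
```

Unlike the inline conditional, the helper also treats a `None` value under the key as zero items, which may or may not be the desired behavior for a given query.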
cognee/infrastructure/databases/graph/networkx/adapter.py (1)
442-447: 🛠️ Refactor suggestion
Use None instead of -1 for missing optional metrics.
Using -1 as a sentinel value for missing optional metrics could be misleading as it might be interpreted as a valid metric value. Consider using None instead.

     optional_metrics = {
    -    "num_selfloops": -1,
    -    "diameter": -1,
    -    "avg_shortest_path_length": -1,
    -    "avg_clustering": -1,
    +    "num_selfloops": None,
    +    "diameter": None,
    +    "avg_shortest_path_length": None,
    +    "avg_clustering": None,
     }

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
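A short sketch of why the sentinel matters downstream: if stored metrics are later aggregated, a -1 placeholder is silently averaged in, while None can be filtered out explicitly. The rows and the `average_metric` helper below are hypothetical and only illustrate the point.

```python
from statistics import mean

# Hypothetical metric rows from three runs; the second run skipped optional metrics.
runs_with_sentinel = [{"diameter": 4}, {"diameter": -1}, {"diameter": 6}]
runs_with_none = [{"diameter": 4}, {"diameter": None}, {"diameter": 6}]


def average_metric(runs, name):
    """Average a metric over runs, ignoring runs where it is None."""
    values = [run[name] for run in runs if run[name] is not None]
    return mean(values) if values else None


skewed = average_metric(runs_with_sentinel, "diameter")  # -1 is averaged in
correct = average_metric(runs_with_none, "diameter")     # None is skipped
```

With the -1 sentinel the average comes out lower than either real measurement's midpoint, because the placeholder is indistinguishable from data; with None the skipped run simply drops out.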
81a4aa3 to f2ad1d4
a1ffeca to 3e67828
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 9573981 | Triggered | Generic Password | 91b42ab | .env.template | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace the secret and store it safely, following best practices for secret storage.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act: you might completely break other contributing developers' workflows, and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection as a pre-commit hook to catch secrets before they leave your machine and to ease remediation.
Description
DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin