
Conversation


@ParagGhatage ParagGhatage commented Jul 3, 2025

This PR introduces the store_and_send node at the final stage of the LangGraph pipeline. It performs the following key tasks:

  1. Takes pipeline state as input.
  2. Chunks the cleaned_text for better embedding efficiency.
  3. Generates embeddings for each chunk using the embedding model.
  4. Stores the embeddings, along with their metadata, in the Pinecone vector DB.
  5. Sends the entire updated state to the frontend for UI rendering or further use.

🔧 How it Works

  • Input: state dictionary, expected to include:

    • article_id
    • cleaned_text
    • keywords
    • sentiment
    • fact_check
    • perspective, counter_perspective, etc.
  • Steps:

    1. Text is split into chunks.
    2. Embedding for each chunk is created with a unique ID.
    3. Each embedding is saved in Pinecone under a namespace keyed by article_id.
    4. Full state is returned for final use in the frontend (sketched below).
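
In code, the node boils down to roughly this (a simplified sketch built from the helper modules added in this PR; error handling is omitted and exact signatures may differ):

```python
from app.modules.vector_store.chunk_rag_data import chunk_rag_data
from app.modules.vector_store.embed import embed_chunks
from app.utils.store_vectors import store


def store_and_send(state: dict) -> dict:
    chunks = chunk_rag_data(state)                 # 1. split cleaned_text into chunks
    vectors = embed_chunks(chunks)                 # 2. one embedding per chunk, each with a unique ID
    store(vectors, namespace=state["article_id"])  # 3. upsert into Pinecone, namespaced by article_id
    return {**state, "status": "success"}          # 4. full state goes back to the frontend
```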

✅ Checklist

  • Chunking logic tested
  • Embedding creation verified
  • Pinecone upsert successful
  • Final state returned correctly

Summary by CodeRabbit

  • New Features

    • Automated factual claim extraction and verification using AI and SerpAPI web search.
    • Structured counter-perspective generation with detailed reasoning and AI-based scoring.
    • Utilities for data chunking, text embedding, vector storage, and unique article ID generation.
    • Pinecone vector store integration for efficient embedding storage and retrieval.
  • Improvements

    • Enhanced sentiment analysis and perspective scoring with advanced AI models.
    • Improved error handling and structured outputs across the fact-checking pipeline.
    • Updated control flow with typed state management for robustness.
    • Refined prompt templates and structured output parsing for perspective generation.
    • Frontend loading now awaits analysis API response before starting progress animation.
    • Analysis results displayed on results page by loading stored session data.
    • Added CORS middleware to backend for cross-origin requests.
    • Introduced backend startup script and Dockerfile for streamlined deployment.
  • Chores

    • Added dependencies for AI, web search, vector storage, and NLP tools.
    • Automatically download required NLTK data packages if missing.


coderabbitai bot commented Jul 3, 2025

Walkthrough

This update adds a fact-checking pipeline with LLM-based claim extraction and verification using the Groq API, integrated with SerpAPI web search. The LangGraph workflow is enhanced with typed state and updated nodes for sentiment, perspective generation, and judgment using LLMs. Vector storage is implemented via chunking, embedding, and Pinecone integration. NLTK dependencies and prompt templates are also introduced.

Changes

File(s) Change Summary
app/modules/facts_check/llm_processing.py Added LLM claim extraction and fact verification functions using Groq API.
app/modules/facts_check/web_search.py Added SerpAPI-based web search function with API key validation.
app/modules/langgraph_builder.py Updated build_langgraph with typed state, replaced node functions, used add_conditional_edges, and updated terminal edge.
app/modules/langgraph_nodes/fact_check.py Replaced placeholder search with run_fact_check_pipeline call; updated error handling and output structure.
app/modules/langgraph_nodes/generate_perspective.py Replaced LangChain LLMChain with structured Groq LLM and new prompt; added Pydantic output model; changed input formatting.
app/modules/langgraph_nodes/judge.py Replaced dummy scoring with Groq LLM scoring using prompt; added error handling and parsing of numeric score.
app/modules/langgraph_nodes/sentiment.py Reduced max tokens in Groq chat call; normalized sentiment output to lowercase.
app/modules/langgraph_nodes/store_and_send.py Replaced save_to_vector_db with chunking, embedding, and storing steps; added detailed error handling and prints.
app/modules/scraper/cleaner.py Added NLTK import and auto-download of 'stopwords' and 'punkt_tab' corpora if missing.
app/modules/vector_store/chunk_rag_data.py Added chunk_rag_data function to create chunks from article and facts with validation and metadata.
app/modules/vector_store/embed.py Added embedding function using SentenceTransformer model for text chunks.
app/db/vector_store.py Added Pinecone client initialization, index creation/check, and index connection with error handling.
app/utils/fact_check_utils.py Added run_fact_check_pipeline to orchestrate claim extraction, web search, and fact verification with delays.
app/utils/generate_chunk_id.py Added generate_id utility to create deterministic SHA-256 based IDs for text (sketched below the table).
app/utils/prompt_templates.py Added generation_prompt template for generating counter-perspectives with JSON output format.
app/utils/store_vectors.py Added store function to upsert vectors into Pinecone index with logging and error handling.
pyproject.toml Added dependencies: duckduckgo-search, google-search-results, langchain-community, langchain-groq, nltk, pinecone, pinecone-client, sentence-transformers.
frontend/app/analyze/loading/page.tsx Refactored useEffect to await API call before starting progress and step intervals; added error handling and redirects.
frontend/app/analyze/results/page.tsx Added state to load and display analysis results from sessionStorage as formatted JSON above header.
frontend/package.json Added axios dependency for HTTP requests.
new-backend/main.py Added CORS middleware with permissive settings to FastAPI app.
new-backend/start.sh Added startup shell script to sync environment and run backend server.
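
Several of the new utilities are small and self-contained. For instance, the deterministic ID helper described for app/utils/generate_chunk_id.py could be as little as the following sketch (illustrative, not necessarily the PR's exact code):

```python
import hashlib


def generate_id(text: str) -> str:
    # Deterministic: identical text always hashes to the same ID, so
    # re-ingesting the same chunk overwrites rather than duplicates it.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```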

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant LangGraph
    participant FactCheckNode
    participant LLMProcessing
    participant WebSearch
    participant FactVerifier
    participant PerspectiveGen
    participant Judge
    participant Chunker

    User->>LangGraph: Submit article
    LangGraph->>FactCheckNode: Process state
    FactCheckNode->>LLMProcessing: Extract claims
    LLMProcessing->>FactCheckNode: Return claims
    FactCheckNode->>WebSearch: Search for each claim
    WebSearch->>FactCheckNode: Return search results
    FactCheckNode->>FactVerifier: Verify claims with evidence
    FactVerifier->>FactCheckNode: Return verdicts
    FactCheckNode->>LangGraph: Return facts
    LangGraph->>PerspectiveGen: Generate counter-perspective
    PerspectiveGen->>LangGraph: Return perspective
    LangGraph->>Judge: Score perspective
    Judge->>LangGraph: Return score
    LangGraph->>Chunker: Chunk data for storage
    Chunker->>LangGraph: Return chunks
    LangGraph->>User: Send results

Possibly related PRs

  • AOSSIE-Org/Perspective#99: Establishes the initial LangGraph workflow and dummy fact-check node, which are now replaced by the new LLM-based fact-checking pipeline in this PR.

Poem

In the warren of code, a new path unfolds,
With LLMs and search, the truth it beholds.
Claims are extracted, then checked with care,
Perspectives generated, judgments fair.
Chunks for the future, all neat and compact—
This bunny’s delighted: the facts are intact! 🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 93c6793 and e595a45.

📒 Files selected for processing (3)
  • frontend/app/analyze/loading/page.tsx (2 hunks)
  • new-backend/Dockerfile (1 hunks)
  • new-backend/main.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • new-backend/Dockerfile
🚧 Files skipped from review as they are similar to previous changes (2)
  • new-backend/main.py
  • frontend/app/analyze/loading/page.tsx

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 18

🧹 Nitpick comments (12)
new-backend/app/modules/langgraph_nodes/sentiment.py (1)

35-35: Consider increasing max_tokens for reliability.

While 3 tokens works for single-word responses, it might be too restrictive if the model occasionally includes punctuation or formatting. Consider using 5-10 tokens for better reliability.

-            max_tokens=3,
+            max_tokens=5,
new-backend/app/utils/prompt_templates.py (1)

4-5: Fix string formatting issue.

There's an unnecessary line break in the string that creates awkward formatting.

-You are an AI assistant that generates a well-reasoned '
-'counter-perspective to a given article.
+You are an AI assistant that generates a well-reasoned counter-perspective to a given article.
new-backend/app/modules/langgraph_nodes/store_and_send.py (1)

8-8: Remove debug print statement.

The debug print statement should be removed from production code as it may expose sensitive state information.

-        print(state)
new-backend/app/modules/scraper/cleaner.py (1)

2-2: Consider if NLTK import is actually needed.

The NLTK library is imported but not used anywhere in the current clean_extracted_text function, which only uses regex operations. If NLTK functionality is planned for future use, consider adding a comment explaining the intended usage.

new-backend/app/modules/vector_store/chunk_rag_data.py (1)

27-27: Fix typo in metadata key.

The key "explaination" should be "explanation" for correct spelling.

-                    "explaination": fact["explaination"],
+                    "explanation": fact["explanation"],
new-backend/app/modules/langgraph_nodes/fact_check.py (1)

6-10: Consider removing redundant validation.

The function validates cleaned_text presence but then passes the entire state to run_fact_check_pipeline. Since the pipeline should handle missing text internally, this validation might be redundant.

Consider either removing this check or ensuring the pipeline actually requires this validation:

 def run_fact_check(state):
     try:
-        text = state.get("cleaned_text")
-
-        if not text:
-            raise ValueError("Missing or empty 'cleaned_text' in state")
-
         verifications, error_message = run_fact_check_pipeline(state)
new-backend/app/modules/langgraph_nodes/judge.py (1)

39-43: Consider validating score range before clamping.

The current approach silently clamps out-of-range values, which might hide unexpected LLM outputs. Consider explicit validation for better error visibility.

         m = re.search(r"\b(\d{1,3})\b", raw)
         if not m:
             raise ValueError(f"Couldn't parse a score from: '{raw}'")
 
-        score = max(0, min(100, int(m.group(1))))
+        score = int(m.group(1))
+        if not 0 <= score <= 100:
+            print(f"Warning: Score {score} outside expected range [0-100], clamping.")
+            score = max(0, min(100, score))
new-backend/app/modules/langgraph_builder.py (1)

52-54: Remove unnecessary trailing comma.

     graph.set_entry_point(
-                    "sentiment_analysis",
+                    "sentiment_analysis"
                     )
new-backend/app/modules/langgraph_nodes/generate_perspective.py (2)

6-6: Remove redundant variable assignment.

The intermediate prompt variable is unnecessary since generation_prompt is only used once.

-prompt = generation_prompt
-
 # ... other code ...
 
-chain = prompt | structured_llm
+chain = generation_prompt | structured_llm

Also applies to: 24-24


50-50: Fix typo in error message.

-        print(f"some error occured in generate_perspective:{e}")
+        print(f"Error occurred in generate_perspective: {e}")
new-backend/app/modules/facts_check/llm_processing.py (2)

107-109: Improve markdown code block parsing to handle case variations.

The current regex only handles lowercase "json" and might miss valid markdown blocks.

-            # Strip markdown code blocks if present
-            content = re.sub(r"^```json|```$", "", content).strip()
+            # Strip markdown code blocks if present (case-insensitive)
+            content = re.sub(r"^```(?:json|JSON)?|```$", "", content, flags=re.IGNORECASE).strip()

52-52: Maintain consistency in error logging format.

The codebase uses different styles for error messages - some with emojis (❌, 🔥) and some without. Consider standardizing the format.

-        print(f"Error in claim_extraction: {e}")
+        print(f"❌ Error in claim_extraction: {e}")

 # ... and ...

-        print(f"🔥 Error in fact_verification: {e}")
+        print(f"❌ Error in fact_verification: {e}")

Also applies to: 115-115, 132-132

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 501e9c2 and 30a5547.

⛔ Files ignored due to path filters (1)
  • new-backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (14)
  • new-backend/app/modules/facts_check/llm_processing.py (1 hunks)
  • new-backend/app/modules/facts_check/web_search.py (1 hunks)
  • new-backend/app/modules/langgraph_builder.py (5 hunks)
  • new-backend/app/modules/langgraph_nodes/fact_check.py (2 hunks)
  • new-backend/app/modules/langgraph_nodes/generate_perspective.py (2 hunks)
  • new-backend/app/modules/langgraph_nodes/judge.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/sentiment.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/store_and_send.py (1 hunks)
  • new-backend/app/modules/scraper/cleaner.py (1 hunks)
  • new-backend/app/modules/vector_store/chunk_rag_data.py (1 hunks)
  • new-backend/app/utils/fact_check_utils.py (1 hunks)
  • new-backend/app/utils/generate_chunk_id.py (1 hunks)
  • new-backend/app/utils/prompt_templates.py (1 hunks)
  • new-backend/pyproject.toml (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (4)
new-backend/app/modules/langgraph_nodes/store_and_send.py (1)
new-backend/app/modules/vector_store/chunk_rag_data.py (1)
  • chunk_rag_data (4-34)
new-backend/app/modules/langgraph_nodes/fact_check.py (1)
new-backend/app/utils/fact_check_utils.py (1)
  • run_fact_check_pipeline (8-38)
new-backend/app/modules/vector_store/chunk_rag_data.py (1)
new-backend/app/utils/generate_chunk_id.py (1)
  • generate_id (4-7)
new-backend/app/modules/langgraph_builder.py (2)
new-backend/app/modules/langgraph_nodes/sentiment.py (1)
  • run_sentiment_sdk (10-53)
new-backend/app/modules/langgraph_nodes/error_handler.py (1)
  • error_handler (3-11)
🪛 Ruff (0.11.9)
new-backend/app/modules/langgraph_nodes/store_and_send.py

9-9: Local variable chunks is assigned to but never used

Remove assignment to unused variable chunks

(F841)

🪛 Flake8 (7.2.0)
new-backend/app/modules/langgraph_nodes/store_and_send.py

[error] 9-9: local variable 'chunks' is assigned to but never used

(F841)

🔇 Additional comments (7)
new-backend/pyproject.toml (1)

10-10: All added dependencies are up to date and free of known security issues

  • duckduckgo-search (>=8.0.4)
    • Latest upstream: 2025.4.4 (released 2025-04-04)
    • No critical/high/medium vulnerabilities reported
    • Note: the package was removed from PyPI in April 2025 (non-security compliance/hardening reasons); verify your install source or consider vendoring if that poses an issue
  • langchain-community (>=0.2.0)
    • Latest upstream: 0.2.19
    • No public advisories found
  • langchain-groq (>=0.1.0)
    • Latest upstream: 0.3.4
    • No known vulnerabilities
  • nltk (>=3.5)
    • Latest upstream: 3.9.1
    • No known security issues; earlier 3.8.x releases have been yanked

No further action required unless you need to pin to exact versions or address the PyPI removal of duckduckgo-search.

new-backend/app/modules/langgraph_nodes/sentiment.py (1)

39-39: No uppercase sentiment checks detected; lowercase conversion is safe

A search for downstream usage of “Positive”, “Negative”, or “Neutral” found only:

  • The prompt in sentiment.py (“ Positive, Negative, or Neutral.”)
  • The passthrough in generate_perspective.py:
    • "sentiment": state.get("sentiment", "neutral")
  • A JSON config using "baseColor": "neutral"

No code compares against uppercase sentiment values. Lowercasing won’t break existing consumers.

new-backend/app/utils/prompt_templates.py (1)

3-32: Well-structured prompt template with clear instructions.

The prompt template is well-designed with clear instructions, appropriate placeholders, and structured JSON output format. This should work effectively for generating counter-perspectives.

new-backend/app/modules/langgraph_nodes/store_and_send.py (1)

21-24: State structure matches frontend expectations
The MyState TypedDict in langgraph_builder.py defines exactly these keys:

  • cleaned_text
  • facts
  • sentiment
  • perspective
  • score
  • retries
  • status

The store_and_send node returns {**state, "status": "success"}, preserving all existing fields and setting status correctly. No additional fields are required.

new-backend/app/modules/langgraph_nodes/fact_check.py (1)

11-20: Good integration with the new fact-checking pipeline.

The error handling structure is well-implemented and maintains consistency with the existing error response format.

new-backend/app/modules/langgraph_builder.py (2)

29-30: Method updates look good.

The changes to use run_sentiment_sdk, explicit error_handler.error_handler, add_conditional_edges, and "__end__" align with the updated API and improve code clarity.

Also applies to: 49-50, 56-61, 100-101


14-22: No type mismatch for perspective—keep as str

Verification shows that across the entire codebase:

  • The only assignments to perspective are string operations (e.g. in backend/app/services/counter_service.py).
  • There are no attribute-style accesses like obj.perspective.
  • No PerspectiveOutput class or generate_perspective function returning an object was found.

Leave the TypedDict definition as-is:

 class MyState(TypedDict):
     cleaned_text: str
     facts: list[dict]
     sentiment: str
-    perspective: dict  # proposed change—remove
+    perspective: str
     score: int
     retries: int
     status: str

Likely an incorrect or invalid review comment.

Comment on lines 1 to 2
# from app.utils.vector_store import save_to_vector_db
from app.modules.vector_store.chunk_rag_data import chunk_rag_data

💡 Verification agent

🧩 Analysis chain

Verify import path for chunk_rag_data function.

Ensure the import path is correct and the function exists at the specified location.


🏁 Script executed:

#!/bin/bash
# Verify the chunk_rag_data function exists at the imported path
fd -t f chunk_rag_data.py --exec cat {} \;

Length of output: 960


Integrate chunked data storage and clean up debug code

The import for chunk_rag_data is correct—no change needed there. The function is currently incomplete and contains leftover debug code:

• Re-enable and use your vector-store save function

- # from app.utils.vector_store import save_to_vector_db
+ from app.utils.vector_store import save_to_vector_db
  from app.modules.vector_store.chunk_rag_data import chunk_rag_data

• Remove the debug print and actually store the chunks

- chunks = chunk_rag_data(data)
- print("Chunks:", chunks)
+ chunks = chunk_rag_data(data)
+ save_to_vector_db(chunks)

• Add error handling around both chunking and storage calls to surface failures
• Ensure any unused variables are removed once storage is implemented

File to update:

  • new-backend/app/modules/langgraph_nodes/store_and_send.py
📝 Committable suggestion

Suggested change:

# File: new-backend/app/modules/langgraph_nodes/store_and_send.py
from app.utils.vector_store import save_to_vector_db
from app.modules.vector_store.chunk_rag_data import chunk_rag_data


def store_and_send(data):
    # … any pre-processing …
    chunks = chunk_rag_data(data)
    save_to_vector_db(chunks)
    # … any post-processing …
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/store_and_send.py at the beginning of
the file, re-enable the import for the vector-store save function by
uncommenting it. In the function where chunk_rag_data is used, remove any debug
print statements and replace them with calls to the vector-store save function
to store the chunked data. Add try-except blocks around both the chunking and
storage operations to catch and log any errors that occur. Finally, remove any
variables that become unused after implementing the storage logic to keep the
code clean.

Comment on lines +4 to +10
try:
    nltk.data.find('corpora/stopwords')
    nltk.data.find('corpora/punkt_tab')

except LookupError:
    nltk.download('stopwords')
    nltk.download('punkt_tab')

🛠️ Refactor suggestion

Move NLTK data initialization to avoid import-time delays.

Downloading NLTK data at module import time can significantly slow application startup. Consider moving this logic to a dedicated initialization function or lazy-loading approach.

-try:
-    nltk.data.find('corpora/stopwords')
-    nltk.data.find('corpora/punkt_tab')
-
-except LookupError:
-    nltk.download('stopwords')
-    nltk.download('punkt_tab')
+def ensure_nltk_data():
+    """Ensure required NLTK data is available, downloading if necessary."""
+    try:
+        nltk.data.find('corpora/stopwords')
+        nltk.data.find('corpora/punkt_tab')
+    except LookupError:
+        nltk.download('stopwords')
+        nltk.download('punkt_tab')

Then call ensure_nltk_data() when NLTK functionality is actually needed.
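
For example, a hypothetical caller that actually needs the corpora might look like this (sketch only; the current clean_extracted_text is regex-based):

```python
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize


def remove_stopwords(text: str) -> str:
    ensure_nltk_data()  # lazy: the download cost is paid on first use, not at import
    stop = set(stopwords.words("english"))
    return " ".join(w for w in word_tokenize(text) if w.lower() not in stop)
```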

🤖 Prompt for AI Agents
In new-backend/app/modules/scraper/cleaner.py around lines 4 to 10, the NLTK
data check and download happen at import time, causing startup delays. Refactor
this code by moving the NLTK data initialization into a separate function, such
as ensure_nltk_data(), that performs the check and downloads if necessary. Then,
call this function only when NLTK functionality is required, avoiding delays
during module import.

Comment on lines +20 to +27
prompt = f"""
You are an expert evaluator. Please rate the following counter-perspective
on originality, reasoning quality, and factual grounding. Provide ONLY
a single integer score from 0 (very poor) to 100 (excellent).

=== Perspective to score ===
{text}
"""

💡 Verification agent

🧩 Analysis chain

Clarify whether evaluating "perspective" or "counter-perspective".

The prompt mentions "counter-perspective" but the code extracts and processes "perspective". This inconsistency could affect the scoring accuracy.


🏁 Script executed:

#!/bin/bash
# Check how perspective is used across the codebase
rg -A 3 -B 3 "counter-perspective|perspective" --type py

Length of output: 20874


Align “perspective” terminology in judge prompt

There’s a mismatch between the code’s perspective attribute (which holds the generated counter-perspective) and the prompt wording. To avoid confusion, update the wording in judge.py (or rename the field) so they match.

• File: new-backend/app/modules/langgraph_nodes/judge.py
• Change the prompt wording on line 21 from “counter-perspective” to “perspective” (or, conversely, rename the attribute to counter_perspective across your models).
• Example diff:

       prompt = f"""
-You are an expert evaluator. Please rate the following counter-perspective
+You are an expert evaluator. Please rate the following perspective
on originality, reasoning quality, and factual grounding. Provide ONLY
a single integer score from 0 (very poor) to 100 (excellent).

=== Perspective to score ===
{text}
"""

• (Optional) For full consistency, consider renaming the PerspectiveOutput.perspective field to counter_perspective in generate_perspective.py and update any downstream references (e.g. vector-store metadata).

🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/judge.py around lines 20 to 27, the
prompt text uses "counter-perspective" while the code attribute is named
"perspective," causing inconsistency. To fix this, update the prompt wording on
line 21 to replace "counter-perspective" with "perspective" so the terminology
matches. Optionally, for full consistency, rename the attribute in related files
like generate_perspective.py and update all references accordingly.

Comment on lines +40 to +42
facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
"Verdict: {f['verdict']}\nExplanation: "
"{f['explanation']}" for f in state["facts"]])

⚠️ Potential issue

Fix f-string formatting bug.

Lines 41-42 are not f-strings but contain {} placeholders. This will result in literal {f['verdict']} text instead of the actual values.

-        facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
-                               "Verdict: {f['verdict']}\nExplanation: "
-                               "{f['explanation']}" for f in state["facts"]])
+        facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
+                               f"Verdict: {f['verdict']}\nExplanation: "
+                               f"{f['explanation']}" for f in state["facts"]])
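
For reference, Python applies interpolation only to literals that carry the f prefix; adjacent plain literals are concatenated verbatim, as a quick REPL check shows:

```python
>>> f"Claim: {1 + 1}\n" "Verdict: {2 + 2}"
'Claim: 2\nVerdict: {2 + 2}'
```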
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/generate_perspective.py around lines
40 to 42, the string concatenation uses curly braces for variable placeholders
but the strings are not marked as f-strings, causing the placeholders to be
treated as literal text. Fix this by adding the 'f' prefix to all strings
containing placeholders so that the expressions inside curly braces are properly
evaluated and formatted.

Comment on lines +7 to +9
load_dotenv()

client = Groq(api_key=os.getenv("GROQ_API_KEY"))

🛠️ Refactor suggestion

Add validation for required API key.

The code should verify that the API key exists to fail fast with a clear error message.

 load_dotenv()
 
-client = Groq(api_key=os.getenv("GROQ_API_KEY"))
+api_key = os.getenv("GROQ_API_KEY")
+if not api_key:
+    raise ValueError("GROQ_API_KEY environment variable is not set")
+client = Groq(api_key=api_key)
🤖 Prompt for AI Agents
In new-backend/app/modules/facts_check/llm_processing.py around lines 7 to 9,
the code initializes the Groq client with an API key from the environment
without checking if the key exists. Add a validation step after loading the
environment variables to check if the "GROQ_API_KEY" is present. If it is
missing, raise an exception or exit with a clear error message indicating the
missing API key to ensure the application fails fast and clearly.

Comment on lines +125 to +129
        return {
            "claim": claim,
            "verifications": results_list,
            "status": "success",
        }

⚠️ Potential issue

Remove incorrect "claim" field from return value.

The function returns "claim": claim which uses the last claim from the loop. This is incorrect when processing multiple claims. The function already returns all verifications in the list.

         return {
-            "claim": claim,
             "verifications": results_list,
             "status": "success",
         }
🤖 Prompt for AI Agents
In new-backend/app/modules/facts_check/llm_processing.py around lines 125 to
129, remove the "claim" field from the returned dictionary because it
incorrectly uses only the last claim from the loop. Instead, return only the
"verifications" list and the "status" to accurately represent all processed
claims.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
new-backend/app/modules/langgraph_nodes/store_and_send.py (1)

9-17: Complete the implementation - chunks are created but not used.

The function creates chunks but doesn't store them in Pinecone as mentioned in the PR objectives. This is the same issue flagged in previous reviews and remains unresolved.

Based on the PR objectives, this function should:

  1. Generate embeddings for each chunk
  2. Store embeddings in Pinecone vector database
  3. Use the chunks variable that's currently unused

The current implementation is incomplete and doesn't fulfill the PR requirements.

🧹 Nitpick comments (2)
new-backend/app/modules/langgraph_nodes/store_and_send.py (2)

8-8: Remove debug print statement.

The debug print statement should be removed from production code.

-        print(state)

12-14: Improve exception handling with proper chaining.

Use raise ... from err to preserve the original exception context for better debugging.

-        except KeyError as e:
-            raise Exception(f"Missing required data field for chunking: {e}")
-        except Exception as e:
-            raise Exception(f"Failed to chunk data: {e}")
+        except KeyError as e:
+            raise Exception(f"Missing required data field for chunking: {e}") from e
+        except Exception as e:
+            raise Exception(f"Failed to chunk data: {e}") from e
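
For context, the from e suffix preserves the causal chain in tracebacks; a minimal illustration with a hypothetical helper:

```python
def require_text(data: dict) -> str:
    try:
        return data["cleaned_text"]
    except KeyError as e:
        # With "from e", the traceback reads "The above exception was the direct
        # cause of the following exception" instead of "During handling of the
        # above exception, another exception occurred".
        raise Exception(f"Missing required data field for chunking: {e}") from e
```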
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 30a5547 and a376121.

📒 Files selected for processing (3)
  • new-backend/app/modules/langgraph_nodes/store_and_send.py (1 hunks)
  • new-backend/app/modules/vector_store/chunk_rag_data.py (1 hunks)
  • new-backend/app/utils/generate_chunk_id.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • new-backend/app/modules/vector_store/chunk_rag_data.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
new-backend/app/modules/langgraph_nodes/store_and_send.py (1)
new-backend/app/modules/vector_store/chunk_rag_data.py (1)
  • chunk_rag_data (4-65)
🪛 Ruff (0.11.9)
new-backend/app/modules/langgraph_nodes/store_and_send.py

10-10: Local variable chunks is assigned to but never used

Remove assignment to unused variable chunks

(F841)


12-12: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


14-14: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🪛 Flake8 (7.2.0)
new-backend/app/modules/langgraph_nodes/store_and_send.py

[error] 10-10: local variable 'chunks' is assigned to but never used

(F841)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
new-backend/app/modules/langgraph_nodes/store_and_send.py (1)

1-3: Complete the implementation - vector storage still missing.

The save_to_vector_db import remains commented out while chunking and embedding have been added. This confirms the incomplete implementation identified in previous reviews.

🧹 Nitpick comments (1)
new-backend/app/modules/langgraph_nodes/store_and_send.py (1)

5-5: Fix formatting: add blank line before function definition.

PEP 8 requires two blank lines before top-level function definitions.

+

 def store_and_send(state):
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a376121 and e53a7c4.

⛔ Files ignored due to path filters (1)
  • new-backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • new-backend/app/modules/langgraph_nodes/store_and_send.py (1 hunks)
  • new-backend/app/modules/vector_store/chunk_rag_data.py (1 hunks)
  • new-backend/app/modules/vector_store/embed.py (1 hunks)
  • new-backend/app/utils/generate_chunk_id.py (1 hunks)
  • new-backend/pyproject.toml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • new-backend/app/modules/vector_store/chunk_rag_data.py
  • new-backend/app/utils/generate_chunk_id.py
  • new-backend/pyproject.toml
🧰 Additional context used
🧬 Code Graph Analysis (1)
new-backend/app/modules/langgraph_nodes/store_and_send.py (2)
new-backend/app/modules/vector_store/chunk_rag_data.py (1)
  • chunk_rag_data (4-73)
new-backend/app/modules/vector_store/embed.py (1)
  • embed_chunks (7-30)
🪛 Flake8 (7.2.0)
new-backend/app/modules/langgraph_nodes/store_and_send.py

[error] 5-5: expected 2 blank lines, found 1

(E302)

🪛 Ruff (0.11.9)
new-backend/app/modules/langgraph_nodes/store_and_send.py

12-12: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


14-14: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


20-20: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🔇 Additional comments (3)
new-backend/app/modules/vector_store/embed.py (3)

1-4: Good choice of embedding model and efficient global instance.

The SentenceTransformer model "all-MiniLM-L6-v2" is a well-balanced choice for general text embedding tasks, offering good performance with reasonable computational requirements. Using a global instance is efficient and avoids reloading the model on each function call.


7-10: Well-designed function signature and early return optimization.

The function has clear type hints and handles the empty input case efficiently with an early return.


13-30: Robust validation and efficient embedding generation.

The validation logic provides clear error messages with specific indices, making debugging easier. The embedding generation efficiently processes all texts at once, and the output format with "id", "values", and "metadata" is well-structured for vector database storage.
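
Putting those observations together, the function's shape is roughly the following reconstruction (the chunk dict keys are assumptions):

```python
from sentence_transformers import SentenceTransformer

# Global instance so the model is loaded once per process.
model = SentenceTransformer("all-MiniLM-L6-v2")


def embed_chunks(chunks: list[dict]) -> list[dict]:
    if not chunks:
        return []
    texts = [c["text"] for c in chunks]  # assumed key for the chunk text
    embeddings = model.encode(texts)     # one batched call for all chunks
    return [
        {"id": c["id"], "values": emb.tolist(), "metadata": c.get("metadata", {})}
        for c, emb in zip(chunks, embeddings)
    ]
```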

Comment on lines +8 to +14
        print(state)
        try:
            chunks = chunk_rag_data(state)
        except KeyError as e:
            raise Exception(f"Missing required data field for chunking: {e}")
        except Exception as e:
            raise Exception(f"Failed to chunk data: {e}")

🛠️ Refactor suggestion

Remove debug code and improve exception handling.

The print statement should be removed for production code. Additionally, exception handling should use proper chaining to preserve the original error context.

-        print(state)
         try:
             chunks = chunk_rag_data(state)
         except KeyError as e:
-            raise Exception(f"Missing required data field for chunking: {e}")
+            raise Exception(f"Missing required data field for chunking: {e}") from e
         except Exception as e:
-            raise Exception(f"Failed to chunk data: {e}")
+            raise Exception(f"Failed to chunk data: {e}") from e
🧰 Tools
🪛 Ruff (0.11.9)

12-12: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


14-14: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/store_and_send.py around lines 8 to
14, remove the print(state) debug statement and update the exception handling to
use proper exception chaining by adding "from e" when re-raising exceptions,
preserving the original error context.

Comment on lines +15 to +20
        try:
            vectors = embed_chunks(chunks)
            if vectors:
                print("embedding generated successfully!")
        except Exception as e:
            raise Exception(f"failed to embed chunks: {e}")

🛠️ Refactor suggestion

Improve exception handling consistency.

Apply the same exception chaining improvement here as recommended for the chunking section.

         try:
             vectors = embed_chunks(chunks)
             if vectors:
                 print("embedding generated successfully!")
         except Exception as e:
-            raise Exception(f"failed to embed chunks: {e}")
+            raise Exception(f"failed to embed chunks: {e}") from e
🧰 Tools
🪛 Ruff (0.11.9)

20-20: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/store_and_send.py around lines 15 to
20, the exception handling raises a new Exception without preserving the
original traceback. Modify the except block to use "raise Exception(f'failed to
embed chunks: {e}') from e" to maintain exception chaining and improve
debugging.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

♻️ Duplicate comments (5)
new-backend/app/modules/facts_check/llm_processing.py (2)

7-9: Add validation for required API key.

The code should verify that the API key exists to fail fast with a clear error message.

 load_dotenv()
 
-client = Groq(api_key=os.getenv("GROQ_API_KEY"))
+api_key = os.getenv("GROQ_API_KEY")
+if not api_key:
+    raise ValueError("GROQ_API_KEY environment variable is not set")
+client = Groq(api_key=api_key)

119-123: Remove incorrect "claim" field from return value.

The function returns "claim": claim which uses the last claim from the loop. This is incorrect when processing multiple claims. The function already returns all verifications in the list.

         return {
-            "claim": claim,
             "verifications": results_list,
             "status": "success",
         }
new-backend/app/modules/langgraph_nodes/store_and_send.py (3)

9-9: Remove debug print statement.

Debug print statements should be removed from production code or replaced with proper logging.

Apply this diff to remove the debug code:

-        print(state)

10-15: Improve exception chaining for better error traceability.

The exception handling should preserve the original error context as recommended by the static analysis tool.

Apply this diff to improve exception chaining:

         except KeyError as e:
-            raise Exception(f"Missing required data field for chunking: {e}")
+            raise Exception(f"Missing required data field for chunking: {e}") from e
         except Exception as e:
-            raise Exception(f"Failed to chunk data: {e}")
+            raise Exception(f"Failed to chunk data: {e}") from e

16-21: Remove debug print and improve exception chaining.

The debug print statement should be removed and exception chaining should be improved.

Apply this diff to address both issues:

             if vectors:
-                print("embedding generated successfully!")
         except Exception as e:
-            raise Exception(f"failed to embed chunks: {e}")
+            raise Exception(f"failed to embed chunks: {e}") from e
🧹 Nitpick comments (3)
new-backend/app/modules/facts_check/llm_processing.py (2)

78-80: Fix missing space in system prompt.

The system prompt has a missing space that makes it grammatically incorrect.

-                            "Your job is to determine whether the given"
-                            " claim is True, False"
-                            "based on the provided web search evidence."
+                            "Your job is to determine whether the given "
+                            "claim is True or False "
+                            "based on the provided web search evidence."

107-109: Consider improving markdown stripping logic.

The current regex pattern may not handle all markdown variations. Consider using a more robust pattern or a dedicated markdown parser.

-            # Strip markdown code blocks if present
-            content = re.sub(r"^```json|```$", "", content).strip()
+            # Strip markdown code blocks if present
+            content = re.sub(r"^```(?:json)?\s*|```\s*$", "", content, flags=re.MULTILINE).strip()
new-backend/app/modules/langgraph_nodes/store_and_send.py (1)

23-24: Complete implementation achieved - remove debug print.

Great work completing the vector storage implementation! The function now properly stores vectors in Pinecone as required by the PR objectives. However, the debug print should be removed.

Apply this diff to remove the debug print:

         store(vectors)
-        print("Vectors saved to Pinecone!")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e53a7c4 and 385805f.

⛔ Files ignored due to path filters (1)
  • new-backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (7)
  • new-backend/app/db/vector_store.py (1 hunks)
  • new-backend/app/modules/facts_check/llm_processing.py (1 hunks)
  • new-backend/app/modules/facts_check/web_search.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/store_and_send.py (1 hunks)
  • new-backend/app/utils/fact_check_utils.py (1 hunks)
  • new-backend/app/utils/store_vectors.py (1 hunks)
  • new-backend/pyproject.toml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • new-backend/app/modules/facts_check/web_search.py
  • new-backend/pyproject.toml
  • new-backend/app/utils/fact_check_utils.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
new-backend/app/modules/langgraph_nodes/store_and_send.py (3)
new-backend/app/modules/vector_store/chunk_rag_data.py (1)
  • chunk_rag_data (4-73)
new-backend/app/modules/vector_store/embed.py (1)
  • embed_chunks (7-30)
new-backend/app/utils/store_vectors.py (1)
  • store (10-32)
🪛 Ruff (0.11.9)
new-backend/app/utils/store_vectors.py

32-32: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

new-backend/app/modules/langgraph_nodes/store_and_send.py

13-13: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


15-15: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


21-21: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

new-backend/app/db/vector_store.py

14-14: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


40-41: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🔇 Additional comments (8)
new-backend/app/modules/facts_check/llm_processing.py (1)

12-58: LGTM: Claim extraction function is well-implemented.

The function properly handles state input validation, error handling, and API interaction. The prompt structure is clear and the return format correctly augments the original state.

new-backend/app/db/vector_store.py (3)

5-7: LGTM - Proper environment variable validation.

The environment variable handling correctly validates the presence of the required API key and raises an appropriate error if missing.


16-19: LGTM - Appropriate constants for vector configuration.

The constants for index name, dimensions (384), and cosine metric are well-defined and align with the embedding model specifications mentioned in the codebase.


22-34: LGTM - Proper index creation with serverless specification.

The index creation logic correctly checks for existence before creating and uses appropriate serverless specifications for AWS US East 1.
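
Taken together with the earlier notes (API-key validation, 384 dimensions, cosine metric, serverless AWS us-east-1), the bootstrap presumably looks something like this sketch (constant values are assumptions):

```python
import os

from pinecone import Pinecone, ServerlessSpec

PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
if not PINECONE_API_KEY:
    raise RuntimeError("PINECONE_API_KEY environment variable is not set")

pc = Pinecone(api_key=PINECONE_API_KEY)

INDEX_NAME = "perspective-index"  # assumed name
DIMENSION = 384                   # matches all-MiniLM-L6-v2 embeddings
METRIC = "cosine"

if INDEX_NAME not in pc.list_indexes().names():
    pc.create_index(
        name=INDEX_NAME,
        dimension=DIMENSION,
        metric=METRIC,
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )

index = pc.Index(INDEX_NAME)
```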

new-backend/app/utils/store_vectors.py (2)

10-21: LGTM - Well-documented function with clear interface.

The function signature, documentation, and parameter descriptions are well-structured and clearly communicate the expected behavior and exceptions.


22-23: LGTM - Proper input validation.

The validation correctly checks for empty vectors list and raises an appropriate ValueError.

new-backend/app/modules/langgraph_nodes/store_and_send.py (2)

1-3: LGTM - Proper imports for the vector storage pipeline.

The imports correctly reference the chunking, embedding, and storage functions needed for the complete pipeline implementation.


6-37: Excellent implementation of the complete vector storage pipeline.

The function now successfully implements the full pipeline described in the PR objectives:

  1. ✅ Chunks the cleaned_text using chunk_rag_data
  2. ✅ Generates embeddings for each chunk using embed_chunks
  3. ✅ Stores embeddings in Pinecone using the store function
  4. ✅ Returns the updated state for frontend sync

This addresses the previously incomplete implementation and fulfills all the PR requirements. The error handling covers each step appropriately, and the function integrates well with the LangGraph pipeline.

Comment on lines +112 to +117
            try:
                parsed = json.loads(content)
            except Exception as parse_err:
                print(f"❌ LLM JSON parse error: {parse_err}")

            results_list.append(parsed)

⚠️ Potential issue

Fix undefined variable error in JSON parsing.

When JSON parsing fails, the parsed variable is not defined, but line 117 still tries to append it to results_list, which will cause a NameError.

             # Try parsing the JSON response
             try:
                 parsed = json.loads(content)
             except Exception as parse_err:
                 print(f"❌ LLM JSON parse error: {parse_err}")
+                parsed = {
+                    "verdict": "Error",
+                    "explanation": f"JSON parsing failed: {parse_err}",
+                    "original_claim": claim,
+                    "source_link": source
+                }
 
             results_list.append(parsed)
🤖 Prompt for AI Agents
In new-backend/app/modules/facts_check/llm_processing.py around lines 112 to
117, the variable 'parsed' is used outside the try-except block without
guaranteed initialization, causing a NameError if JSON parsing fails. To fix
this, initialize 'parsed' to a default value (e.g., None) before the try block
or only append 'parsed' to 'results_list' inside the try block after successful
parsing, ensuring it is defined before use.

Comment on lines +8 to +14
try:

    # Initialize Pinecone client
    pc = Pinecone(api_key=PINECONE_API_KEY)

except Exception as e:
    raise RuntimeError(f"Error occured while intialising pinecone client:{e}")

⚠️ Potential issue

Fix typo and improve exception chaining.

There's a typo in the error message and the exception handling should preserve the original error context.

Apply this diff to fix the issues:

-    raise RuntimeError(f"Error occured while intialising pinecone client:{e}")
+    raise RuntimeError(f"Error occurred while initializing pinecone client: {e}") from e
🧰 Tools
🪛 Ruff (0.11.9)

14-14: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
In new-backend/app/db/vector_store.py around lines 8 to 14, fix the typos in the
error message by changing "occured" to "occurred" and "intialising" to
"initializing" (matching the suggested diff). Also, improve exception handling by
using "raise RuntimeError(...) from e" to preserve the original exception context.

Comment on lines +36 to +41
try:
    # Connect to the index
    index = pc.Index(INDEX_NAME)
except Exception as e:
    raise RuntimeError("Error occured while "
                       f"connecting to the index {INDEX_NAME}:{e}")

⚠️ Potential issue

Fix typo and improve exception chaining.

There's a typo in the error message and the exception handling should preserve the original error context.

Apply this diff to fix the issues:

-    raise RuntimeError("Error occured while "
+    raise RuntimeError("Error occurred while "
                        f"connecting to the index {INDEX_NAME}:{e}")
+                       f"connecting to the index {INDEX_NAME}: {e}") from e

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Ruff (0.11.9)

40-41: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
In new-backend/app/db/vector_store.py around lines 36 to 41, fix the typo in the
error message by changing "occured" to "occurred" and improve exception handling
by using "raise ... from e" to preserve the original exception context when
re-raising the RuntimeError.

Comment on lines +25 to +32
    try:
        index.upsert(vectors, namespace=namespace)
        logger.info(f"Successfully stored {len(vectors)} "
                    f"vectors in namespace '{namespace}'")
    except Exception as e:
        logger.error("Failed to store "
                     f"vectors in namespace '{namespace}': {e}")
        raise RuntimeError(f"Vector storage failed: {e}")

🛠️ Refactor suggestion

Improve exception chaining for better error traceability.

The exception handling should preserve the original error context as recommended by the static analysis tool.

Apply this diff to improve exception chaining:

-        raise RuntimeError(f"Vector storage failed: {e}")
+        raise RuntimeError(f"Vector storage failed: {e}") from e
🧰 Tools
🪛 Ruff (0.11.9)

32-32: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
In new-backend/app/utils/store_vectors.py around lines 25 to 32, the exception
handling catches errors but does not preserve the original traceback. Modify the
raise statement to use "raise RuntimeError(...) from e" to chain the original
exception, which improves error traceability and debugging.
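For readers unfamiliar with exception chaining, a small self-contained demonstration of what "raise ... from e" buys you; the function names below are placeholders, not the module's real code:

def upsert(vectors):
    # Simulated low-level failure (stands in for a Pinecone call).
    raise ConnectionError("Pinecone unreachable")


def store(vectors):
    try:
        upsert(vectors)
    except Exception as e:
        # 'from e' records the ConnectionError as __cause__, so tracebacks
        # show both errors instead of only the wrapping RuntimeError.
        raise RuntimeError(f"Vector storage failed: {e}") from e


try:
    store([])
except RuntimeError as err:
    print(type(err.__cause__).__name__)  # -> ConnectionError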

@coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (1)
frontend/app/analyze/loading/page.tsx (1)

95-102: Optimize progress animation logic.

The current progress animation increments every 100ms regardless of actual progress, which may not align with the step-based progress calculation.

Apply this diff to improve progress synchronization:

-      const progressInterval = setInterval(() => {
-        setProgress((prev) => {
-          if (prev < 100) {
-            return prev + 1
-          }
-          return prev
-        })
-      }, 100)
+      const progressInterval = setInterval(() => {
+        setProgress((prev) => {
+          const targetProgress = Math.min((currentStep + 1) * 20, 100)
+          if (prev < targetProgress) {
+            return Math.min(prev + 2, targetProgress)
+          }
+          return prev
+        })
+      }, 100)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 385805f and a2dee5c.

⛔ Files ignored due to path filters (1)
  • frontend/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (3)
  • frontend/app/analyze/loading/page.tsx (2 hunks)
  • frontend/app/analyze/results/page.tsx (3 hunks)
  • frontend/package.json (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • frontend/package.json
🔇 Additional comments (2)
frontend/app/analyze/results/page.tsx (1)

23-23: LGTM! State initialization is correct.

The new state variable for analysis data is properly initialized.

frontend/app/analyze/loading/page.tsx (1)

9-9: LGTM! Axios import is appropriate.

The axios library is properly imported for making HTTP requests.

Comment on lines +78 to +83
<div className="p-4">
<h1 className="text-2xl font-bold mb-4">Analysis Results</h1>
<pre className="bg-black p-4 rounded text-sm overflow-x-auto whitespace-pre-wrap">
{JSON.stringify(analysisData, null, 2)}
</pre>
</div>

🛠️ Refactor suggestion

Improve JSON display implementation and consider security implications.

The current implementation has several issues:

  1. The debug section is positioned above the header, breaking the layout
  2. Raw JSON display may expose sensitive information
  3. No handling for null/undefined data
  4. Poor user experience with raw JSON dump

Apply this diff to improve the implementation:

-    <div className="p-4">
-      <h1 className="text-2xl font-bold mb-4">Analysis Results</h1>
-      <pre className="bg-black p-4 rounded text-sm overflow-x-auto whitespace-pre-wrap">
-        {JSON.stringify(analysisData, null, 2)}
-      </pre>
-    </div>
+    {process.env.NODE_ENV === 'development' && analysisData && (
+      <div className="p-4 mt-16">
+        <details className="mb-4">
+          <summary className="text-sm font-medium cursor-pointer">Debug: Raw Analysis Data</summary>
+          <pre className="bg-gray-100 dark:bg-gray-800 p-4 rounded text-xs overflow-x-auto whitespace-pre-wrap mt-2">
+            {JSON.stringify(analysisData, null, 2)}
+          </pre>
+        </details>
+      </div>
+    )}

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In frontend/app/analyze/results/page.tsx around lines 78 to 83, the JSON display
is currently placed above the header, shows raw JSON which may expose sensitive
data, lacks handling for null or undefined analysisData, and results in poor
user experience. To fix this, move the JSON display below the header, add
conditional rendering to handle null or undefined analysisData gracefully,
sanitize or selectively display only non-sensitive parts of analysisData, and
replace the raw JSON dump with a more user-friendly formatted view or summary.

Comment on lines +42 to +49
    const storedData = sessionStorage.getItem("analysisResult")
    if (storedData) {
      setAnalysisData(JSON.parse(storedData))
    } else {
      // fallback if user visits results page directly
      // maybe redirect or show error
      console.warn("No analysis result found")
    }

⚠️ Potential issue

Add error handling for JSON parsing and improve user experience.

The current implementation lacks error handling for JSON parsing, which could crash the component if the stored data is malformed.

Apply this diff to add proper error handling:

-    const storedData = sessionStorage.getItem("analysisResult")
-    if (storedData) {
-      setAnalysisData(JSON.parse(storedData))
-    } else {
-      // fallback if user visits results page directly
-      // maybe redirect or show error
-      console.warn("No analysis result found")
-    }
+    const storedData = sessionStorage.getItem("analysisResult")
+    if (storedData) {
+      try {
+        setAnalysisData(JSON.parse(storedData))
+      } catch (error) {
+        console.error("Failed to parse analysis data:", error)
+        // Handle corrupted data
+        sessionStorage.removeItem("analysisResult")
+        router.push("/analyze")
+      }
+    } else {
+      // fallback if user visits results page directly
+      console.warn("No analysis result found")
+      router.push("/analyze")
+    }
🤖 Prompt for AI Agents
In frontend/app/analyze/results/page.tsx around lines 42 to 49, the JSON.parse
call on storedData lacks error handling, which can cause the component to crash
if the data is malformed. Wrap the JSON.parse call in a try-catch block to catch
parsing errors, log the error, and handle the failure gracefully by either
clearing the invalid data or showing an appropriate user message to improve user
experience.

Comment on lines 64 to 66
        const res = await axios.post("http://localhost:8000/api/process", {
          url: storedUrl,
        })

🛠️ Refactor suggestion

Add request timeout and better error handling.

The API call lacks timeout configuration and detailed error handling, which could lead to poor user experience.

Apply this diff to improve the API call:

-        const res = await axios.post("http://localhost:8000/api/process", {
-          url: storedUrl,
-        })
+        const res = await axios.post(`${apiUrl}/api/process`, {
+          url: storedUrl,
+        }, {
+          timeout: 60000, // 60 seconds timeout for analysis
+          headers: {
+            'Content-Type': 'application/json'
+          }
+        })
🤖 Prompt for AI Agents
In frontend/app/analyze/loading/page.tsx around lines 64 to 66, the axios POST
request to the API lacks a timeout setting and detailed error handling. Add a
timeout option to the axios request configuration to prevent hanging requests,
and wrap the call in a try-catch block to handle errors gracefully. In the catch
block, provide meaningful error messages or fallback behavior to improve user
experience.

Comment on lines 58 to 114
const runAnalysis = async () => {
const storedUrl = sessionStorage.getItem("articleUrl")
if (storedUrl) {
setArticleUrl(storedUrl)

try {
const res = await axios.post("http://localhost:8000/api/process", {
url: storedUrl,
})

// Save response to sessionStorage
sessionStorage.setItem("analysisResult", JSON.stringify(res.data))

// optional logging
console.log("Analysis result saved")
console.log(res)
} catch (err) {
console.error("Failed to process article:", err)
router.push("/analyze") // fallback in case of error
return
}

// Progress and step simulation
const stepInterval = setInterval(() => {
setCurrentStep((prev) => {
if (prev < steps.length - 1) {
return prev + 1
} else {
clearInterval(stepInterval)
setTimeout(() => {
router.push("/analyze/results")
}, 2000)
return prev
}
})
}, 2000)

const progressInterval = setInterval(() => {
setProgress((prev) => {
if (prev < 100) {
return prev + 1
}
return prev
})
}, 100)

return () => {
clearInterval(stepInterval)
clearInterval(progressInterval)
}
} else {
// Redirect back if no URL found
router.push("/analyze")
return
}
}

runAnalysis()
}, [router])

⚠️ Potential issue

Fix critical async function and cleanup issues.

The current implementation has several critical problems:

  1. The cleanup function is incorrectly placed inside the async function
  2. Intervals may not be properly cleaned up, causing memory leaks
  3. Hard-coded localhost URL won't work in production

Apply this diff to fix the async function structure and cleanup:

-  useEffect(() => {
-  const runAnalysis = async () => {
+  useEffect(() => {
+    let stepInterval: NodeJS.Timeout
+    let progressInterval: NodeJS.Timeout
+
+    const runAnalysis = async () => {
     const storedUrl = sessionStorage.getItem("articleUrl")
     if (storedUrl) {
       setArticleUrl(storedUrl)

       try {
-        const res = await axios.post("http://localhost:8000/api/process", {
+        const apiUrl = process.env.NEXT_PUBLIC_API_URL || "http://localhost:8000"
+        const res = await axios.post(`${apiUrl}/api/process`, {
           url: storedUrl,
         })

         // Save response to sessionStorage
         sessionStorage.setItem("analysisResult", JSON.stringify(res.data))

         // optional logging
         console.log("Analysis result saved")
-        console.log(res)
       } catch (err) {
         console.error("Failed to process article:", err)
         router.push("/analyze") // fallback in case of error
         return
       }

       // Progress and step simulation
-      const stepInterval = setInterval(() => {
+      stepInterval = setInterval(() => {
         setCurrentStep((prev) => {
           if (prev < steps.length - 1) {
             return prev + 1
           } else {
             clearInterval(stepInterval)
             setTimeout(() => {
               router.push("/analyze/results")
             }, 2000)
             return prev
           }
         })
       }, 2000)

-      const progressInterval = setInterval(() => {
+      progressInterval = setInterval(() => {
         setProgress((prev) => {
           if (prev < 100) {
             return prev + 1
           }
           return prev
         })
       }, 100)

-      return () => {
-        clearInterval(stepInterval)
-        clearInterval(progressInterval)
-      }
     } else {
       router.push("/analyze")
     }
-  }
+    }

-  runAnalysis()
-}, [router])
+    runAnalysis()
+
+    return () => {
+      if (stepInterval) clearInterval(stepInterval)
+      if (progressInterval) clearInterval(progressInterval)
+    }
+  }, [router])
🤖 Prompt for AI Agents
In frontend/app/analyze/loading/page.tsx lines 58 to 114, the async function
runAnalysis returns the cleanup function itself, so React never receives it and
the intervals are not cleared, causing potential memory leaks. Also, the API URL
is hard-coded to localhost, which is not suitable for production. To fix this,
declare the interval handles in the enclosing useEffect scope, return the cleanup
function (clearing both intervals) from the useEffect itself rather than from the
async function, and replace the hard-coded URL with a configurable environment
variable or relative path to support production deployment.

@coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a2dee5c and 921cbe8.

📒 Files selected for processing (3)
  • new-backend/app/modules/facts_check/web_search.py (1 hunks)
  • new-backend/main.py (2 hunks)
  • new-backend/start.sh (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • new-backend/start.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • new-backend/app/modules/facts_check/web_search.py
🔇 Additional comments (1)
new-backend/main.py (1)

3-3: LGTM - Clean import addition.

The import follows FastAPI conventions and is necessary for the CORS middleware functionality.

Comment on lines 14 to 21
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)


⚠️ Potential issue

Security concern: Overly permissive CORS configuration.

The combination of allow_origins=["*"] with allow_credentials=True violates CORS security principles and creates potential security vulnerabilities. This configuration allows any origin to make credentialed requests to your API.

For production environments, consider this more secure configuration:

 app.add_middleware(
     CORSMiddleware,
-    allow_origins=["*"],
+    allow_origins=[
+        "http://localhost:3000",  # React dev server
+        "https://yourdomain.com",  # Production frontend
+    ],
     allow_credentials=True,
-    allow_methods=["*"],
+    allow_methods=["GET", "POST", "PUT", "DELETE"],
-    allow_headers=["*"],
+    allow_headers=["Content-Type", "Authorization"],
 )

Alternatively, if you need wildcard origins during development, set allow_credentials=False:

 app.add_middleware(
     CORSMiddleware,
     allow_origins=["*"],
-    allow_credentials=True,
+    allow_credentials=False,
     allow_methods=["*"],
     allow_headers=["*"],
 )
🤖 Prompt for AI Agents
In new-backend/main.py around lines 14 to 21, the CORS middleware is configured
with allow_origins=["*"] and allow_credentials=True, which is insecure. To fix
this, replace allow_origins=["*"] with a specific list of trusted origins for
production, or if wildcard origins are needed during development, set
allow_credentials=False to prevent credentialed requests from any origin.
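A sketch of one way to make the origin list configurable per environment, assuming a comma-separated ALLOWED_ORIGINS env var (an assumption for illustration; the PR does not define this variable):

import os

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# e.g. ALLOWED_ORIGINS="http://localhost:3000,https://yourdomain.com"
origins = [o.strip() for o in os.getenv("ALLOWED_ORIGINS", "http://localhost:3000").split(",") if o.strip()]

app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,  # explicit origins, so credentialed requests stay scoped
    allow_credentials=True,
    allow_methods=["GET", "POST"],
    allow_headers=["Content-Type", "Authorization"],
)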

@coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
new-backend/start.sh (1)

2-2: Provide a default for PORT to avoid empty output

$PORT is unset on most local machines; printing an empty value is confusing
and FastAPI will still start on its default port. Apply the defaulting shown in
the diff above.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c19351e and 80e9609.

📒 Files selected for processing (2)
  • new-backend/main.py (2 hunks)
  • new-backend/start.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • new-backend/main.py
🧰 Additional context used
🪛 Shellcheck (0.10.0)
new-backend/start.sh

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

Comment on lines 1 to 3
uv sync
echo "Render PORT is: $PORT"
uv run main.py

🛠️ Refactor suggestion

Add a shebang + defensive flags + exec to harden the script

Without an explicit shebang the runtime shell is unpredictable (SC2148).
Adding set -euo pipefail stops the build on any failure and prevents silent
errors. Re-launching with exec ensures PID 1 receives signals correctly
(e.g. in Docker).

+#!/usr/bin/env bash
+
+# Abort on error, undefined var, or failed pipeline
+set -euo pipefail
+
+# Sync Python deps
 uv sync
-echo "Render PORT is: $PORT"
-uv run main.py
+
+# Expose the port (falls back to 8000 locally)
+PORT=${PORT:-8000}
+echo "Render PORT is: $PORT"
+
+# Replace the shell with the app so signals are forwarded
+exec uv run main.py
🧰 Tools
🪛 Shellcheck (0.10.0)

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

🤖 Prompt for AI Agents
In new-backend/start.sh at lines 1 to 3, the script lacks a shebang and
defensive flags, which can cause unpredictable behavior and silent errors. Add a
shebang line (e.g., #!/bin/bash) at the top, include `set -euo pipefail` to stop
on errors and undefined variables, and use `exec` before the command that runs
the main process to ensure proper signal handling.

@coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (2)
new-backend/start.sh (2)

2-2: Add the missing -u and -o pipefail defensive flags.

Only set -e is present; without -u (undefined var) and -o pipefail (fail fast in pipelines) silent errors can slip through.
This was pointed out in a previous review but not fully applied.

-set -e
+set -euo pipefail

8-9: Preface the final command with exec so PID 1 gets signals.

uv run should replace the shell so the process receives SIGTERM/SIGINT directly (important in Docker/Render deploys).
Same recommendation was given earlier but is still missing.

-uv run main.py
+exec uv run main.py
🧹 Nitpick comments (2)
new-backend/start.sh (2)

1-1: Prefer portable shebang

#!/usr/bin/env bash is more portable than hard-coding /bin/bash, which may not exist in minimal containers.

-#!/bin/bash
+#!/usr/bin/env bash

5-5: Avoid redundant package install on every start

Re-installing uv each launch slows cold-starts and may break offline images.
Consider a conditional install, or vendor it into the image layer.

-pip install uv
+command -v uv >/dev/null 2>&1 || pip install --no-cache-dir -q uv
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 80e9609 and 93c6793.

📒 Files selected for processing (1)
  • new-backend/start.sh (1 hunks)

@ManavSarkar merged commit cc55296 into main on Jul 18, 2025 (1 check passed).