store_and_send Node: Embedding & Pinecone Storage + Frontend Sync #105

Conversation
…case before storing
Walkthrough

This update adds a fact-checking pipeline with LLM-based claim extraction and verification using the Groq API, integrated with SerpAPI web search. The LangGraph workflow is enhanced with typed states and updated nodes for sentiment, perspective generation, and judgment using LLMs. Vector storage is implemented via chunking, embedding, and Pinecone integration. NLTK dependencies and prompt templates are introduced.
Sequence Diagram(s)

sequenceDiagram
participant User
participant LangGraph
participant FactCheckNode
participant LLMProcessing
participant WebSearch
participant FactVerifier
participant PerspectiveGen
participant Judge
participant Chunker
User->>LangGraph: Submit article
LangGraph->>FactCheckNode: Process state
FactCheckNode->>LLMProcessing: Extract claims
LLMProcessing->>FactCheckNode: Return claims
FactCheckNode->>WebSearch: Search for each claim
WebSearch->>FactCheckNode: Return search results
FactCheckNode->>FactVerifier: Verify claims with evidence
FactVerifier->>FactCheckNode: Return verdicts
FactCheckNode->>LangGraph: Return facts
LangGraph->>PerspectiveGen: Generate counter-perspective
PerspectiveGen->>LangGraph: Return perspective
LangGraph->>Judge: Score perspective
Judge->>LangGraph: Return score
LangGraph->>Chunker: Chunk data for storage
Chunker->>LangGraph: Return chunks
LangGraph->>User: Send results
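For orientation, the wiring implied by this diagram can be sketched in a few lines of LangGraph. This is a hypothetical minimal reconstruction: the stub node bodies and edge list are assumptions, while the state keys and node names match the review comments further down (the real graph lives in new-backend/app/modules/langgraph_builder.py).

```python
# Hypothetical sketch only: stub bodies and edges are assumptions; the real
# wiring is in new-backend/app/modules/langgraph_builder.py.
from typing import TypedDict

from langgraph.graph import StateGraph


class MyState(TypedDict):
    cleaned_text: str
    facts: list[dict]
    sentiment: str
    perspective: str
    score: int
    retries: int
    status: str


def run_sentiment_sdk(state: MyState) -> dict:
    return {"sentiment": "neutral"}  # stub node body


def run_fact_check(state: MyState) -> dict:
    return {"facts": []}  # stub node body


def store_and_send(state: MyState) -> dict:
    return {"status": "success"}  # stub node body


graph = StateGraph(MyState)
graph.add_node("sentiment_analysis", run_sentiment_sdk)
graph.add_node("fact_check", run_fact_check)
graph.add_node("store_and_send", store_and_send)

graph.set_entry_point("sentiment_analysis")
graph.add_edge("sentiment_analysis", "fact_check")
graph.add_edge("fact_check", "store_and_send")
graph.add_edge("store_and_send", "__end__")

app = graph.compile()
result = app.invoke({"cleaned_text": "Example article text.", "facts": [],
                     "sentiment": "", "perspective": "", "score": 0,
                     "retries": 0, "status": ""})
```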
📜 Recent review details

Configuration used: CodeRabbit UI

📒 Files selected for processing (3)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (2)
Actionable comments posted: 18
🧹 Nitpick comments (12)
new-backend/app/modules/langgraph_nodes/sentiment.py (1)
35-35: Consider increasing max_tokens for reliability.

While 3 tokens works for single-word responses, it might be too restrictive if the model occasionally includes punctuation or formatting. Consider using 5-10 tokens for better reliability.

- max_tokens=3,
+ max_tokens=5,

new-backend/app/utils/prompt_templates.py (1)
4-5: Fix string formatting issue.

There's an unnecessary line break in the string that creates awkward formatting.

-You are an AI assistant that generates a well-reasoned '
-'counter-perspective to a given article.
+You are an AI assistant that generates a well-reasoned counter-perspective to a given article.

new-backend/app/modules/langgraph_nodes/store_and_send.py (1)
8-8: Remove debug print statement.

The debug print statement should be removed from production code as it may expose sensitive state information.
- print(state)new-backend/app/modules/scraper/cleaner.py (1)
new-backend/app/modules/scraper/cleaner.py (1)

2-2: Consider if NLTK import is actually needed.

The NLTK library is imported but not used anywhere in the current clean_extracted_text function, which only uses regex operations. If NLTK functionality is planned for future use, consider adding a comment explaining the intended usage.

new-backend/app/modules/vector_store/chunk_rag_data.py (1)
27-27: Fix typo in metadata key.

The key "explaination" should be "explanation" for correct spelling.

- "explaination": fact["explaination"],
+ "explanation": fact["explanation"],

new-backend/app/modules/langgraph_nodes/fact_check.py (1)
6-10: Consider removing redundant validation.

The function validates cleaned_text presence but then passes the entire state to run_fact_check_pipeline. Since the pipeline should handle missing text internally, this validation might be redundant. Consider either removing this check or ensuring the pipeline actually requires this validation:

 def run_fact_check(state):
     try:
-        text = state.get("cleaned_text")
-
-        if not text:
-            raise ValueError("Missing or empty 'cleaned_text' in state")
-
         verifications, error_message = run_fact_check_pipeline(state)

new-backend/app/modules/langgraph_nodes/judge.py (1)
39-43: Consider validating score range before clamping.

The current approach silently clamps out-of-range values, which might hide unexpected LLM outputs. Consider explicit validation for better error visibility.

 m = re.search(r"\b(\d{1,3})\b", raw)
 if not m:
     raise ValueError(f"Couldn't parse a score from: '{raw}'")
-score = max(0, min(100, int(m.group(1))))
+score = int(m.group(1))
+if not 0 <= score <= 100:
+    print(f"Warning: Score {score} outside expected range [0-100], clamping.")
+    score = max(0, min(100, score))

new-backend/app/modules/langgraph_builder.py (1)
52-54: Remove unnecessary trailing comma.

 graph.set_entry_point(
-    "sentiment_analysis",
+    "sentiment_analysis"
 )

new-backend/app/modules/langgraph_nodes/generate_perspective.py (2)
6-6: Remove redundant variable assignment.

The intermediate prompt variable is unnecessary since generation_prompt is only used once.

-prompt = generation_prompt
-# ... other code ...
-chain = prompt | structured_llm
+chain = generation_prompt | structured_llm

Also applies to: 24-24
50-50: Fix typo in error message.

- print(f"some error occured in generate_perspective:{e}")
+ print(f"Error occurred in generate_perspective: {e}")

new-backend/app/modules/facts_check/llm_processing.py (2)
107-109: Improve markdown code block parsing to handle case variations.

The current regex only handles lowercase "json" and might miss valid markdown blocks.

-# Strip markdown code blocks if present
-content = re.sub(r"^```json|```$", "", content).strip()
+# Strip markdown code blocks if present (case-insensitive)
+content = re.sub(r"^```(?:json|JSON)?|```$", "", content, flags=re.IGNORECASE).strip()
52-52: Maintain consistency in error logging format.

The codebase uses different styles for error messages - some with emojis (❌, 🔥) and some without. Consider standardizing the format.

-print(f"Error in claim_extraction: {e}")
+print(f"❌ Error in claim_extraction: {e}")
 # ... and ...
-print(f"🔥 Error in fact_verification: {e}")
+print(f"❌ Error in fact_verification: {e}")

Also applies to: 115-115, 132-132
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
new-backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (14)
- new-backend/app/modules/facts_check/llm_processing.py (1 hunks)
- new-backend/app/modules/facts_check/web_search.py (1 hunks)
- new-backend/app/modules/langgraph_builder.py (5 hunks)
- new-backend/app/modules/langgraph_nodes/fact_check.py (2 hunks)
- new-backend/app/modules/langgraph_nodes/generate_perspective.py (2 hunks)
- new-backend/app/modules/langgraph_nodes/judge.py (1 hunks)
- new-backend/app/modules/langgraph_nodes/sentiment.py (1 hunks)
- new-backend/app/modules/langgraph_nodes/store_and_send.py (1 hunks)
- new-backend/app/modules/scraper/cleaner.py (1 hunks)
- new-backend/app/modules/vector_store/chunk_rag_data.py (1 hunks)
- new-backend/app/utils/fact_check_utils.py (1 hunks)
- new-backend/app/utils/generate_chunk_id.py (1 hunks)
- new-backend/app/utils/prompt_templates.py (1 hunks)
- new-backend/pyproject.toml (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (4)
new-backend/app/modules/langgraph_nodes/store_and_send.py (1)
new-backend/app/modules/vector_store/chunk_rag_data.py (1)
chunk_rag_data(4-34)
new-backend/app/modules/langgraph_nodes/fact_check.py (1)
new-backend/app/utils/fact_check_utils.py (1)
run_fact_check_pipeline(8-38)
new-backend/app/modules/vector_store/chunk_rag_data.py (1)
new-backend/app/utils/generate_chunk_id.py (1)
generate_id(4-7)
new-backend/app/modules/langgraph_builder.py (2)
new-backend/app/modules/langgraph_nodes/sentiment.py (1)
run_sentiment_sdk(10-53)
new-backend/app/modules/langgraph_nodes/error_handler.py (1)
error_handler(3-11)
🪛 Ruff (0.11.9)
new-backend/app/modules/langgraph_nodes/store_and_send.py
9-9: Local variable chunks is assigned to but never used
Remove assignment to unused variable chunks
(F841)
🪛 Flake8 (7.2.0)
new-backend/app/modules/langgraph_nodes/store_and_send.py
[error] 9-9: local variable 'chunks' is assigned to but never used
(F841)
🔇 Additional comments (7)
new-backend/pyproject.toml (1)
10-10: All added dependencies are up to date and free of known security issues

- duckduckgo-search (>=8.0.4)
  • Latest upstream: 2025.4.4 (released 2025-04-04)
  • No critical/high/medium vulnerabilities reported
  • Note: the package was removed from PyPI in April 2025 (non-security compliance/hardening reasons); verify your install source or consider vendoring if that poses an issue
- langchain-community (>=0.2.0)
  • Latest upstream: 0.2.19
  • No public advisories found
- langchain-groq (>=0.1.0)
  • Latest upstream: 0.3.4
  • No known vulnerabilities
- nltk (>=3.5)
  • Latest upstream: 3.9.1
  • No known security issues; earlier 3.8.x releases have been yanked

No further action required unless you need to pin to exact versions or address the PyPI removal of duckduckgo-search.
new-backend/app/modules/langgraph_nodes/sentiment.py (1)
39-39: No uppercase sentiment checks detected; lowercase conversion is safe

A search for downstream usage of "Positive", "Negative", or "Neutral" found only:

- The prompt in sentiment.py ("Positive, Negative, or Neutral.")
- The passthrough in generate_perspective.py: "sentiment": state.get("sentiment", "neutral")
- A JSON config using "baseColor": "neutral"

No code compares against uppercase sentiment values. Lowercasing won't break existing consumers.
new-backend/app/utils/prompt_templates.py (1)
3-32: Well-structured prompt template with clear instructions.

The prompt template is well-designed with clear instructions, appropriate placeholders, and structured JSON output format. This should work effectively for generating counter-perspectives. A sketch of this general shape follows below.
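For illustration, a template of this general shape might look like the following sketch; the actual wording and placeholder names in prompt_templates.py differ.

```python
# Illustrative only: the real template in prompt_templates.py differs in
# wording; placeholder names here are assumptions.
from langchain_core.prompts import ChatPromptTemplate

generation_prompt = ChatPromptTemplate.from_template(
    "You are an AI assistant that generates a well-reasoned "
    "counter-perspective to a given article.\n\n"
    "Article:\n{cleaned_text}\n\n"
    "Verified facts:\n{facts}\n\n"
    "Detected sentiment: {sentiment}\n\n"
    'Respond ONLY with JSON: {{"perspective": "<counter-perspective>"}}'
)
```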
new-backend/app/modules/langgraph_nodes/store_and_send.py (1)
21-24: State structure matches frontend expectations
The MyState TypedDict in langgraph_builder.py defines exactly these keys:
- cleaned_text
- facts
- sentiment
- perspective
- score
- retries
- status
The store_and_send node returns {**state, "status": "success"}, preserving all existing fields and setting status correctly. No additional fields are required.

new-backend/app/modules/langgraph_nodes/fact_check.py (1)
11-20: Good integration with the new fact-checking pipeline.

The error handling structure is well-implemented and maintains consistency with the existing error response format.
new-backend/app/modules/langgraph_builder.py (2)
29-30: Method updates look good.

The changes to use run_sentiment_sdk, explicit error_handler.error_handler, add_conditional_edges, and "__end__" align with the updated API and improve code clarity.

Also applies to: 49-50, 56-61, 100-101
14-22: No type mismatch for perspective — keep as str

Verification shows that across the entire codebase:

- The only assignments to perspective are string operations (e.g. in backend/app/services/counter_service.py).
- There are no attribute-style accesses like obj.perspective.
- No PerspectiveOutput class or generate_perspective function returning an object was found.

Leave the TypedDict definition as-is:

 class MyState(TypedDict):
     cleaned_text: str
     facts: list[dict]
     sentiment: str
-    perspective: dict  # proposed change—remove
+    perspective: str
     score: int
     retries: int
     status: str

Likely an incorrect or invalid review comment.
# from app.utils.vector_store import save_to_vector_db
from app.modules.vector_store.chunk_rag_data import chunk_rag_data
💡 Verification agent
🧩 Analysis chain
Verify import path for chunk_rag_data function.
Ensure the import path is correct and the function exists at the specified location.
🏁 Script executed:
#!/bin/bash
# Verify the chunk_rag_data function exists at the imported path
fd -t f chunk_rag_data.py --exec cat {} \;

Length of output: 960
Integrate chunked data storage and clean up debug code
The import for chunk_rag_data is correct—no change needed there. The function is currently incomplete and contains leftover debug code:
• Re-enable and use your vector-store save function
- # from app.utils.vector_store import save_to_vector_db
+ from app.utils.vector_store import save_to_vector_db
  from app.modules.vector_store.chunk_rag_data import chunk_rag_data

• Remove the debug print and actually store the chunks
- chunks = chunk_rag_data(data)
- print("Chunks:", chunks)
+ chunks = chunk_rag_data(data)
+ save_to_vector_db(chunks)

• Add error handling around both chunking and storage calls to surface failures
• Ensure any unused variables are removed once storage is implemented
File to update:
new-backend/app/modules/langgraph_nodes/store_and_send.py
📝 Committable suggestion
# File: new-backend/app/modules/langgraph_nodes/store_and_send.py
from app.utils.vector_store import save_to_vector_db
from app.modules.vector_store.chunk_rag_data import chunk_rag_data

def store_and_send(data):
    # … any pre-processing …
    chunks = chunk_rag_data(data)
    save_to_vector_db(chunks)
    # … any post-processing …
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/store_and_send.py at the beginning of
the file, re-enable the import for the vector-store save function by
uncommenting it. In the function where chunk_rag_data is used, remove any debug
print statements and replace them with calls to the vector-store save function
to store the chunked data. Add try-except blocks around both the chunking and
storage operations to catch and log any errors that occur. Finally, remove any
variables that become unused after implementing the storage logic to keep the
code clean.
try:
    nltk.data.find('corpora/stopwords')
    nltk.data.find('corpora/punkt_tab')

except LookupError:
    nltk.download('stopwords')
    nltk.download('punkt_tab')
🛠️ Refactor suggestion
Move NLTK data initialization to avoid import-time delays.
Downloading NLTK data at module import time can significantly slow application startup. Consider moving this logic to a dedicated initialization function or lazy-loading approach.
-try:
-    nltk.data.find('corpora/stopwords')
-    nltk.data.find('corpora/punkt_tab')
-
-except LookupError:
-    nltk.download('stopwords')
-    nltk.download('punkt_tab')
+def ensure_nltk_data():
+    """Ensure required NLTK data is available, downloading if necessary."""
+    try:
+        nltk.data.find('corpora/stopwords')
+        nltk.data.find('corpora/punkt_tab')
+    except LookupError:
+        nltk.download('stopwords')
+        nltk.download('punkt_tab')

Then call ensure_nltk_data() when NLTK functionality is actually needed.
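For example, a hypothetical call site (this helper is not part of the PR) could defer the download to first use:

```python
# Hypothetical call site: tokenize_clean_text is not in the PR; it only
# illustrates deferring the corpus download to first real use.
from nltk.tokenize import word_tokenize


def tokenize_clean_text(text: str) -> list[str]:
    ensure_nltk_data()  # downloads corpora only if they are missing
    return word_tokenize(text)
```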
🤖 Prompt for AI Agents
In new-backend/app/modules/scraper/cleaner.py around lines 4 to 10, the NLTK
data check and download happen at import time, causing startup delays. Refactor
this code by moving the NLTK data initialization into a separate function, such
as ensure_nltk_data(), that performs the check and downloads if necessary. Then,
call this function only when NLTK functionality is required, avoiding delays
during module import.
| prompt = f""" | ||
| You are an expert evaluator. Please rate the following counter-perspective | ||
| on originality, reasoning quality, and factual grounding. Provide ONLY | ||
| a single integer score from 0 (very poor) to 100 (excellent). | ||
|
|
||
| === Perspective to score === | ||
| {text} | ||
| """ |
💡 Verification agent
🧩 Analysis chain
Clarify whether evaluating "perspective" or "counter-perspective".
The prompt mentions "counter-perspective" but the code extracts and processes "perspective". This inconsistency could affect the scoring accuracy.
🏁 Script executed:
#!/bin/bash
# Check how perspective is used across the codebase
rg -A 3 -B 3 "counter-perspective|perspective" --type py

Length of output: 20874
Align “perspective” terminology in judge prompt
There’s a mismatch between the code’s perspective attribute (which holds the generated counter-perspective) and the prompt wording. To avoid confusion, update the wording in judge.py (or rename the field) so they match.
• File: new-backend/app/modules/langgraph_nodes/judge.py
• Change the prompt wording on line 21 from "counter-perspective" to "perspective" (or, vice versa, rename the attribute to counter_perspective across your models).
• Example diff:
prompt = f"""
-You are an expert evaluator. Please rate the following counter-perspective
+You are an expert evaluator. Please rate the following perspective
on originality, reasoning quality, and factual grounding. Provide ONLY
a single integer score from 0 (very poor) to 100 (excellent).
=== Perspective to score ===
{text}
"""• (Optional) For full consistency, consider renaming the PerspectiveOutput.perspective field to counter_perspective in generate_perspective.py and update any downstream references (e.g. vector-store metadata).
📝 Committable suggestion
| prompt = f""" | |
| You are an expert evaluator. Please rate the following counter-perspective | |
| on originality, reasoning quality, and factual grounding. Provide ONLY | |
| a single integer score from 0 (very poor) to 100 (excellent). | |
| === Perspective to score === | |
| {text} | |
| """ | |
| prompt = f""" | |
| You are an expert evaluator. Please rate the following perspective | |
| on originality, reasoning quality, and factual grounding. Provide ONLY | |
| a single integer score from 0 (very poor) to 100 (excellent). | |
| === Perspective to score === | |
| {text} | |
| """ |
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/judge.py around lines 20 to 27, the
prompt text uses "counter-perspective" while the code attribute is named
"perspective," causing inconsistency. To fix this, update the prompt wording on
line 21 to replace "counter-perspective" with "perspective" so the terminology
matches. Optionally, for full consistency, rename the attribute in related files
like generate_perspective.py and update all references accordingly.
facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
                       "Verdict: {f['verdict']}\nExplanation: "
                       "{f['explanation']}" for f in state["facts"]])
Fix f-string formatting bug.
Lines 41-42 are not f-strings but contain {} placeholders. This will result in literal {f['verdict']} text instead of the actual values.
-facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
-                       "Verdict: {f['verdict']}\nExplanation: "
-                       "{f['explanation']}" for f in state["facts"]])
+facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
+                       f"Verdict: {f['verdict']}\nExplanation: "
+                       f"{f['explanation']}" for f in state["facts"]])

📝 Committable suggestion
facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
                       f"Verdict: {f['verdict']}\nExplanation: "
                       f"{f['explanation']}" for f in state["facts"]])
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/generate_perspective.py around lines
40 to 42, the string concatenation uses curly braces for variable placeholders
but the strings are not marked as f-strings, causing the placeholders to be
treated as literal text. Fix this by adding the 'f' prefix to all strings
containing placeholders so that the expressions inside curly braces are properly
evaluated and formatted.
load_dotenv()

client = Groq(api_key=os.getenv("GROQ_API_KEY"))
🛠️ Refactor suggestion
Add validation for required API key.
The code should verify that the API key exists to fail fast with a clear error message.
 load_dotenv()

-client = Groq(api_key=os.getenv("GROQ_API_KEY"))
+api_key = os.getenv("GROQ_API_KEY")
+if not api_key:
+    raise ValueError("GROQ_API_KEY environment variable is not set")
+client = Groq(api_key=api_key)

📝 Committable suggestion
load_dotenv()

api_key = os.getenv("GROQ_API_KEY")
if not api_key:
    raise ValueError("GROQ_API_KEY environment variable is not set")
client = Groq(api_key=api_key)
🤖 Prompt for AI Agents
In new-backend/app/modules/facts_check/llm_processing.py around lines 7 to 9,
the code initializes the Groq client with an API key from the environment
without checking if the key exists. Add a validation step after loading the
environment variables to check if the "GROQ_API_KEY" is present. If it is
missing, raise an exception or exit with a clear error message indicating the
missing API key to ensure the application fails fast and clearly.
return {
    "claim": claim,
    "verifications": results_list,
    "status": "success",
}
Remove incorrect "claim" field from return value.
The function returns "claim": claim which uses the last claim from the loop. This is incorrect when processing multiple claims. The function already returns all verifications in the list.
 return {
-    "claim": claim,
     "verifications": results_list,
     "status": "success",
 }

📝 Committable suggestion
return {
    "verifications": results_list,
    "status": "success",
}
🤖 Prompt for AI Agents
In new-backend/app/modules/facts_check/llm_processing.py around lines 125 to
129, remove the "claim" field from the returned dictionary because it
incorrectly uses only the last claim from the loop. Instead, return only the
"verifications" list and the "status" to accurately represent all processed
claims.
Actionable comments posted: 1
♻️ Duplicate comments (1)
new-backend/app/modules/langgraph_nodes/store_and_send.py (1)
9-17: Complete the implementation - chunks are created but not used.

The function creates chunks but doesn't store them in Pinecone as mentioned in the PR objectives. This is the same issue flagged in previous reviews and remains unresolved.
Based on the PR objectives, this function should:
- Generate embeddings for each chunk
- Store embeddings in Pinecone vector database
- Use the chunks variable that's currently unused
The current implementation is incomplete and doesn't fulfill the PR requirements.
🧹 Nitpick comments (2)
new-backend/app/modules/langgraph_nodes/store_and_send.py (2)
8-8: Remove debug print statement.

The debug print statement should be removed from production code.
- print(state)
12-14: Improve exception handling with proper chaining.

Use raise ... from err to preserve the original exception context for better debugging.

 except KeyError as e:
-    raise Exception(f"Missing required data field for chunking: {e}")
+    raise Exception(f"Missing required data field for chunking: {e}") from e
 except Exception as e:
-    raise Exception(f"Failed to chunk data: {e}")
+    raise Exception(f"Failed to chunk data: {e}") from e
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- new-backend/app/modules/langgraph_nodes/store_and_send.py (1 hunks)
- new-backend/app/modules/vector_store/chunk_rag_data.py (1 hunks)
- new-backend/app/utils/generate_chunk_id.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- new-backend/app/modules/vector_store/chunk_rag_data.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
new-backend/app/modules/langgraph_nodes/store_and_send.py (1)
new-backend/app/modules/vector_store/chunk_rag_data.py (1)
chunk_rag_data(4-65)
🪛 Ruff (0.11.9)
new-backend/app/modules/langgraph_nodes/store_and_send.py
10-10: Local variable chunks is assigned to but never used
Remove assignment to unused variable chunks
(F841)
12-12: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
14-14: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🪛 Flake8 (7.2.0)
new-backend/app/modules/langgraph_nodes/store_and_send.py
[error] 10-10: local variable 'chunks' is assigned to but never used
(F841)
Actionable comments posted: 2
♻️ Duplicate comments (1)
new-backend/app/modules/langgraph_nodes/store_and_send.py (1)
1-3: Complete the implementation - vector storage still missing.

The save_to_vector_db import remains commented out while chunking and embedding have been added. This confirms the incomplete implementation identified in previous reviews.
🧹 Nitpick comments (1)
new-backend/app/modules/langgraph_nodes/store_and_send.py (1)
5-5: Fix formatting: add blank line before function definition.

PEP 8 requires two blank lines before top-level function definitions.
+ def store_and_send(state):
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
new-backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
- new-backend/app/modules/langgraph_nodes/store_and_send.py (1 hunks)
- new-backend/app/modules/vector_store/chunk_rag_data.py (1 hunks)
- new-backend/app/modules/vector_store/embed.py (1 hunks)
- new-backend/app/utils/generate_chunk_id.py (1 hunks)
- new-backend/pyproject.toml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- new-backend/app/modules/vector_store/chunk_rag_data.py
- new-backend/app/utils/generate_chunk_id.py
- new-backend/pyproject.toml
🧰 Additional context used
🧬 Code Graph Analysis (1)
new-backend/app/modules/langgraph_nodes/store_and_send.py (2)
new-backend/app/modules/vector_store/chunk_rag_data.py (1)
chunk_rag_data(4-73)
new-backend/app/modules/vector_store/embed.py (1)
embed_chunks(7-30)
🪛 Flake8 (7.2.0)
new-backend/app/modules/langgraph_nodes/store_and_send.py
[error] 5-5: expected 2 blank lines, found 1
(E302)
🪛 Ruff (0.11.9)
new-backend/app/modules/langgraph_nodes/store_and_send.py
12-12: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
14-14: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
20-20: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🔇 Additional comments (3)
new-backend/app/modules/vector_store/embed.py (3)
1-4: Good choice of embedding model and efficient global instance.

The SentenceTransformer model "all-MiniLM-L6-v2" is a well-balanced choice for general text embedding tasks, offering good performance with reasonable computational requirements. Using a global instance is efficient and avoids reloading the model on each function call.

7-10: Well-designed function signature and early return optimization.

The function has clear type hints and handles the empty input case efficiently with an early return.

13-30: Robust validation and efficient embedding generation.

The validation logic provides clear error messages with specific indices, making debugging easier. The embedding generation efficiently processes all texts at once, and the output format with "id", "values", and "metadata" is well-structured for vector database storage. A sketch of this shape follows below.
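Pieced together from these notes, embed_chunks plausibly has roughly this shape; this is a sketch reconstructed from the review comments, not the actual file contents:

```python
# Sketch reconstructed from the review notes above; the real embed.py may differ.
from sentence_transformers import SentenceTransformer

# Global instance so the model is loaded once, not per call.
model = SentenceTransformer("all-MiniLM-L6-v2")


def embed_chunks(chunks: list[dict]) -> list[dict]:
    """Embed chunk texts and return Pinecone-ready vectors."""
    if not chunks:
        return []

    # Validate with specific indices for easier debugging.
    for i, chunk in enumerate(chunks):
        if not chunk.get("text"):
            raise ValueError(f"Chunk at index {i} is missing 'text'")

    # Encode all texts in one batch for efficiency.
    texts = [chunk["text"] for chunk in chunks]
    embeddings = model.encode(texts)

    return [
        {
            "id": chunk["id"],
            "values": embedding.tolist(),
            "metadata": chunk.get("metadata", {}),
        }
        for chunk, embedding in zip(chunks, embeddings)
    ]
```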
print(state)
try:
    chunks = chunk_rag_data(state)
except KeyError as e:
    raise Exception(f"Missing required data field for chunking: {e}")
except Exception as e:
    raise Exception(f"Failed to chunk data: {e}")
🛠️ Refactor suggestion
Remove debug code and improve exception handling.
The print statement should be removed for production code. Additionally, exception handling should use proper chaining to preserve the original error context.
-print(state)
 try:
     chunks = chunk_rag_data(state)
 except KeyError as e:
-    raise Exception(f"Missing required data field for chunking: {e}")
+    raise Exception(f"Missing required data field for chunking: {e}") from e
 except Exception as e:
-    raise Exception(f"Failed to chunk data: {e}")
+    raise Exception(f"Failed to chunk data: {e}") from e

📝 Committable suggestion
try:
    chunks = chunk_rag_data(state)
except KeyError as e:
    raise Exception(f"Missing required data field for chunking: {e}") from e
except Exception as e:
    raise Exception(f"Failed to chunk data: {e}") from e
🧰 Tools
🪛 Ruff (0.11.9)
12-12: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
14-14: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/store_and_send.py around lines 8 to
14, remove the print(state) debug statement and update the exception handling to
use proper exception chaining by adding "from e" when re-raising exceptions,
preserving the original error context.
try:
    vectors = embed_chunks(chunks)
    if vectors:
        print("embedding generated successfully!")
except Exception as e:
    raise Exception(f"failed to embed chunks: {e}")
🛠️ Refactor suggestion
Improve exception handling consistency.
Apply the same exception chaining improvement here as recommended for the chunking section.
 try:
     vectors = embed_chunks(chunks)
     if vectors:
         print("embedding generated successfully!")
 except Exception as e:
-    raise Exception(f"failed to embed chunks: {e}")
+    raise Exception(f"failed to embed chunks: {e}") from e

📝 Committable suggestion
try:
    vectors = embed_chunks(chunks)
    if vectors:
        print("embedding generated successfully!")
except Exception as e:
    raise Exception(f"failed to embed chunks: {e}") from e
🧰 Tools
🪛 Ruff (0.11.9)
20-20: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/store_and_send.py around lines 15 to
20, the exception handling raises a new Exception without preserving the
original traceback. Modify the except block to use "raise Exception(f'failed to
embed chunks: {e}') from e" to maintain exception chaining and improve
debugging.
…he pinecone with metadata and original text
Actionable comments posted: 4
♻️ Duplicate comments (5)
new-backend/app/modules/facts_check/llm_processing.py (2)
7-9: Add validation for required API key.

The code should verify that the API key exists to fail fast with a clear error message.

 load_dotenv()
-client = Groq(api_key=os.getenv("GROQ_API_KEY"))
+api_key = os.getenv("GROQ_API_KEY")
+if not api_key:
+    raise ValueError("GROQ_API_KEY environment variable is not set")
+client = Groq(api_key=api_key)
load_dotenv() -client = Groq(api_key=os.getenv("GROQ_API_KEY")) +api_key = os.getenv("GROQ_API_KEY") +if not api_key: + raise ValueError("GROQ_API_KEY environment variable is not set") +client = Groq(api_key=api_key)
119-123: Remove incorrect "claim" field from return value.The function returns
"claim": claimwhich uses the last claim from the loop. This is incorrect when processing multiple claims. The function already returns all verifications in the list.return { - "claim": claim, "verifications": results_list, "status": "success", }new-backend/app/modules/langgraph_nodes/store_and_send.py (3)
9-9: Remove debug print statement.Debug print statements should be removed from production code or replaced with proper logging.
Apply this diff to remove the debug code:
- print(state)
10-15: Improve exception chaining for better error traceability.The exception handling should preserve the original error context as recommended by the static analysis tool.
Apply this diff to improve exception chaining:
except KeyError as e: - raise Exception(f"Missing required data field for chunking: {e}") + raise Exception(f"Missing required data field for chunking: {e}") from e except Exception as e: - raise Exception(f"Failed to chunk data: {e}") + raise Exception(f"Failed to chunk data: {e}") from e
16-21: Remove debug print and improve exception chaining.The debug print statement should be removed and exception chaining should be improved.
Apply this diff to address both issues:
if vectors: - print("embedding generated successfully!") except Exception as e: - raise Exception(f"failed to embed chunks: {e}") + raise Exception(f"failed to embed chunks: {e}") from e
🧹 Nitpick comments (3)
new-backend/app/modules/facts_check/llm_processing.py (2)
78-80: Fix missing space in system prompt.The system prompt has a missing space that makes it grammatically incorrect.
- "Your job is to determine whether the given" - " claim is True, False" - "based on the provided web search evidence." + "Your job is to determine whether the given " + "claim is True or False " + "based on the provided web search evidence."
107-109: Consider improving markdown stripping logic.The current regex pattern may not handle all markdown variations. Consider using a more robust pattern or a dedicated markdown parser.
-# Strip markdown code blocks if present - content = re.sub(r"^```json|```$", "", content).strip() + # Strip markdown code blocks if present + content = re.sub(r"^```(?:json)?\s*|```\s*$", "", content, flags=re.MULTILINE).strip()new-backend/app/modules/langgraph_nodes/store_and_send.py (1)
23-24: Complete implementation achieved - remove debug print.

Great work completing the vector storage implementation! The function now properly stores vectors in Pinecone as required by the PR objectives. However, the debug print should be removed.
Apply this diff to remove the debug print:
 store(vectors)
-print("Vectors saved to Pinecone!")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
new-backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (7)
- new-backend/app/db/vector_store.py (1 hunks)
- new-backend/app/modules/facts_check/llm_processing.py (1 hunks)
- new-backend/app/modules/facts_check/web_search.py (1 hunks)
- new-backend/app/modules/langgraph_nodes/store_and_send.py (1 hunks)
- new-backend/app/utils/fact_check_utils.py (1 hunks)
- new-backend/app/utils/store_vectors.py (1 hunks)
- new-backend/pyproject.toml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- new-backend/app/modules/facts_check/web_search.py
- new-backend/pyproject.toml
- new-backend/app/utils/fact_check_utils.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
new-backend/app/modules/langgraph_nodes/store_and_send.py (3)
new-backend/app/modules/vector_store/chunk_rag_data.py (1)
chunk_rag_data(4-73)
new-backend/app/modules/vector_store/embed.py (1)
embed_chunks(7-30)
new-backend/app/utils/store_vectors.py (1)
store(10-32)
🪛 Ruff (0.11.9)
new-backend/app/utils/store_vectors.py
32-32: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
new-backend/app/modules/langgraph_nodes/store_and_send.py
13-13: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
15-15: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
21-21: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
new-backend/app/db/vector_store.py
14-14: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
40-41: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🔇 Additional comments (8)
new-backend/app/modules/facts_check/llm_processing.py (1)
12-58: LGTM: Claim extraction function is well-implemented.

The function properly handles state input validation, error handling, and API interaction. The prompt structure is clear and the return format correctly augments the original state. A hedged sketch of this call pattern follows below.
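As context for readers, the call pattern being approved can be sketched roughly as follows; the model name, prompt wording, and the "claims" key are assumptions rather than the file's actual values:

```python
# Hedged sketch of the claim-extraction call pattern described above; model
# name, prompt wording, and the "claims" key are assumptions.
import json
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])


def claim_extraction(state: dict) -> dict:
    text = state.get("cleaned_text")
    if not text:
        raise ValueError("Missing or empty 'cleaned_text' in state")

    try:
        response = client.chat.completions.create(
            model="llama-3.1-8b-instant",  # assumed model
            messages=[
                {"role": "system",
                 "content": "Extract the factual claims from the article "
                            "as a JSON list of strings."},
                {"role": "user", "content": text},
            ],
        )
        claims = json.loads(response.choices[0].message.content)
    except Exception as e:
        print(f"Error in claim_extraction: {e}")
        claims = []

    # Augment the original state rather than replacing it.
    return {**state, "claims": claims}
```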
new-backend/app/db/vector_store.py (3)
5-7: LGTM - Proper environment variable validation.

The environment variable handling correctly validates the presence of the required API key and raises an appropriate error if missing.
16-19: LGTM - Appropriate constants for vector configuration.

The constants for index name, dimensions (384), and cosine metric are well-defined and align with the embedding model specifications mentioned in the codebase.
22-34: LGTM - Proper index creation with serverless specification.

The index creation logic correctly checks for existence before creating and uses appropriate serverless specifications for AWS US East 1. A sketch of this pattern follows below.
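For reference, the approved pattern looks roughly like this sketch using the pinecone v3+ client; the index name is an assumption, while the 384-dimension and cosine settings match the constants described above:

```python
# Sketch of the index-creation pattern being praised; INDEX_NAME is an
# assumed value, not necessarily the constant used in vector_store.py.
import os

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

INDEX_NAME = "rag-index"  # assumed value
DIMENSIONS = 384          # matches all-MiniLM-L6-v2 output size
METRIC = "cosine"

# Check for existence before creating, as the review describes.
if INDEX_NAME not in pc.list_indexes().names():
    pc.create_index(
        name=INDEX_NAME,
        dimension=DIMENSIONS,
        metric=METRIC,
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
```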
new-backend/app/utils/store_vectors.py (2)
10-21: LGTM - Well-documented function with clear interface.

The function signature, documentation, and parameter descriptions are well-structured and clearly communicate the expected behavior and exceptions.
22-23: LGTM - Proper input validation.

The validation correctly checks for empty vectors list and raises an appropriate ValueError.
new-backend/app/modules/langgraph_nodes/store_and_send.py (2)
1-3: LGTM - Proper imports for the vector storage pipeline.

The imports correctly reference the chunking, embedding, and storage functions needed for the complete pipeline implementation.
6-37: Excellent implementation of the complete vector storage pipeline.

The function now successfully implements the full pipeline described in the PR objectives:

- ✅ Chunks the cleaned_text using chunk_rag_data
- ✅ Generates embeddings for each chunk using embed_chunks
- ✅ Stores embeddings in Pinecone using the store function
- ✅ Returns the updated state for frontend sync

This addresses the previously incomplete implementation and fulfills all the PR requirements. The error handling covers each step appropriately, and the function integrates well with the LangGraph pipeline.
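That pipeline shape, with the review's recommended fixes applied (no debug prints, chained exceptions), can be sketched as follows; details of the real store_and_send may differ:

```python
# Sketch of the completed pipeline summarized above, with the review's
# recommended fixes applied; the real store_and_send may differ in details.
from app.modules.vector_store.chunk_rag_data import chunk_rag_data
from app.modules.vector_store.embed import embed_chunks
from app.utils.store_vectors import store


def store_and_send(state):
    try:
        chunks = chunk_rag_data(state)
    except KeyError as e:
        raise Exception(f"Missing required data field for chunking: {e}") from e
    except Exception as e:
        raise Exception(f"Failed to chunk data: {e}") from e

    try:
        vectors = embed_chunks(chunks)
    except Exception as e:
        raise Exception(f"failed to embed chunks: {e}") from e

    try:
        store(vectors)
    except Exception as e:
        raise Exception(f"failed to store vectors: {e}") from e

    # Preserve existing fields and mark success for the frontend sync.
    return {**state, "status": "success"}
```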
try:
    parsed = json.loads(content)
except Exception as parse_err:
    print(f"❌ LLM JSON parse error: {parse_err}")

results_list.append(parsed)
Fix undefined variable error in JSON parsing.
When JSON parsing fails, the parsed variable is not defined, but line 117 still tries to append it to results_list, which will cause a NameError.
# Try parsing the JSON response
try:
parsed = json.loads(content)
except Exception as parse_err:
print(f"❌ LLM JSON parse error: {parse_err}")
+ parsed = {
+ "verdict": "Error",
+ "explanation": f"JSON parsing failed: {parse_err}",
+ "original_claim": claim,
+ "source_link": source
+ }
 results_list.append(parsed)

📝 Committable suggestion
# Try parsing the JSON response
try:
    parsed = json.loads(content)
except Exception as parse_err:
    print(f"❌ LLM JSON parse error: {parse_err}")
    parsed = {
        "verdict": "Error",
        "explanation": f"JSON parsing failed: {parse_err}",
        "original_claim": claim,
        "source_link": source
    }

results_list.append(parsed)
🤖 Prompt for AI Agents
In new-backend/app/modules/facts_check/llm_processing.py around lines 112 to
117, the variable 'parsed' is used outside the try-except block without
guaranteed initialization, causing a NameError if JSON parsing fails. To fix
this, initialize 'parsed' to a default value (e.g., None) before the try block
or only append 'parsed' to 'results_list' inside the try block after successful
parsing, ensuring it is defined before use.
try:

    # Initialize Pinecone client
    pc = Pinecone(api_key=PINECONE_API_KEY)

except Exception as e:
    raise RuntimeError(f"Error occured while intialising pinecone client:{e}")
Fix typo and improve exception chaining.
There's a typo in the error message and the exception handling should preserve the original error context.
Apply this diff to fix the issues:
- raise RuntimeError(f"Error occured while intialising pinecone client:{e}")
+ raise RuntimeError(f"Error occurred while initializing pinecone client: {e}") from e📝 Committable suggestion
try:
    # Initialize Pinecone client
    pc = Pinecone(api_key=PINECONE_API_KEY)
except Exception as e:
    raise RuntimeError(f"Error occurred while initializing pinecone client: {e}") from e
🧰 Tools
🪛 Ruff (0.11.9)
14-14: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
In new-backend/app/db/vector_store.py around lines 8 to 14, fix the typo in the
error message by changing "occured" to "occurred" and "intialising" to
"initialising". Also, improve exception handling by using "raise
RuntimeError(...) from e" to preserve the original exception context.
try:
    # Connect to the index
    index = pc.Index(INDEX_NAME)
except Exception as e:
    raise RuntimeError("Error occured while "
                       f"connecting to the index {INDEX_NAME}:{e}")
Fix typo and improve exception chaining.
There's a typo in the error message and the exception handling should preserve the original error context.
Apply this diff to fix the issues:
- raise RuntimeError("Error occured while "
+ raise RuntimeError("Error occurred while "
f"connecting to the index {INDEX_NAME}:{e}")
+ f"connecting to the index {INDEX_NAME}: {e}") from eCommittable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.11.9)
40-41: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
In new-backend/app/db/vector_store.py around lines 36 to 41, fix the typo in the
error message by changing "occured" to "occurred" and improve exception handling
by using "raise ... from e" to preserve the original exception context when
re-raising the RuntimeError.
try:
    index.upsert(vectors, namespace=namespace)
    logger.info(f"Successfully stored {len(vectors)} "
                f"vectors in namespace '{namespace}'")
except Exception as e:
    logger.error("Failed to store "
                 f"vectors in namespace '{namespace}': {e}")
    raise RuntimeError(f"Vector storage failed: {e}")
🛠️ Refactor suggestion
Improve exception chaining for better error traceability.
The exception handling should preserve the original error context as recommended by the static analysis tool.
Apply this diff to improve exception chaining:
- raise RuntimeError(f"Vector storage failed: {e}")
+ raise RuntimeError(f"Vector storage failed: {e}") from e📝 Committable suggestion
try:
    index.upsert(vectors, namespace=namespace)
    logger.info(f"Successfully stored {len(vectors)} "
                f"vectors in namespace '{namespace}'")
except Exception as e:
    logger.error("Failed to store "
                 f"vectors in namespace '{namespace}': {e}")
    raise RuntimeError(f"Vector storage failed: {e}") from e
🧰 Tools
🪛 Ruff (0.11.9)
32-32: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
In new-backend/app/utils/store_vectors.py around lines 25 to 32, the exception
handling catches errors but does not preserve the original traceback. Modify the
raise statement to use "raise RuntimeError(...) from e" to chain the original
exception, which improves error traceability and debugging.
Actionable comments posted: 4
🧹 Nitpick comments (1)
frontend/app/analyze/loading/page.tsx (1)
95-102: Optimize progress animation logic.

The current progress animation increments every 100ms regardless of actual progress, which may not align with the step-based progress calculation.
Apply this diff to improve progress synchronization:
-const progressInterval = setInterval(() => {
-  setProgress((prev) => {
-    if (prev < 100) {
-      return prev + 1
-    }
-    return prev
-  })
-}, 100)
+const progressInterval = setInterval(() => {
+  setProgress((prev) => {
+    const targetProgress = Math.min((currentStep + 1) * 20, 100)
+    if (prev < targetProgress) {
+      return Math.min(prev + 2, targetProgress)
+    }
+    return prev
+  })
+}, 100)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
frontend/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (3)
- frontend/app/analyze/loading/page.tsx (2 hunks)
- frontend/app/analyze/results/page.tsx (3 hunks)
- frontend/package.json (2 hunks)
✅ Files skipped from review due to trivial changes (1)
- frontend/package.json
🔇 Additional comments (2)
frontend/app/analyze/results/page.tsx (1)
23-23: LGTM! State initialization is correct.

The new state variable for analysis data is properly initialized.
frontend/app/analyze/loading/page.tsx (1)
9-9: LGTM! Axios import is appropriate.

The axios library is properly imported for making HTTP requests.
| <div className="p-4"> | ||
| <h1 className="text-2xl font-bold mb-4">Analysis Results</h1> | ||
| <pre className="bg-black p-4 rounded text-sm overflow-x-auto whitespace-pre-wrap"> | ||
| {JSON.stringify(analysisData, null, 2)} | ||
| </pre> | ||
| </div> |
🛠️ Refactor suggestion
Improve JSON display implementation and consider security implications.
The current implementation has several issues:
- The debug section is positioned above the header, breaking the layout
- Raw JSON display may expose sensitive information
- No handling for null/undefined data
- Poor user experience with raw JSON dump
Apply this diff to improve the implementation:
- <div className="p-4">
- <h1 className="text-2xl font-bold mb-4">Analysis Results</h1>
- <pre className="bg-black p-4 rounded text-sm overflow-x-auto whitespace-pre-wrap">
- {JSON.stringify(analysisData, null, 2)}
- </pre>
- </div>
+ {process.env.NODE_ENV === 'development' && analysisData && (
+ <div className="p-4 mt-16">
+ <details className="mb-4">
+ <summary className="text-sm font-medium cursor-pointer">Debug: Raw Analysis Data</summary>
+ <pre className="bg-gray-100 dark:bg-gray-800 p-4 rounded text-xs overflow-x-auto whitespace-pre-wrap mt-2">
+ {JSON.stringify(analysisData, null, 2)}
+ </pre>
+ </details>
+ </div>
+ )}Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In frontend/app/analyze/results/page.tsx around lines 78 to 83, the JSON display
is currently placed above the header, shows raw JSON which may expose sensitive
data, lacks handling for null or undefined analysisData, and results in poor
user experience. To fix this, move the JSON display below the header, add
conditional rendering to handle null or undefined analysisData gracefully,
sanitize or selectively display only non-sensitive parts of analysisData, and
replace the raw JSON dump with a more user-friendly formatted view or summary.
const storedData = sessionStorage.getItem("analysisResult")
if (storedData) {
  setAnalysisData(JSON.parse(storedData))
} else {
  // fallback if user visits results page directly
  // maybe redirect or show error
  console.warn("No analysis result found")
}
Add error handling for JSON parsing and improve user experience.
The current implementation lacks error handling for JSON parsing, which could crash the component if the stored data is malformed.
Apply this diff to add proper error handling:
- const storedData = sessionStorage.getItem("analysisResult")
- if (storedData) {
- setAnalysisData(JSON.parse(storedData))
- } else {
- // fallback if user visits results page directly
- // maybe redirect or show error
- console.warn("No analysis result found")
- }
+ const storedData = sessionStorage.getItem("analysisResult")
+ if (storedData) {
+ try {
+ setAnalysisData(JSON.parse(storedData))
+ } catch (error) {
+ console.error("Failed to parse analysis data:", error)
+ // Handle corrupted data
+ sessionStorage.removeItem("analysisResult")
+ router.push("/analyze")
+ }
+ } else {
+ // fallback if user visits results page directly
+ console.warn("No analysis result found")
+ router.push("/analyze")
+ }

📝 Committable suggestion
const storedData = sessionStorage.getItem("analysisResult")
if (storedData) {
  try {
    setAnalysisData(JSON.parse(storedData))
  } catch (error) {
    console.error("Failed to parse analysis data:", error)
    // Handle corrupted data
    sessionStorage.removeItem("analysisResult")
    router.push("/analyze")
  }
} else {
  // fallback if user visits results page directly
  console.warn("No analysis result found")
  router.push("/analyze")
}
🤖 Prompt for AI Agents
In frontend/app/analyze/results/page.tsx around lines 42 to 49, the JSON.parse
call on storedData lacks error handling, which can cause the component to crash
if the data is malformed. Wrap the JSON.parse call in a try-catch block to catch
parsing errors, log the error, and handle the failure gracefully by either
clearing the invalid data or showing an appropriate user message to improve user
experience.
```tsx
const res = await axios.post("http://localhost:8000/api/process", {
  url: storedUrl,
})
```
🛠️ Refactor suggestion
Add request timeout and better error handling.
The API call lacks timeout configuration and detailed error handling, which could lead to poor user experience.
Apply this diff to improve the API call:
```diff
-      const res = await axios.post("http://localhost:8000/api/process", {
-        url: storedUrl,
-      })
+      const res = await axios.post(`${apiUrl}/api/process`, {
+        url: storedUrl,
+      }, {
+        timeout: 60000, // 60 seconds timeout for analysis
+        headers: {
+          'Content-Type': 'application/json'
+        }
+      })
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```tsx
const res = await axios.post(`${apiUrl}/api/process`, {
  url: storedUrl,
}, {
  timeout: 60000, // 60 seconds timeout for analysis
  headers: {
    'Content-Type': 'application/json'
  }
})
```
🤖 Prompt for AI Agents
In frontend/app/analyze/loading/page.tsx around lines 64 to 66, the axios POST
request to the API lacks a timeout setting and detailed error handling. Add a
timeout option to the axios request configuration to prevent hanging requests,
and wrap the call in a try-catch block to handle errors gracefully. In the catch
block, provide meaningful error messages or fallback behavior to improve user
experience.
```tsx
const runAnalysis = async () => {
  const storedUrl = sessionStorage.getItem("articleUrl")
  if (storedUrl) {
    setArticleUrl(storedUrl)

    try {
      const res = await axios.post("http://localhost:8000/api/process", {
        url: storedUrl,
      })

      // Save response to sessionStorage
      sessionStorage.setItem("analysisResult", JSON.stringify(res.data))

      // optional logging
      console.log("Analysis result saved")
      console.log(res)
    } catch (err) {
      console.error("Failed to process article:", err)
      router.push("/analyze") // fallback in case of error
      return
    }

    // Progress and step simulation
    const stepInterval = setInterval(() => {
      setCurrentStep((prev) => {
        if (prev < steps.length - 1) {
          return prev + 1
        } else {
          clearInterval(stepInterval)
          setTimeout(() => {
            router.push("/analyze/results")
          }, 2000)
          return prev
        }
      })
    }, 2000)

    const progressInterval = setInterval(() => {
      setProgress((prev) => {
        if (prev < 100) {
          return prev + 1
        }
        return prev
      })
    }, 100)

    return () => {
      clearInterval(stepInterval)
      clearInterval(progressInterval)
    }
  } else {
    // Redirect back if no URL found
    router.push("/analyze")
    return
  }
}

runAnalysis()
}, [router])
```
Fix critical async function and cleanup issues.
The current implementation has several critical problems:
- The cleanup function is incorrectly placed inside the async function
- Intervals may not be properly cleaned up, causing memory leaks
- Hard-coded localhost URL won't work in production
Apply this diff to fix the async function structure and cleanup:
```diff
-useEffect(() => {
-  const runAnalysis = async () => {
+useEffect(() => {
+  let stepInterval: NodeJS.Timeout
+  let progressInterval: NodeJS.Timeout
+
+  const runAnalysis = async () => {
     const storedUrl = sessionStorage.getItem("articleUrl")
     if (storedUrl) {
       setArticleUrl(storedUrl)
       try {
-        const res = await axios.post("http://localhost:8000/api/process", {
+        const apiUrl = process.env.NEXT_PUBLIC_API_URL || "http://localhost:8000"
+        const res = await axios.post(`${apiUrl}/api/process`, {
           url: storedUrl,
         })
         // Save response to sessionStorage
         sessionStorage.setItem("analysisResult", JSON.stringify(res.data))
         // optional logging
         console.log("Analysis result saved")
-        console.log(res)
       } catch (err) {
         console.error("Failed to process article:", err)
         router.push("/analyze") // fallback in case of error
         return
       }
       // Progress and step simulation
-      const stepInterval = setInterval(() => {
+      stepInterval = setInterval(() => {
         setCurrentStep((prev) => {
           if (prev < steps.length - 1) {
             return prev + 1
           } else {
             clearInterval(stepInterval)
             setTimeout(() => {
               router.push("/analyze/results")
             }, 2000)
             return prev
           }
         })
       }, 2000)
-      const progressInterval = setInterval(() => {
+      progressInterval = setInterval(() => {
         setProgress((prev) => {
           if (prev < 100) {
             return prev + 1
           }
           return prev
         })
       }, 100)
-      return () => {
-        clearInterval(stepInterval)
-        clearInterval(progressInterval)
-      }
     } else {
       router.push("/analyze")
     }
-  }
+  }
-  runAnalysis()
-}, [router])
+  runAnalysis()
+
+  return () => {
+    if (stepInterval) clearInterval(stepInterval)
+    if (progressInterval) clearInterval(progressInterval)
+  }
+}, [router])
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```tsx
useEffect(() => {
  let stepInterval: NodeJS.Timeout
  let progressInterval: NodeJS.Timeout

  const runAnalysis = async () => {
    const storedUrl = sessionStorage.getItem("articleUrl")
    if (storedUrl) {
      setArticleUrl(storedUrl)
      try {
        const apiUrl = process.env.NEXT_PUBLIC_API_URL || "http://localhost:8000"
        const res = await axios.post(`${apiUrl}/api/process`, {
          url: storedUrl,
        })
        // Save response to sessionStorage
        sessionStorage.setItem("analysisResult", JSON.stringify(res.data))
        // optional logging
        console.log("Analysis result saved")
      } catch (err) {
        console.error("Failed to process article:", err)
        router.push("/analyze") // fallback in case of error
        return
      }
      // Progress and step simulation
      stepInterval = setInterval(() => {
        setCurrentStep((prev) => {
          if (prev < steps.length - 1) {
            return prev + 1
          } else {
            clearInterval(stepInterval)
            setTimeout(() => {
              router.push("/analyze/results")
            }, 2000)
            return prev
          }
        })
      }, 2000)
      progressInterval = setInterval(() => {
        setProgress((prev) => {
          if (prev < 100) {
            return prev + 1
          }
          return prev
        })
      }, 100)
    } else {
      router.push("/analyze")
    }
  }

  runAnalysis()

  return () => {
    if (stepInterval) clearInterval(stepInterval)
    if (progressInterval) clearInterval(progressInterval)
  }
}, [router])
```
🤖 Prompt for AI Agents
In frontend/app/analyze/loading/page.tsx lines 58 to 114, the async function
runAnalysis incorrectly includes the cleanup function inside it, causing
improper interval cleanup and potential memory leaks. Also, the API URL is
hard-coded to localhost, which is not suitable for production. To fix this, move
the interval setup and cleanup logic outside the async function into a useEffect
hook, ensure intervals are cleared in the cleanup function returned by
useEffect, and replace the hard-coded URL with a configurable environment
variable or relative path to support production deployment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- new-backend/app/modules/facts_check/web_search.py (1 hunks)
- new-backend/main.py (2 hunks)
- new-backend/start.sh (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- new-backend/start.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- new-backend/app/modules/facts_check/web_search.py
🔇 Additional comments (1)
new-backend/main.py (1)
3-3: LGTM - Clean import addition. The import follows FastAPI conventions and is necessary for the CORS middleware functionality.
```python
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
```
Security concern: Overly permissive CORS configuration.
The combination of `allow_origins=["*"]` with `allow_credentials=True` violates CORS security principles and creates potential security vulnerabilities. This configuration allows any origin to make credentialed requests to your API.
For production environments, consider this more secure configuration:
```diff
 app.add_middleware(
     CORSMiddleware,
-    allow_origins=["*"],
+    allow_origins=[
+        "http://localhost:3000",  # React dev server
+        "https://yourdomain.com", # Production frontend
+    ],
     allow_credentials=True,
-    allow_methods=["*"],
+    allow_methods=["GET", "POST", "PUT", "DELETE"],
-    allow_headers=["*"],
+    allow_headers=["Content-Type", "Authorization"],
 )
```
Alternatively, if you need wildcard origins during development, set `allow_credentials=False`:
```diff
 app.add_middleware(
     CORSMiddleware,
     allow_origins=["*"],
-    allow_credentials=True,
+    allow_credentials=False,
     allow_methods=["*"],
     allow_headers=["*"],
 )
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
# Option 1: explicit origin allowlist (production)
app.add_middleware(
    CORSMiddleware,
    allow_origins=[
        "http://localhost:3000",  # React dev server
        "https://yourdomain.com", # Production frontend
    ],
    allow_credentials=True,
    allow_methods=["GET", "POST", "PUT", "DELETE"],
    allow_headers=["Content-Type", "Authorization"],
)
```

```python
# Option 2: wildcard origins for development, without credentials
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=False,
    allow_methods=["*"],
    allow_headers=["*"],
)
```
🤖 Prompt for AI Agents
In new-backend/main.py around lines 14 to 21, the CORS middleware is configured
with allow_origins=["*"] and allow_credentials=True, which is insecure. To fix
this, replace allow_origins=["*"] with a specific list of trusted origins for
production, or if wildcard origins are needed during development, set
allow_credentials=False to prevent credentialed requests from any origin.
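A common middle ground, sketched below under the assumption of a comma-separated `ALLOWED_ORIGINS` environment variable (a hypothetical name, not something this PR defines), is to drive the allowlist from configuration so dev and prod differ without code changes:

```python
import os

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# ALLOWED_ORIGINS is a hypothetical env var, e.g.
# "http://localhost:3000,https://yourdomain.com"; the default suits local dev.
origins = [
    origin.strip()
    for origin in os.getenv("ALLOWED_ORIGINS", "http://localhost:3000").split(",")
    if origin.strip()
]

app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_credentials=True,  # acceptable because origins is an explicit allowlist
    allow_methods=["GET", "POST", "PUT", "DELETE"],
    allow_headers=["Content-Type", "Authorization"],
)
```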
Actionable comments posted: 1
🧹 Nitpick comments (1)
new-backend/start.sh (1)
2-2: Provide a default for `PORT` to avoid empty output
`$PORT` is unset on most local machines; printing an empty value is confusing, and FastAPI will still start on its default port. Apply the defaulting shown in the diff above.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- new-backend/main.py (2 hunks)
- new-backend/start.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- new-backend/main.py
🧰 Additional context used
🪛 Shellcheck (0.10.0)
new-backend/start.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
new-backend/start.sh (Outdated)
```bash
uv sync
echo "Render PORT is: $PORT"
uv run main.py
```
🛠️ Refactor suggestion
Add a shebang + defensive flags + exec to harden the script
Without an explicit shebang the runtime shell is unpredictable (SC2148).
Adding `set -euo pipefail` stops the build on any failure and prevents silent errors. Re-launching with `exec` ensures PID 1 receives signals correctly (e.g. in Docker).
```diff
+#!/usr/bin/env bash
+
+# Abort on error, undefined var, or failed pipeline
+set -euo pipefail
+
+# Sync Python deps
 uv sync
-echo "Render PORT is: $PORT"
-uv run main.py
+
+# Expose the port (falls back to 8000 locally)
+PORT=${PORT:-8000}
+echo "Render PORT is: $PORT"
+
+# Replace the shell with the app so signals are forwarded
+exec uv run main.py
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```bash
#!/usr/bin/env bash

# Abort on error, undefined var, or failed pipeline
set -euo pipefail

# Sync Python deps
uv sync

# Expose the port (falls back to 8000 locally)
PORT=${PORT:-8000}
echo "Render PORT is: $PORT"

# Replace the shell with the app so signals are forwarded
exec uv run main.py
```
🧰 Tools
🪛 Shellcheck (0.10.0)
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
🤖 Prompt for AI Agents
In new-backend/start.sh at lines 1 to 3, the script lacks a shebang and
defensive flags, which can cause unpredictable behavior and silent errors. Add a
shebang line (e.g., #!/bin/bash) at the top, include `set -euo pipefail` to stop
on errors and undefined variables, and use `exec` before the command that runs
the main process to ensure proper signal handling.
Actionable comments posted: 0
♻️ Duplicate comments (2)
new-backend/start.sh (2)
2-2: Add the missing `-u` and `-o pipefail` defensive flags. Only `set -e` is present; without `-u` (undefined var) and `-o pipefail` (fail fast in pipelines), silent errors can slip through. This was pointed out in a previous review but not fully applied.

```diff
-set -e
+set -euo pipefail
```

8-9: Preface the final command with `exec` so PID 1 gets signals. `uv run` should replace the shell so the process receives SIGTERM/SIGINT directly (important in Docker/Render deploys). Same recommendation was given earlier but is still missing.

```diff
-uv run main.py
+exec uv run main.py
```
🧹 Nitpick comments (2)
new-backend/start.sh (2)
1-1: Prefer portable shebang. `#!/usr/bin/env bash` is more portable than hard-coding `/bin/bash`, which may not exist in minimal containers.

```diff
-#!/bin/bash
+#!/usr/bin/env bash
```

5-5: Avoid redundant package install on every start. Re-installing `uv` each launch slows cold-starts and may break offline images. Consider a conditional install or vendor it in the image layer.

```diff
-pip install uv
+command -v uv >/dev/null 2>&1 || pip install --no-cache-dir -q uv
```
This PR introduces the `store_and_send` node at the final stage of the LangGraph pipeline. It performs the following key tasks: it chunks `cleaned_text` for better embedding efficiency, embeds the chunks, stores them in Pinecone, and syncs the final results to the frontend.

🔧 How it Works

Input: the `state` dictionary, expected to include:

- `article_id`
- `cleaned_text`
- `keywords`
- `sentiment`
- `fact_check`
- `perspective`, `counter_perspective`, etc.

Steps: the node chunks and embeds `cleaned_text`, upserts the resulting vectors and analysis metadata keyed by `article_id`, and returns the payload the frontend renders (see the sketch below).
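For reviewers without the diff handy, here is a minimal sketch of the node's shape. The names `index` (a Pinecone Index handle) and `embed` (a batch embedding callable) are illustrative assumptions, not the exact helpers this PR ships:

```python
def store_and_send(state: dict, index, embed) -> dict:
    """Minimal sketch of the store_and_send node (names are illustrative).

    `index` is assumed to be a Pinecone Index handle and `embed` a callable
    mapping list[str] -> list[list[float]]; neither name comes from this PR.
    """
    article_id = state["article_id"]
    text = state["cleaned_text"]

    # 1. Naive fixed-size chunking so each piece fits the embedding model.
    size = 1000
    chunks = [text[i:i + size] for i in range(0, len(text), size)]

    # 2. Embed all chunks in one batch call.
    vectors = embed(chunks)

    # 3. Upsert to Pinecone, keyed by article_id so an article's chunks stay grouped.
    index.upsert(vectors=[
        (f"{article_id}-{i}", vec, {"article_id": article_id, "text": chunk})
        for i, (vec, chunk) in enumerate(zip(vectors, chunks))
    ])

    # 4. Hand the analysis fields back for the frontend to render.
    return {
        "article_id": article_id,
        "keywords": state.get("keywords"),
        "sentiment": state.get("sentiment"),
        "fact_check": state.get("fact_check"),
        "perspective": state.get("perspective"),
        "counter_perspective": state.get("counter_perspective"),
    }
```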
✅ Checklist
Summary by CodeRabbit
New Features
Improvements
Chores