
Rendering data on /results page. #107

Merged
ManavSarkar merged 8 commits into main from rendering-data on Jul 18, 2025

Conversation

@ParagGhatage (Collaborator) commented Jul 12, 2025

Tasks done:

  • Tested that data is returned to the frontend without any errors.
  • Parsed the data into JSON format.
  • Rendered the data in its dedicated sections.
    Results: Facts, Perspective, and Summary tabs.

Summary by CodeRabbit

  • New Features

    • Introduced a robust backend architecture with FastAPI, Docker support, and automated deployment to Hugging Face Spaces.
    • Added a modular pipeline for article analysis, including claim extraction, fact verification, sentiment analysis, perspective generation, and scoring.
    • Integrated vector database storage and semantic search capabilities.
    • Enhanced frontend to dynamically display real analysis results with a tabbed interface and bias meter.
    • Added a simplified AI chat panel for user interaction.
  • Bug Fixes

    • Improved error handling and input validation throughout the backend analysis pipeline.
  • Chores

    • Updated dependencies and project structure for better maintainability.
    • Automated environment setup and deployment workflows.
    • Cleaned up obsolete files and redundant documentation.
  • Documentation

    • Added and updated backend setup instructions; removed outdated documentation.

coderabbitai bot (Contributor) commented Jul 12, 2025

Important

Review skipped

Review was skipped as selected files did not have any reviewable changes.

💤 Files selected but had no reviewable changes (2)
  • backend/app/modules/langgraph_nodes/fact_check.py
  • backend/app/modules/pipeline.py

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This update restructures the backend and frontend of the Perspective project. The backend is refactored to use a modular FastAPI architecture with LangGraph-based processing, Pinecone vector storage, and Groq-powered LLM nodes. Multiple new backend modules and utilities are introduced, while legacy files, routes, and services are removed. The frontend is updated to dynamically render analysis results, refactor loading logic, and streamline the UI with tabbed navigation and session-based data handling.
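
For orientation, here is a minimal sketch of the kind of FastAPI entry point the walkthrough describes (the router import path and app title are assumptions; the permissive CORS settings mirror the review comment on backend/main.py further down):

```python
# Hypothetical sketch of the backend/main.py entry point described above;
# the router import path and app title are assumptions, not the PR's code.
import uvicorn
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from app.api import router  # assumed module; the actual router location isn't shown here

app = FastAPI(title="Perspective Backend")

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # flagged below as too permissive for production
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

app.include_router(router, prefix="/api")

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
```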

Changes

  • .github/workflows/deploy-backend-to-hf.yml: Added a new GitHub Actions workflow for deploying backend code to Hugging Face Space on main branch push.
  • .gitignore: Added .github/act-events/ and .secrets to ignored paths.
  • backend/.dockerignore: New file to exclude .venv and .env from the Docker build context.
  • backend/Dockerfile, backend/start.sh: Added Dockerfile and startup script for the backend using Python 3.13, uv, and a non-root user setup.
  • backend/README.md: New backend README with setup and structure instructions.
  • backend/pyproject.toml: Added dependencies for search, LLM, NLP, Pinecone, and utility libraries.
  • backend/requirements.txt: Removed legacy requirements file.
  • backend/main.py: New FastAPI app with CORS, router inclusion, and a Uvicorn main entry point.
  • backend/app/main.py, backend/app/routes.py, backend/app/scrapers/*, backend/app/services/*, backend/app/test_perspective.py: Removed legacy FastAPI app, routes, scraping, summarization, and AI service modules, and the test script.
  • backend/app/prompts/opposite_perspective.py, backend/app/prompts/related_topics.py: Removed prompt template modules.
  • backend/app/modules/langgraph_builder.py: Refactored: added a TypedDict for state, updated node function references, replaced set_conditional_edges with add_conditional_edges, and adjusted edge returns.
  • backend/app/modules/langgraph_nodes/fact_check.py: Refactored to use a new fact-check pipeline utility, removing keyword extraction and direct search.
  • backend/app/modules/langgraph_nodes/generate_perspective.py, backend/app/modules/langgraph_nodes/judge.py, backend/app/modules/langgraph_nodes/store_and_send.py: Added new modules for generating perspectives, judging them, and storing data in the vector DB; each includes robust error handling and LLM integration.
  • backend/app/modules/langgraph_nodes/sentiment.py: Reduced max tokens to 3; returns sentiment in lowercase.
  • backend/app/modules/facts_check/llm_processing.py, backend/app/modules/facts_check/web_search.py: Added modules for claim extraction, fact verification using Groq, and web search via SerpAPI.
  • backend/app/modules/scraper/cleaner.py: Added runtime NLTK data checks and downloads.
  • backend/app/db/vector_store.py: New Pinecone vector store initialization and index management module.
  • backend/app/modules/vector_store/chunk_rag_data.py, backend/app/modules/vector_store/embed.py: New utilities for chunking data and embedding text using SentenceTransformer.
  • backend/app/utils/fact_check_utils.py, backend/app/utils/generate_chunk_id.py, backend/app/utils/prompt_templates.py, backend/app/utils/store_vectors.py: New utilities for fact-check pipeline orchestration, chunk ID generation, prompt templates, and vector upsert to Pinecone.
  • frontend/app/analyze/loading/page.tsx: Refactored loading logic: now runs the API call, handles errors, and manages progress simulation via intervals.
  • frontend/app/analyze/results/page.tsx: Now loads analysis data from session storage, renders dynamic results in tabs, and simplifies the chat UI and layout.
  • frontend/components/bias-meter.tsx: Added semicolons and formatting; no logic changes.
  • frontend/package.json: Added axios dependency.
  • new-backend/README.md, new-backend/main.py, new-backend/app/modules/langgraph_nodes/*: Removed duplicate/legacy backend files and modules.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Frontend
    participant BackendAPI
    participant LangGraph
    participant Pinecone
    participant GroqLLM
    participant SerpAPI

    User->>Frontend: Submit article URL
    Frontend->>BackendAPI: POST /api/analyze with URL
    BackendAPI->>LangGraph: Start analysis workflow
    LangGraph->>GroqLLM: Extract claims from article
    GroqLLM-->>LangGraph: Return claims
    LangGraph->>SerpAPI: Search web for each claim
    SerpAPI-->>LangGraph: Return search results
    LangGraph->>GroqLLM: Verify claims with evidence
    GroqLLM-->>LangGraph: Return verification verdicts
    LangGraph->>GroqLLM: Generate counter-perspective
    GroqLLM-->>LangGraph: Return perspective and reasoning
    LangGraph->>GroqLLM: Score perspective
    GroqLLM-->>LangGraph: Return score
    LangGraph->>Pinecone: Store facts and perspective vectors
    Pinecone-->>LangGraph: Confirm storage
    LangGraph-->>BackendAPI: Return analysis result
    BackendAPI-->>Frontend: Return analysis result
    Frontend->>User: Display results in tabs
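
A hedged sketch of the POST /api/analyze handler at the center of this flow (the request model, graph invocation, and error shape are assumptions, not the PR's actual code):

```python
# Hypothetical sketch of the analyze route; names below are assumptions.
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel

from app.modules.langgraph_builder import graph  # assumed compiled LangGraph export

router = APIRouter()

class AnalyzeRequest(BaseModel):
    url: str

@router.post("/analyze")
async def analyze(req: AnalyzeRequest):
    # Run the LangGraph workflow with the submitted URL as the initial state.
    result = graph.invoke({"url": req.url})
    if result.get("status") == "error":
        raise HTTPException(status_code=500, detail=result.get("message"))
    return result
```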

Possibly related PRs

  • AOSSIE-Org/Perspective#97: Introduces the new backend structure with FastAPI and modular routing, closely matching the new backend architecture in this PR.
  • AOSSIE-Org/Perspective#99: Establishes the initial LangGraph structure and nodes, which are refactored and extended in this PR.
  • AOSSIE-Org/Perspective#101: Adds the sentiment analysis node using the Groq SDK, directly related to the updated sentiment node in this PR.

Poem

In the warren, code hops anew,
Old tunnels cleared, fresh backend grew.
With LangGraph lines and Pinecone stores,
Groq and SerpAPI open doors.
Frontend tabs now show the tale—
Perspectives shift, the facts prevail!
🐇✨



coderabbitai bot (Contributor) left a comment

Actionable comments posted: 26

🔭 Outside diff range comments (1)
backend/app/modules/langgraph_nodes/fact_check.py (1)

21-27: Improve exception handling and logging.

The exception handling could be more specific and the logging should use proper logging levels.

     except Exception as e:
-        print(f"some error occured in fact_checking:{e}")
+        logger.exception(f"Unexpected error in fact checking: {str(e)}")
         return {
             "status": "error",
             "error_from": "fact_checking",
-            "message": f"{e}",
+            "message": f"Fact checking failed: {str(e)}",
             }
🧹 Nitpick comments (13)
backend/start.sh (1)

4-5: Consider checking if uv is already installed before installing.

While the current approach works, you could optimize by checking if uv is already available to avoid unnecessary installation attempts.

-# Install uv if not present
-pip install uv
+# Install uv if not present
+if ! command -v uv &> /dev/null; then
+    pip install uv
+fi
backend/app/utils/prompt_templates.py (1)

3-32: Consider using a more maintainable prompt format.

The multiline string with manual line breaks could be improved for better readability and maintenance. Consider using a cleaner format without the line continuation character.

-generation_prompt = ChatPromptTemplate.from_template("""
-You are an AI assistant that generates a well-reasoned '
-'counter-perspective to a given article.
+generation_prompt = ChatPromptTemplate.from_template("""
+You are an AI assistant that generates a well-reasoned counter-perspective to a given article.
backend/Dockerfile (2)

5-6: Consider pinning OS package versions for reproducibility.

Installing OS packages without version pinning can lead to inconsistent builds across different times.

-RUN apt-get update && apt-get install -y curl build-essential
+RUN apt-get update && apt-get install -y \
+    curl=7.* \
+    build-essential=12.* \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*

17-18: Optimize Docker layer caching by copying dependency files first.

Currently, the entire project is copied before installing dependencies, which means dependency installation will re-run even for code-only changes.

-# Copy project code
-COPY . /app
+# Copy dependency files first for better caching
+COPY pyproject.toml uv.lock ./
+
+# Install dependencies
+RUN uv sync --locked --no-cache
+
+# Copy rest of the code
+COPY . /app

Then move the dependency installation after copying dependency files and before copying the full codebase.

backend/app/db/vector_store.py (1)

16-19: Consider making constants configurable.

The hardcoded values work for the current setup, but consider making them configurable via environment variables for different environments.

INDEX_NAME = os.getenv("PINECONE_INDEX_NAME", "perspective")
DIMENSIONS = int(os.getenv("PINECONE_DIMENSIONS", "384"))
METRIC = os.getenv("PINECONE_METRIC", "cosine")
backend/app/modules/vector_store/embed.py (1)

4-4: Consider lazy loading the embedder model.

Loading the model at module import time may cause unnecessary delays and memory usage if the module is imported but not used.

_embedder = None

def get_embedder():
    global _embedder
    if _embedder is None:
        _embedder = SentenceTransformer("all-MiniLM-L6-v2")
    return _embedder

Then update line 21 to use get_embedder().encode(texts).tolist().

backend/README.md (2)

59-61: Add language specification to code block.

The static analysis tool correctly identifies missing language specification for better syntax highlighting.

-```
+```bash
 http://localhost:8000/api/

76-87: Add language specification and fix path inconsistency.

The code block needs a language specification, and there's a path inconsistency between the README and the actual setup.

-```
+```bash
 new-backend/

Also, line 28 mentions cd new-backend but the structure shows the current directory. Consider clarifying the correct path or updating the clone instruction.

backend/app/utils/fact_check_utils.py (1)

40-40: Consider making the rate limit delay configurable.

The hardcoded 5-second delay might be too aggressive for development or too lenient for production.

import os

# At the top of the file
SEARCH_DELAY = float(os.getenv("FACT_CHECK_SEARCH_DELAY", "5.0"))

# Then use it:
time.sleep(SEARCH_DELAY)  # ⏱️ Configurable delay to avoid rate limits
.github/workflows/deploy-backend-to-hf.yml (1)

17-17: Consider shallow clone for deployment efficiency.

Full git history (fetch-depth: 0) may be unnecessary for deployment. Consider using a shallow clone to improve performance.

-          fetch-depth: 0
+          fetch-depth: 1
backend/app/modules/langgraph_nodes/judge.py (1)

31-36: Simplify response handling logic.

The response handling is overly complex for a simple ChatGroq response. ChatGroq responses typically have a consistent .content attribute.

-        if isinstance(response, list) and response:
-            raw = response[0].content.strip()
-        elif hasattr(response, "content"):
-            raw = response.content.strip()
-        else:
-            raw = str(response).strip()
+        raw = response.content.strip()
frontend/app/analyze/results/page.tsx (2)

68-70: Simplify badge variant logic for better readability.

The nested ternary operator can be simplified.

Consider using an object map:

+const sentimentVariants = {
+  positive: 'secondary',
+  negative: 'destructive',
+  neutral: 'outline'
+} as const
+
-<Badge variant={sentiment === 'positive' ? 'secondary' : sentiment === 'negative' ? 'destructive' : 'outline'} className="capitalize">
+<Badge variant={sentimentVariants[sentiment] || 'outline'} className="capitalize">

48-48: Document that this is a placeholder implementation.

The hardcoded response suggests this is a mock implementation.

Consider adding a TODO comment or making the placeholder nature more explicit:

 setTimeout(() => {
+  // TODO: Replace with actual AI backend integration
   setMessages([...newMessages, { role: "system", content: "Based on the article... let me know if you want more details." }])
 }, 1000)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 501e9c2 and 982e9b8.

⛔ Files ignored due to path filters (3)
  • backend/uv.lock is excluded by !**/*.lock
  • frontend/package-lock.json is excluded by !**/package-lock.json
  • new-backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (46)
  • .github/workflows/deploy-backend-to-hf.yml (1 hunks)
  • .gitignore (1 hunks)
  • backend/.dockerignore (1 hunks)
  • backend/Dockerfile (1 hunks)
  • backend/README.md (1 hunks)
  • backend/app/db/vector_store.py (1 hunks)
  • backend/app/main.py (0 hunks)
  • backend/app/modules/facts_check/llm_processing.py (1 hunks)
  • backend/app/modules/facts_check/web_search.py (1 hunks)
  • backend/app/modules/langgraph_builder.py (5 hunks)
  • backend/app/modules/langgraph_nodes/fact_check.py (2 hunks)
  • backend/app/modules/langgraph_nodes/generate_perspective.py (1 hunks)
  • backend/app/modules/langgraph_nodes/judge.py (1 hunks)
  • backend/app/modules/langgraph_nodes/sentiment.py (1 hunks)
  • backend/app/modules/langgraph_nodes/store_and_send.py (1 hunks)
  • backend/app/modules/scraper/cleaner.py (1 hunks)
  • backend/app/modules/vector_store/chunk_rag_data.py (1 hunks)
  • backend/app/modules/vector_store/embed.py (1 hunks)
  • backend/app/prompts/opposite_perspective.py (0 hunks)
  • backend/app/prompts/related_topics.py (0 hunks)
  • backend/app/routes.py (0 hunks)
  • backend/app/scrapers/article_scraper.py (0 hunks)
  • backend/app/scrapers/clean_data.py (0 hunks)
  • backend/app/services/ai_service.py (0 hunks)
  • backend/app/services/analysis_service.py (0 hunks)
  • backend/app/services/counter_service.py (0 hunks)
  • backend/app/services/related_topics.py (0 hunks)
  • backend/app/services/summarization_service.py (0 hunks)
  • backend/app/test_perspective.py (0 hunks)
  • backend/app/utils/fact_check_utils.py (1 hunks)
  • backend/app/utils/generate_chunk_id.py (1 hunks)
  • backend/app/utils/prompt_templates.py (1 hunks)
  • backend/app/utils/store_vectors.py (1 hunks)
  • backend/main.py (1 hunks)
  • backend/pyproject.toml (1 hunks)
  • backend/requirements.txt (0 hunks)
  • backend/start.sh (1 hunks)
  • frontend/app/analyze/loading/page.tsx (2 hunks)
  • frontend/app/analyze/results/page.tsx (3 hunks)
  • frontend/components/bias-meter.tsx (3 hunks)
  • frontend/package.json (2 hunks)
  • new-backend/README.md (0 hunks)
  • new-backend/app/modules/langgraph_nodes/generate_perspective.py (0 hunks)
  • new-backend/app/modules/langgraph_nodes/judge.py (0 hunks)
  • new-backend/app/modules/langgraph_nodes/store_and_send.py (0 hunks)
  • new-backend/main.py (0 hunks)
💤 Files with no reviewable changes (18)
  • new-backend/main.py
  • backend/requirements.txt
  • new-backend/app/modules/langgraph_nodes/generate_perspective.py
  • new-backend/app/modules/langgraph_nodes/store_and_send.py
  • backend/app/services/related_topics.py
  • backend/app/scrapers/article_scraper.py
  • backend/app/main.py
  • new-backend/README.md
  • backend/app/services/analysis_service.py
  • new-backend/app/modules/langgraph_nodes/judge.py
  • backend/app/services/summarization_service.py
  • backend/app/services/counter_service.py
  • backend/app/scrapers/clean_data.py
  • backend/app/prompts/related_topics.py
  • backend/app/test_perspective.py
  • backend/app/prompts/opposite_perspective.py
  • backend/app/services/ai_service.py
  • backend/app/routes.py
🧰 Additional context used
🧬 Code Graph Analysis (4)
backend/app/modules/langgraph_nodes/fact_check.py (1)
backend/app/utils/fact_check_utils.py (1)
  • run_fact_check_pipeline (10-47)
backend/app/utils/fact_check_utils.py (2)
backend/app/modules/facts_check/web_search.py (1)
  • search_with_serpapi (5-28)
backend/app/modules/facts_check/llm_processing.py (2)
  • run_claim_extractor_sdk (12-57)
  • run_fact_verifier_sdk (60-132)
backend/app/modules/vector_store/chunk_rag_data.py (1)
backend/app/utils/generate_chunk_id.py (1)
  • generate_id (4-8)
backend/app/modules/langgraph_builder.py (2)
backend/app/modules/langgraph_nodes/sentiment.py (1)
  • run_sentiment_sdk (10-53)
backend/app/modules/langgraph_nodes/error_handler.py (1)
  • error_handler (3-11)
🪛 Ruff (0.11.9)
backend/app/modules/langgraph_nodes/store_and_send.py

13-13: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


15-15: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


21-21: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

backend/app/utils/store_vectors.py

32-32: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

backend/app/db/vector_store.py

14-14: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


40-41: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🪛 markdownlint-cli2 (0.17.2)
backend/README.md

59-59: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


76-76: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 actionlint (1.7.7)
.github/workflows/deploy-backend-to-hf.yml

30-30: shellcheck reported issue in this script: SC2086:info:2:31: Double quote to prevent globbing and word splitting

(shellcheck)

🔇 Additional comments (19)
.gitignore (1)

28-30: LGTM! Appropriate security-focused additions.

The new gitignore entries properly exclude GitHub Actions event files and secret files from version control, which is essential for security and CI/CD workflows.

frontend/package.json (1)

41-41: Axios version validated: 1.10.0 is current and secure

Verified that 1.10.0 is the latest Axios release (June 14, 2025) and there are no known security vulnerabilities as of July 2025. Approving this dependency addition.

• frontend/package.json, line 41: "axios": "^1.10.0"

backend/app/utils/generate_chunk_id.py (1)

4-8: Well-implemented utility function with proper validation.

The generate_id function correctly validates input, uses secure SHA-256 hashing, and returns a consistent format. The 15-character hash substring should provide sufficient uniqueness for article chunk identification.
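
For reference, a plausible reconstruction of that utility under the properties the review names (the exact validation and ID format details are assumptions):

```python
import hashlib

def generate_id(text: str) -> str:
    # Validate input before hashing (assumed check).
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text must be a non-empty string")
    # SHA-256 digest truncated to 15 hex characters, per the review's description.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:15]
```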

backend/app/modules/langgraph_nodes/sentiment.py (1)

39-39: Good addition for consistency.

Converting sentiment to lowercase ensures consistent output format for downstream processing.

backend/.dockerignore (1)

1-2: Standard and appropriate Docker exclusions.

The .dockerignore properly excludes virtual environments and environment files from Docker builds, following security and efficiency best practices.

backend/app/modules/scraper/cleaner.py (1)

2-10: Excellent approach for ensuring NLTK dependencies are available.

The runtime check for required NLTK corpora is well-implemented and follows best practices. This ensures the text cleaning functionality will work reliably across different environments without manual setup.
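
The pattern being praised is roughly the following sketch; which corpora the cleaner actually requires is an assumption here:

```python
import nltk

# Check for required NLTK data at import time and download anything missing.
_REQUIRED = [
    ("stopwords", "corpora/stopwords"),
    ("punkt", "tokenizers/punkt"),
]

for resource, lookup_path in _REQUIRED:
    try:
        nltk.data.find(lookup_path)
    except LookupError:
        nltk.download(resource, quiet=True)
```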

backend/pyproject.toml (1)

10-25: Dependencies align well with the new AI/ML functionality.

The added packages support the described backend refactoring with search capabilities, LangChain integration, NLP processing, vector storage, and embedding functionality. All are well-established libraries appropriate for the use case.

frontend/components/bias-meter.tsx (1)

1-78: Good stylistic improvements for code consistency.

The addition of semicolons and formatting adjustments improves code consistency without affecting functionality. These changes align with TypeScript/JavaScript best practices.

backend/app/utils/store_vectors.py (1)

10-28: Well-structured vector storage utility function.

The function design is solid with proper input validation, clear documentation, appropriate logging, and good integration with the Pinecone index. The implementation follows best practices for error handling and type safety.
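
A minimal sketch of such an upsert helper, assuming the standard Pinecone client API and incorporating the exception-chaining fix suggested later in this review (the vector payload shape is an assumption):

```python
import logging

logger = logging.getLogger(__name__)

def store_vectors(index, vectors: list[dict], namespace: str) -> None:
    """Upsert pre-embedded vectors into a Pinecone namespace."""
    if not vectors:
        raise ValueError("vectors must be a non-empty list")
    try:
        # Each vector is assumed to look like:
        # {"id": str, "values": list[float], "metadata": dict}
        index.upsert(vectors=vectors, namespace=namespace)
        logger.info("Stored %d vectors in namespace '%s'", len(vectors), namespace)
    except Exception as e:
        logger.error("Failed to store vectors in namespace '%s': %s", namespace, e)
        raise RuntimeError(f"Vector storage failed: {e}") from e
```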

backend/app/modules/langgraph_nodes/fact_check.py (1)

11-11: Integration verified: run_fact_check_pipeline returns a tuple as expected.

The run_fact_check_pipeline(state) function in backend/app/utils/fact_check_utils.py:

  • Accepts a single state dict
  • Returns ([], "<error message>") on various failure paths
  • Returns (final.get("verifications", []), None) on success

No changes are needed—its signature and return format match the call site.
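
Given the verified signature, the pipeline plausibly reads like this sketch (the helper call shapes beyond the verified return values are assumptions):

```python
import time

from app.modules.facts_check.llm_processing import (
    run_claim_extractor_sdk,
    run_fact_verifier_sdk,
)
from app.modules.facts_check.web_search import search_with_serpapi

def run_fact_check_pipeline(state: dict):
    """Return (verifications, error_message); error_message is None on success."""
    claims = run_claim_extractor_sdk(state)  # assumed call shape
    if not claims:
        return [], "no claims were extracted from the article"
    evidence = []
    for claim in claims:
        evidence.append(search_with_serpapi(claim))
        time.sleep(5)  # delay between searches to avoid rate limits (see nitpick above)
    final = run_fact_verifier_sdk(claims, evidence)  # assumed call shape
    return final.get("verifications", []), None
```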

backend/app/modules/langgraph_nodes/store_and_send.py (1)

10-15: Fix incorrect status check logic.

The status check on line 13 is checking the wrong variable - it should check the result from chunk_rag_data, not the original state.

         try:
             chunks = chunk_rag_data(state)
+            # Note: chunk_rag_data doesn't return status, so this check may be unnecessary
+            # or should check the chunks themselves
-        except KeyError as e:
-            raise Exception(f"Missing required data field for chunking: {e}")
+        except KeyError as e:
+            raise Exception(f"Missing required data field for chunking: {e}") from e
         except Exception as e:
-            raise Exception(f"Failed to chunk data: {e}")
+            raise Exception(f"Failed to chunk data: {e}") from e

Likely an incorrect or invalid review comment.

backend/app/modules/vector_store/embed.py (1)

7-30: Excellent implementation with proper validation.

The function properly validates input structure, handles edge cases, and returns a consistent format that integrates well with the vector store operations.
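
Combined with the lazy-loading nitpick earlier, the embedding path reduces to something like this sketch (the chunk key name is an assumption):

```python
from sentence_transformers import SentenceTransformer

_embedder = None

def get_embedder() -> SentenceTransformer:
    # Load the model on first use rather than at import time.
    global _embedder
    if _embedder is None:
        _embedder = SentenceTransformer("all-MiniLM-L6-v2")
    return _embedder

def embed_chunks(chunks: list[dict]) -> list[list[float]]:
    texts = [c["text"] for c in chunks]  # the "text" key is an assumption
    return get_embedder().encode(texts).tolist()
```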

backend/app/modules/vector_store/chunk_rag_data.py (1)

54-55: Fix broken f-string formatting.

The f-string is incorrectly split across lines, causing a syntax error.

-                    raise ValueError("Missing required fact field:"
-                                     f" {field} in fact index {i}")
+                    raise ValueError(f"Missing required fact field: {field} in fact index {i}")

Likely an incorrect or invalid review comment.

frontend/app/analyze/loading/page.tsx (2)

81-93: LGTM: Well-implemented step progression logic.

The step interval logic correctly manages progression through analysis steps and handles cleanup properly.


104-107: LGTM: Proper cleanup implementation.

The cleanup function correctly clears both intervals to prevent memory leaks.

backend/app/modules/langgraph_nodes/generate_perspective.py (1)

9-11: LGTM: Well-defined Pydantic model.

The PerspectiveOutput model is properly structured with clear field descriptions for the LLM's structured output.

backend/app/modules/langgraph_builder.py (3)

11-22: Good addition of type safety with TypedDict.

The MyState class provides clear type definitions for the state dictionary used throughout the graph, improving code maintainability and IDE support.


56-56: Also applies to: 63-63, 72-72, 81-81, 95-95


100-100: Correct use of "end" for graph termination.

Using "__end__" is the proper way to indicate the end of the graph execution in LangGraph.
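
Pulling these observations together, the builder plausibly follows this shape (the state fields and routing lambda are assumptions; the node imports come from the code-graph section above):

```python
from typing import TypedDict
from langgraph.graph import StateGraph

from app.modules.langgraph_nodes.sentiment import run_sentiment_sdk
from app.modules.langgraph_nodes.error_handler import error_handler

class MyState(TypedDict, total=False):
    # Field names are assumptions based on the data flowing through the graph.
    url: str
    cleaned_text: str
    sentiment: str
    facts: list
    perspective: dict
    score: int
    status: str

builder = StateGraph(MyState)
builder.add_node("sentiment", run_sentiment_sdk)
builder.add_node("error_handler", error_handler)
builder.set_entry_point("sentiment")
# Route on status; returning "__end__" terminates the graph, as noted above.
builder.add_conditional_edges(
    "sentiment",
    lambda s: "error_handler" if s.get("status") == "error" else "__end__",
)
builder.add_edge("error_handler", "__end__")
graph = builder.compile()
```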

             model="gemma2-9b-it",
             temperature=0.2,
-            max_tokens=10,
+            max_tokens=3,

⚠️ Potential issue

Reconsider the max_tokens reduction - may cause response truncation.

Setting max_tokens=3 is too restrictive for sentiment analysis. The expected responses ("Positive", "Negative", "Neutral") require 7-8 characters, which could be truncated with only 3 tokens, potentially causing "Pos", "Neg", or "Neu" responses that may break downstream processing.

-            max_tokens=3,
+            max_tokens=10,
🤖 Prompt for AI Agents
In backend/app/modules/langgraph_nodes/sentiment.py at line 35, the max_tokens
parameter is set to 3, which is too low and may truncate the sentiment analysis
output. Increase max_tokens to at least 8 to ensure full words like "Positive",
"Negative", or "Neutral" are returned without truncation, preventing issues in
downstream processing.

Comment on lines 29 to 32
    except Exception as e:
        logger.error("Failed to store "
                     f"vectors in namespace '{namespace}': {e}")
        raise RuntimeError(f"Vector storage failed: {e}")

⚠️ Potential issue

Fix exception chaining to preserve error context.

The static analysis hint is correct - when re-raising exceptions, use from e to preserve the original exception chain for better debugging.

     except Exception as e:
         logger.error("Failed to store "
                      f"vectors in namespace '{namespace}': {e}")
-        raise RuntimeError(f"Vector storage failed: {e}")
+        raise RuntimeError(f"Vector storage failed: {e}") from e
🧰 Tools
🪛 Ruff (0.11.9)

32-32: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
In backend/app/utils/store_vectors.py around lines 29 to 32, the exception is
re-raised without preserving the original exception context. Modify the raise
statement to use 'raise RuntimeError(...) from e' to maintain the exception
chaining and improve debugging.

Comment on lines 14 to 20
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

🛠️ Refactor suggestion

Review CORS configuration for production security.

The current CORS configuration allows all origins ("*"), credentials, methods, and headers, which poses security risks in production environments. Consider restricting these based on your deployment environment.

-app.add_middleware(
-    CORSMiddleware,
-    allow_origins=["*"],
-    allow_credentials=True,
-    allow_methods=["*"],
-    allow_headers=["*"],
-)
+import os
+
+# Configure CORS based on environment
+allowed_origins = os.getenv("ALLOWED_ORIGINS", "*").split(",")
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=allowed_origins,
+    allow_credentials=True,
+    allow_methods=["GET", "POST", "PUT", "DELETE"],
+    allow_headers=["*"],
+)
🤖 Prompt for AI Agents
In backend/main.py around lines 14 to 20, the CORS middleware is configured to
allow all origins, credentials, methods, and headers, which is insecure for
production. Update the allow_origins list to specify only trusted domains
instead of "*", and review allow_credentials, allow_methods, and allow_headers
to restrict them to necessary values based on your deployment environment to
enhance security.

Comment on lines 5 to 8
def search_with_serpapi(query, max_results=1):
    api_key = os.getenv("SERPAPI_KEY")
    if not api_key:
        raise ValueError("SERPAPI_KEY not set in environment")

🛠️ Refactor suggestion

Add input validation for the query parameter.

The function should validate that the query parameter is not empty or None to prevent unnecessary API calls.

 def search_with_serpapi(query, max_results=1):
+    if not query or not query.strip():
+        raise ValueError("Query parameter cannot be empty")
+    
     api_key = os.getenv("SERPAPI_KEY")
     if not api_key:
         raise ValueError("SERPAPI_KEY not set in environment")
🤖 Prompt for AI Agents
In backend/app/modules/facts_check/web_search.py around lines 5 to 8, the
function search_with_serpapi lacks validation for the query parameter. Add a
check at the start of the function to ensure the query is not None or an empty
string, and raise a ValueError if this validation fails to prevent unnecessary
API calls.

Comment on lines 17 to 28
    search = GoogleSearch(params)
    results = search.get_dict()
    organic = results.get("organic_results", [])

    return [
        {
            "title": r.get("title", ""),
            "snippet": r.get("snippet", ""),
            "link": r.get("link", ""),
        }
        for r in organic
    ]

🛠️ Refactor suggestion

Add error handling for SerpAPI failures.

The function should handle potential API errors from SerpAPI to prevent unhandled exceptions.

-    search = GoogleSearch(params)
-    results = search.get_dict()
-    organic = results.get("organic_results", [])
+    try:
+        search = GoogleSearch(params)
+        results = search.get_dict()
+        
+        if "error" in results:
+            raise RuntimeError(f"SerpAPI error: {results['error']}")
+            
+        organic = results.get("organic_results", [])
+    except Exception as e:
+        raise RuntimeError(f"Search API failed: {str(e)}")
🤖 Prompt for AI Agents
In backend/app/modules/facts_check/web_search.py around lines 17 to 28, the code
calls SerpAPI without handling potential errors, which can cause unhandled
exceptions if the API fails. Add a try-except block around the API call and
result processing to catch exceptions, log or handle the error appropriately,
and return a safe default value such as an empty list to ensure the function
fails gracefully.

Comment on lines 113 to 119
        try:
            parsed = json.loads(content)
        except Exception as parse_err:
            print(f"❌ LLM JSON parse error: {parse_err}")

        results_list.append(parsed)


⚠️ Potential issue

Fix undefined variable error when JSON parsing fails.

If JSON parsing fails, parsed will be undefined, causing a runtime error when appending to results_list.

Apply this fix:

 try:
     parsed = json.loads(content)
+    results_list.append(parsed)
 except Exception as parse_err:
     print(f"❌ LLM JSON parse error: {parse_err}")
-
-results_list.append(parsed)
+    # Skip this result or add error entry
+    continue
🤖 Prompt for AI Agents
In backend/app/modules/facts_check/llm_processing.py around lines 113 to 119,
the variable 'parsed' is used outside the try-except block without being defined
if json.loads(content) raises an exception. To fix this, define 'parsed' with a
default value (e.g., None or an empty dict) before the try block or assign it in
the except block to ensure it is always defined before appending to
results_list.

Comment on lines 121 to 124
"claim": claim,
"verifications": results_list,
"status": "success",
}

⚠️ Potential issue

Return comprehensive results instead of just the last claim.

The function currently returns only the last processed claim instead of all claims.

The return statement should include all processed data:

 return {
-    "claim": claim,
+    "claims": [result.get("claim") for result in search_results],
     "verifications": results_list,
     "status": "success",
 }
🤖 Prompt for AI Agents
In backend/app/modules/facts_check/llm_processing.py around lines 121 to 124,
the function currently returns only the last processed claim's results. Modify
the return statement to include all processed claims and their corresponding
results, ensuring the output contains comprehensive data for every claim
handled, not just the final one.

* Provides an interactive interface for users to review an article's summary, explore alternative perspectives, verify factual claims, and engage in AI-assisted discussion. Includes responsive navigation, a bias visualization, and a curated list of references.
*/
export default function AnalyzePage() {
const [analysisData, setAnalysisData] = useState<any>(null)

🛠️ Refactor suggestion

Define proper TypeScript interface instead of using any.

Using any defeats the purpose of TypeScript's type safety.

Define an interface for the analysis data:

interface AnalysisData {
  cleaned_text: string
  facts: Array<{
    original_claim: string
    verdict: string
    explanation: string
    source_link: string
  }>
  sentiment: 'positive' | 'negative' | 'neutral'
  perspective: {
    perspective: string
    reasoning: string
  }
  score: number
}

const [analysisData, setAnalysisData] = useState<AnalysisData | null>(null)
🤖 Prompt for AI Agents
In frontend/app/analyze/results/page.tsx at line 20, replace the useState type
from any to a properly defined TypeScript interface. Define an interface named
AnalysisData that includes cleaned_text as string, facts as an array of objects
with original_claim, verdict, explanation, and source_link as strings, sentiment
as a union of 'positive', 'negative', or 'neutral', perspective as an object
with perspective and reasoning strings, and score as a number. Then update the
useState declaration to use AnalysisData or null instead of any.

Comment on lines +117 to +119
<Link href={fact.source_link} target="_blank" className="flex items-center text-sm hover:underline">
  <LinkIcon className="mr-1 h-4 w-4" /> Source
</Link>

⚠️ Potential issue

Add security attributes to external links.

External links with target="_blank" should include security attributes to prevent potential vulnerabilities.

-<Link href={fact.source_link} target="_blank" className="flex items-center text-sm hover:underline">
+<Link href={fact.source_link} target="_blank" rel="noopener noreferrer" className="flex items-center text-sm hover:underline">
🤖 Prompt for AI Agents
In frontend/app/analyze/results/page.tsx around lines 117 to 119, the external
link using target="_blank" lacks security attributes. Add rel="noopener
noreferrer" to the Link component to prevent security vulnerabilities related to
opening new tabs.

Comment on lines 35 to 37
const storedData = sessionStorage.getItem("analysisResult")
if (storedData) setAnalysisData(JSON.parse(storedData))
else console.warn("No analysis result found")

⚠️ Potential issue

Add error handling for sessionStorage operations.

JSON parsing can throw exceptions if the stored data is malformed.

Wrap the sessionStorage operations in a try-catch:

-const storedData = sessionStorage.getItem("analysisResult")
-if (storedData) setAnalysisData(JSON.parse(storedData))
-else console.warn("No analysis result found")
+try {
+  const storedData = sessionStorage.getItem("analysisResult")
+  if (storedData) {
+    setAnalysisData(JSON.parse(storedData))
+  } else {
+    // Handle missing data appropriately
+    setIsLoading(false)
+  }
+} catch (error) {
+  console.error("Failed to parse analysis data:", error)
+  setIsLoading(false)
+}
🤖 Prompt for AI Agents
In frontend/app/analyze/results/page.tsx around lines 35 to 37, the code
retrieves and parses data from sessionStorage without error handling, which can
cause the app to crash if the stored JSON is malformed. Wrap the
sessionStorage.getItem and JSON.parse calls in a try-catch block to catch and
handle any exceptions, logging a warning or error message if parsing fails or
data is missing.

resolving merge conflicts
ManavSarkar merged commit 1ae7e6b into main on Jul 18, 2025
1 check passed