
Conversation

@ParagGhatage (Collaborator) commented Jun 23, 2025

This PR introduces the generate-perspective node to the LangGraph pipeline. The node generates a well-reasoned, logical, and respectful counter-perspective to any given article.

How It Works

The node takes in:

  • cleaned_text: The cleaned and extracted article content
  • sentiment: The sentiment of the original article (positive, neutral, or negative)
  • facts: A list of verified factual statements derived from the article or external sources

It uses a prompt template with clearly defined sections and a Chain-of-Thought reasoning style to guide the LLM in generating thoughtful responses.

Model used: llama-3.3-70b-versatile
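
For illustration, here is a minimal sketch of the node along these lines (simplified and hedged: the output class, prompt text, and helper names are assumptions, not the merged code):

# Rough sketch only; names and prompt wiring are simplified assumptions.
from pydantic import BaseModel
from langchain_core.prompts import ChatPromptTemplate
from langchain_groq import ChatGroq


class PerspectiveOutput(BaseModel):
    reasoning: str      # step-by-step (Chain-of-Thought) reasoning
    perspective: str    # the counter-perspective itself


llm = ChatGroq(model="llama-3.3-70b-versatile", temperature=0.7)

prompt = ChatPromptTemplate.from_template(
    "Article:\n{cleaned_article}\n\n"
    "Verified facts:\n{facts}\n\n"
    "Original sentiment: {sentiment}\n\n"
    "Think step by step, then write a logical, respectful counter-perspective."
)

chain = prompt | llm.with_structured_output(PerspectiveOutput)


def generate_perspective(state: dict) -> dict:
    facts_str = "\n".join(
        f"Claim: {f['original_claim']} | Verdict: {f['verdict']}"
        for f in state.get("facts", [])
    )
    result = chain.invoke({
        "cleaned_article": state["cleaned_text"],
        "facts": facts_str,
        "sentiment": state.get("sentiment", "neutral"),
    })
    return {**state, "perspective": result.model_dump(), "status": "success"}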


Sample Output State (after generate-perspective-node)

{
    "cleaned_text": "The 2025 French Open men’s final at Roland Garros was more than just a sporting event — it was also a major celebrity moment.\n\nAs Carlos Alcaraz battled Jannik Sinner on the iconic clay courts of Paris, the stands were filled with famous faces from film, music, and sport.\n\nAmong those spotted in the crowd were singer and fashion icon Pharrell Williams, actors Natalie Portman, Lily Collins, Dustin Hoffman, and Eddie Redmayne, as well as filmmaker Spike Lee. Netflix star Taylor Zakhar-Perez and British Formula 1 driver George Russell also made an appearance, adding to the glamour and excitement of the championship match.\n\nRoland Garros has long been a favourite among stars, known for its unique combination of top-level tennis and Parisian flair. This year was no exception, with celebrity attendees enjoying both the high-stakes final and the stylish atmosphere of the grounds.\n\nTake a look at the various celebrities who turned up at the event:\n\nWhile all eyes were on Alcaraz and Sinner as they went head-to-head in a tense and athletic final, the buzz in the stands was equally electric. Fans and photographers alike turned their cameras to the VIP section, capturing moments of the celebrities enjoying the match, chatting between sets, and soaking in the summer sunshine.\n\nWith its mix of elite sport and high-profile guests, the Roland Garros men’s final once again proved that tennis can bring together the worlds of film, fashion, and speed — all in one unforgettable Paris afternoon.\n\nStay updated with the latest Trending, India , World and United States news. Follow all the latest updates on Israel Iran Conflict here on Livemint.\n\nBusiness NewsNewsUs NewsRoland Garros 2025: From Pharrell Williams to Natalie Portman, stars step out at French Open Men’s Final",
    "facts": [
        {
            "verdict": "True",
            "explanation": "The provided article states that Carlos Alcaraz and Jannik Sinner played in the 2025 French Open men's final.",
            "original_claim": "**Carlos Alcaraz and Jannik Sinner played in the 2025 French Open men's final.**",
            "source_link": "https://www.cbssports.com/tennis/news/2025-french-open-what-carlos-alcaraz-jannik-sinner-said-after-all-time-mens-singles-final-at-roland-garros/"
        },
        {
            "verdict": "Unverifiable",
            "explanation": "The article lists celebrities who attended the 2023 French Open final, not the 2025 final. ",
            "original_claim": "**Pharrell Williams attended the 2025 French Open men's final.**",
            "source_link": "https://www.essentiallysports.com/atp-tennis-news-pharell-williams-to-dustin-hoffman-celebrities-who-attended-the-french-open-final-between-alcaraz-and-sinner/"
        },
        {
            "verdict": "Unverifiable",
            "explanation": "The provided evidence focuses on George Russell's participation in the Canadian Grand Prix, a Formula 1 race. It does not offer any information about his presence at the 2025 French Open.",
            "original_claim": "**George Russell, a British Formula 1 driver, was present at the 2025 French Open men's final.**",
            "source_link": "https://www.espn.com/f1/story/_/id/45519593/red-bull-protest-russell-canadian-gp-victory"
        }
    ],
    "sentiment": "Positive",
    "perspective": {
        "reasoning": "The article presents a positive sentiment towards the 2025 French Open men's final, highlighting the presence of celebrities in the crowd. However, a counter-perspective could argue that this focus on celebrity attendance detracts from the true significance of the event, which is the sporting competition itself. Step 1: Identify the main theme of the article, which is the intersection of sports and celebrity culture at the French Open. Step 2: Consider the potential drawbacks of this intersection, such as the distraction from the athletes' achievements and the sport's integrity. Step 3: Evaluate the verified facts provided, noting that some claims of celebrity attendance are unverifiable. Step 4: Reflect on the potential consequences of prioritizing celebrity presence over the sport itself, including the possible undermining of the event's integrity. Step 5: Formulate a counter-perspective that presents a more nuanced view of the event, acknowledging both the excitement of celebrity attendance and the potential risks to the sport's appreciation.",
        "perspective": "The 2025 French Open men's final, while a significant sporting event, may not be as glamorous or celebrity-studded as perceived. The presence of famous faces could be seen as a distraction from the true essence of the competition, which is the athletes' skill and dedication. Furthermore, the emphasis on celebrity attendees might overshadow the achievements of the players and the sport as a whole. The verified facts provided do not fully support the claims of celebrity attendance, which could indicate a focus on sensationalism over factual reporting. Ultimately, the intersection of sports and celebrity culture at events like the French Open can be seen as a double-edged sword, potentially undermining the integrity and appreciation of the sport itself."
    },
    
    "status": "success"
}

## Summary by CodeRabbit

* **New Features**
  * Introduced a multi-stage text analysis workflow that includes sentiment analysis, fact checking, perspective generation, judgment, and error handling.
  * Added the ability to generate well-reasoned counter-perspectives for provided articles.
  * Implemented scoring and retry logic for generated perspectives, with results stored and prepared for frontend use.

* **Chores**
  * Added new backend dependencies to support language processing, search, and environment management.

* **Style**
  * Made minor formatting improvements for code readability.

coderabbitai bot (Contributor) commented Jun 23, 2025

Walkthrough

This update introduces a modular, state-driven workflow for text analysis using a graph-based approach. New modules define nodes for generating perspectives, judging quality, storing results, and handling errors. A prompt template for counter-perspective generation and additional dependencies for language processing and graph orchestration are included.

Changes

File(s) Change Summary
app/modules/langgraph_builder.py New module defining a typed state and function to build a stateful processing graph with conditional workflow logic.
app/modules/langgraph_nodes/generate_perspective.py, app/modules/langgraph_nodes/judge.py, app/modules/langgraph_nodes/store_and_send.py New node modules: generate perspectives, judge quality, store/send results, each with error handling and state updates.
app/utils/prompt_templates.py New prompt template for generating counter-perspectives using structured output.
app/modules/pipeline.py Added a blank line after an import; no functional change.
app/utils/fact_check_utils.py Minor formatting: added blank lines for readability, no logic changes.
pyproject.toml Added dependencies: dotenv, duckduckgo-search, groq, langchain, langchain-community, langchain-groq, langgraph, nltk.
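
For orientation, the typed state mentioned above is a TypedDict roughly of this shape (a sketch inferred from the sample output state in the PR description; the actual MyState in langgraph_builder.py may declare different fields):

# Hypothetical shape of the shared pipeline state; inferred from the sample
# output above, not copied from langgraph_builder.py.
from typing import TypedDict


class MyState(TypedDict, total=False):
    cleaned_text: str   # cleaned article content
    sentiment: str      # "Positive" / "Neutral" / "Negative"
    facts: list         # verified claims with verdicts, explanations, sources
    perspective: dict   # {"reasoning": ..., "perspective": ...}
    score: int          # judge score (0-100)
    retries: int        # regeneration attempts so far
    status: str         # "success" or "error"
    error_from: str     # node that reported an error, if any
    message: str        # error details, if any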

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Pipeline
    participant SentimentNode
    participant FactCheckNode
    participant PerspectiveNode
    participant JudgeNode
    participant StoreSendNode
    participant ErrorHandler

    User->>Pipeline: Submit text
    Pipeline->>SentimentNode: Analyze sentiment
    SentimentNode-->>Pipeline: Update state
    Pipeline->>FactCheckNode: Fact check
    FactCheckNode-->>Pipeline: Update state
    Pipeline->>PerspectiveNode: Generate perspective
    PerspectiveNode-->>Pipeline: Update state
    Pipeline->>JudgeNode: Judge perspective
    JudgeNode-->>Pipeline: Update state (score, status)
    alt score < 70 and retries < 3
        Pipeline->>PerspectiveNode: Retry perspective generation
    else score >= 70 or retries >= 3
        Pipeline->>StoreSendNode: Store and send results
        StoreSendNode-->>Pipeline: Success or error
    end
    alt Any error status
        Pipeline->>ErrorHandler: Handle error
    end
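The alt block above boils down to a routing decision like the following sketch (the PR expresses this as an inline lambda on the judge_perspective edge; a standalone function is shown here only for clarity):

def route_after_judge(state: dict) -> str:
    # Mirrors the diagram: retry while the score is low and retries remain,
    # otherwise store and send the result.
    if state.get("status") == "error":
        return "error_handler"
    if state.get("score", 0) >= 70 or state.get("retries", 0) >= 3:
        return "store_and_send"
    return "generate_perspective"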

Suggested reviewers

  • Ms-Error

Poem

In the warren of code, a graph now blooms,
Nodes for thoughts, for facts, for rooms—
Of judgment, storage, error’s tune—
With prompts that spark perspective’s light,
And dependencies tucked in snug and tight.
Hopping forward, the pipeline’s new delight!
🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5bc20ba and 5c17520.

📒 Files selected for processing (2)
  • new-backend/app/modules/pipeline.py (1 hunks)
  • new-backend/app/utils/fact_check_utils.py (2 hunks)
✅ Files skipped from review due to trivial changes (2)
  • new-backend/app/modules/pipeline.py
  • new-backend/app/utils/fact_check_utils.py

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 17

🔭 Outside diff range comments (1)
new-backend/app/modules/facts_check/llm_processing.py (1)

60-138: Improve string formatting and logging practices.

The function logic is sound with good fallback handling for JSON parsing. However, there are several areas for improvement:

  1. String concatenation: Lines 68-70 can be simplified using f-strings
  2. Regex robustness: The markdown stripping regex might miss edge cases
  3. Replace print statements with proper logging
-            evidence = (f"{result.get('title', '')}"
-                        f"\n{result.get('snippet', '')}"
-                        f"\nLink: {source}")
+            evidence = f"{result.get('title', '')}\n{result.get('snippet', '')}\nLink: {source}"

-            content = re.sub(r"^```json|```$", "", content).strip()
+            content = re.sub(r"^```(?:json)?\s*|```\s*$", "", content, flags=re.MULTILINE).strip()

-            print(content)
-            print(f"❌ LLM JSON parse error: {parse_err}")
-        print(f"🔥 Error in fact_verification: {e}")
+            # Use proper logging instead of print statements
+            import logging
+            logging.info(f"LLM response: {content}")
+            logging.error(f"LLM JSON parse error: {parse_err}")
+        logging.error(f"Error in fact_verification: {e}")
🧹 Nitpick comments (18)
new-backend/app/modules/scraper/cleaner.py (1)

32-71: Consider making boilerplate patterns configurable.

The expanded boilerplate patterns list is comprehensive and well-thought-out. However, the hardcoded list might be too aggressive for some content types or miss patterns specific to certain websites.

Consider making the boilerplate patterns configurable through a configuration file or environment variables to allow customization without code changes:

# Load patterns from config
def load_boilerplate_patterns():
    default_patterns = [
        r"read more at.*",
        r"subscribe to.*",
        # ... existing patterns
    ]
    # Could load additional patterns from config file
    return default_patterns

boilerplate_phrases = load_boilerplate_patterns()
new-backend/app/modules/langgraph_nodes/store_and_send.py (2)

12-12: Fix typo in error message.

There's a spelling error in the error message.

-        print(f"some error occured in store_and_send:{e}")
+        print(f"Some error occurred in store_and_send: {e}")

11-17: Consider more specific exception handling.

The generic exception handling might hide important errors and make debugging difficult. Consider logging the full exception details and potentially re-raising critical errors.

    except Exception as e:
-        print(f"some error occured in store_and_send:{e}")
+        logging.error(f"Error in store_and_send: {e}", exc_info=True)
        return {
            "status": "error",
            "error_from": "store_and_send",
            "message": f"{e}",
        }
new-backend/app/utils/prompt_templates.py (1)

3-32: Enhance prompt robustness and specificity.

The prompt template is well-structured but could benefit from more specific instructions and better handling of edge cases.

Consider these improvements:

  1. Add input validation instructions:
Generate a logical and respectful *opposite perspective* to the article.
+If the article lacks clear arguments or is purely factual, focus on alternative interpretations or potential concerns.
+If no facts are provided, base your reasoning solely on the article content.
Use *step-by-step reasoning* and return your output in this JSON format:
  2. Add length guidelines:
  "counter_perspective": "<your opposite point of view>",
+  // Keep counter_perspective between 100-300 words
  "reasoning_steps": [
+    // Provide 3-5 reasoning steps, each 1-2 sentences
  3. Add fallback instructions:
}}
+
+If you cannot generate a meaningful counter-perspective, return:
+{{
+  "counter_perspective": "Unable to generate opposing view - article may be purely factual or lack clear arguments",
+  "reasoning_steps": ["Explanation of why counter-perspective cannot be generated"]
+}}
new-backend/app/modules/langgraph_nodes/sentiment.py (1)

33-36: Make model configuration parameters configurable.

Hard-coded model parameters reduce flexibility and make testing difficult.

+# Add constants at module level
+SENTIMENT_MODEL = os.getenv("SENTIMENT_MODEL", "gemma2-9b-it")
+SENTIMENT_TEMPERATURE = float(os.getenv("SENTIMENT_TEMPERATURE", "0.2"))
+SENTIMENT_MAX_TOKENS = int(os.getenv("SENTIMENT_MAX_TOKENS", "10"))

            model="gemma2-9b-it",
            temperature=0.2,
            max_tokens=10,
+            model=SENTIMENT_MODEL,
+            temperature=SENTIMENT_TEMPERATURE,
+            max_tokens=SENTIMENT_MAX_TOKENS,
new-backend/app/modules/pipeline.py (1)

41-64: Remove commented code or convert to proper documentation.

Large blocks of commented code should either be removed or converted to proper documentation if they serve as examples.

If this code represents the intended workflow structure, consider moving it to documentation or a separate example file rather than leaving it commented in the production code.

new-backend/pyproject.toml (1)

9-19: Consider using more restrictive version constraints for production dependencies.

The newly added dependencies use minimum version constraints (>=) which could lead to compatibility issues if breaking changes are introduced in future releases. For production applications, consider using compatible release operators (~=) or exact version pinning for critical dependencies.

-    "dotenv>=0.9.9",
-    "duckduckgo-search>=8.0.4",
-    "groq>=0.28.0",
-    "langchain>=0.3.25",
-    "langchain-community>=0.3.25",
-    "langchain-groq>=0.3.2",
-    "langgraph>=0.4.8",
-    "nltk>=3.9.1",
+    "dotenv~=0.9.9",
+    "duckduckgo-search~=8.0.4",
+    "groq~=0.28.0",
+    "langchain~=0.3.25",
+    "langchain-community~=0.3.25",
+    "langchain-groq~=0.3.2",
+    "langgraph~=0.4.8",
+    "nltk~=3.9.1",
new-backend/app/modules/langgraph_nodes/judge.py (1)

11-11: Fix typo in error message.

There's a typo in the function name within the error message.

-        print(f"some error occured in judge_perspetive:{e}")
+        print(f"some error occurred in judge_perspective:{e}")
new-backend/app/utils/fact_check_utils.py (2)

28-28: Make the rate limiting delay configurable.

The hardcoded 4-second delay between searches may be inefficient and should be configurable based on the search provider's rate limits.

-        time.sleep(4)  # Add 4 second delay to prevent rate-limit
+        delay = state.get("search_delay", 4)  # Default 4 seconds, configurable
+        time.sleep(delay)  # Add delay to prevent rate-limit

8-31: Consider breaking down the function for better modularity.

This function handles multiple responsibilities (claim extraction, parsing, searching, verification). Consider breaking it into smaller, focused functions for better maintainability.

The function could be refactored into:

  • extract_and_parse_claims(state)
  • search_claims_with_rate_limiting(claims, delay=4)
  • verify_search_results(search_results)
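
As a sketch of what the middle helper might look like (the search function is injected as a callable so the snippet stands alone; the delay value and return shape are assumptions):

import time
from typing import Callable


def search_claims_with_rate_limiting(claims, search_fn: Callable, delay: float = 4.0):
    # Search each claim via search_fn, pausing between requests to respect
    # rate limits. Returns (results, failed) so callers can report failures
    # instead of silently dropping them.
    results, failed = [], []
    for claim in claims:
        try:
            hit = search_fn(claim)[0]
            hit["claim"] = claim
            results.append(hit)
        except Exception as exc:
            failed.append({"claim": claim, "error": str(exc)})
        time.sleep(delay)
    return results, failed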
new-backend/app/modules/langgraph_nodes/fact_check.py (2)

1-1: Remove commented-out import.

Clean up the commented import line.

-# from app.modules.pipeline import run_fact_check_pipeline\

14-14: Fix typo in error message.

There's a typo in the error message.

-        print(f"some error occured in fact_checking:{e}")
+        print(f"some error occurred in fact_checking:{e}")
new-backend/app/modules/langgraph_nodes/generate_perspective.py (4)

35-38: Simplify conditional logic as suggested by static analysis.

The elif after raise is unnecessary and can be simplified.

         if not text:
             raise ValueError("Missing or empty 'cleaned_text' in state")
-        elif not facts:
+        if not facts:
             raise ValueError("Missing or empty 'facts' in state")

14-19: Consider making model configuration more flexible.

The hardcoded model name and temperature could be made configurable through environment variables or parameters.

-my_llm = "llama-3.3-70b-versatile"
+import os
+my_llm = os.getenv("PERSPECTIVE_MODEL", "llama-3.3-70b-versatile")
 
 llm = ChatGroq(
     model=my_llm,
-    temperature=0.7
+    temperature=float(os.getenv("PERSPECTIVE_TEMPERATURE", "0.7"))
 )

50-50: Fix typo in error message.

There's a typo in the error message.

-        print(f"some error occured in generate_perspective:{e}")
+        print(f"some error occurred in generate_perspective:{e}")

44-48: Add validation for LLM API response.

Consider adding validation to ensure the LLM returns a properly structured response before using it.

         result = chain.invoke({
             "cleaned_article": text,
             "facts": facts_str,
             "sentiment": state.get("sentiment", "neutral")
         })
+        
+        # Validate LLM response structure
+        if not hasattr(result, 'perspective') or not result.perspective:
+            raise ValueError("LLM failed to generate a valid perspective")
new-backend/app/modules/facts_check/llm_processing.py (2)

12-58: LGTM! Well-structured claim extraction with good error handling.

The function properly validates input, uses appropriate LLM parameters, and has comprehensive error handling. The system prompt is clear and specific about the expected output format.

Consider consolidating the fragmented string literals in the system prompt for better readability:

                        "You are an assistant that extracts v"
                        "erifiable factual claims from articles. "
                        "Each claim must be short, fact-based, and"
                        " independently verifiable through internet search. "
                        "Only return a list of 3 clear bullet-point claims."

Could be:

+                        "You are an assistant that extracts verifiable factual claims from articles. "
+                        "Each claim must be short, fact-based, and independently verifiable through internet search. "
+                        "Only return a list of 3 clear bullet-point claims."

38-38: Consider making model parameters configurable.

Both functions hard-code the model name "gemma2-9b-it". Consider making this configurable through environment variables or a configuration file for better flexibility and maintainability.

+MODEL_NAME = os.getenv("GROQ_MODEL", "gemma2-9b-it")

-            model="gemma2-9b-it",
+            model=MODEL_NAME,

Also applies to: 100-100

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9232290 and 5bc20ba.

⛔ Files ignored due to path filters (1)
  • new-backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (17)
  • frontend/app/page.tsx (1 hunks)
  • new-backend/app/modules/facts_check/llm_processing.py (1 hunks)
  • new-backend/app/modules/facts_check/web_search.py (1 hunks)
  • new-backend/app/modules/langgraph_builder.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/error_handler.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/fact_check.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/generate_perspective.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/judge.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/sentiment.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/store_and_send.py (1 hunks)
  • new-backend/app/modules/pipeline.py (2 hunks)
  • new-backend/app/modules/scraper/cleaner.py (2 hunks)
  • new-backend/app/modules/scraper/extractor.py (1 hunks)
  • new-backend/app/routes/routes.py (2 hunks)
  • new-backend/app/utils/fact_check_utils.py (1 hunks)
  • new-backend/app/utils/prompt_templates.py (1 hunks)
  • new-backend/pyproject.toml (1 hunks)
🧰 Additional context used
🪛 Pylint (3.3.7)
new-backend/app/modules/langgraph_nodes/generate_perspective.py

[refactor] 9-9: Too few public methods (0/2)

(R0903)


[refactor] 35-38: Unnecessary "elif" after "raise", remove the leading "el" from "elif"

(R1720)

🔇 Additional comments (7)
new-backend/app/modules/scraper/extractor.py (1)

40-40: Good change to improve extraction quality.

Adding no_fallback=True will improve the quality of extracted content by disabling less reliable fallback mechanisms. This aligns well with the expanded cleaning logic in the pipeline.

Monitor the impact of this change on extraction success rates, as it might reduce the number of articles that can be successfully processed:

#!/bin/bash
# Description: Search for any error handling or fallback logic related to trafilatura extraction

# Look for extraction error handling in the codebase
rg -A 5 -B 5 "extract_with_trafilatura|trafilatura.*extract" --type py
frontend/app/page.tsx (1)

118-120: LGTM! Good UX enhancement.

The addition clearly communicates the free access benefit to users, using consistent styling and animation patterns that match the existing design system.

new-backend/app/modules/langgraph_nodes/fact_check.py (1)

5-24: LGTM: Well-structured error handling and state management.

The function properly validates input, handles exceptions gracefully, and maintains consistent state structure throughout the pipeline.

new-backend/app/modules/langgraph_nodes/generate_perspective.py (1)

9-11: Address static analysis hint: PerspectiveOutput class design.

The class has only fields and no methods, which triggers a pylint warning. This is acceptable for Pydantic models used for data validation and serialization.

new-backend/app/modules/facts_check/llm_processing.py (1)

1-10: LGTM! Clean imports and secure API key handling.

The imports are appropriate and the API key is properly loaded from environment variables, following security best practices.

new-backend/app/modules/langgraph_builder.py (2)

1-22: LGTM! Clean imports and well-defined state structure.

The imports are appropriate and the MyState TypedDict provides clear type definitions for all state fields used in the pipeline.


24-55: LGTM! Systematic node setup with clear pipeline flow.

The graph construction properly sets up all processing nodes with appropriate entry and finish points. The pipeline flow from sentiment analysis through fact checking, perspective generation, judgment, and storage is logical.

Also applies to: 104-106

Comment on lines 2 to 10
import nltk

try:
    nltk.data.find('corpora/stopwords')
    nltk.data.find('corpora/punkt_tab')

except LookupError:
    nltk.download('stopwords')
    nltk.download('punkt_tab')

🛠️ Refactor suggestion

Add error handling for NLTK downloads.

The NLTK initialization lacks proper error handling for download failures. Network issues or permission problems could cause the downloads to fail silently or crash the application.

try:
    nltk.data.find('corpora/stopwords')
    nltk.data.find('corpora/punkt_tab')
except LookupError:
-    nltk.download('stopwords')
-    nltk.download('punkt_tab')
+    try:
+        nltk.download('stopwords')
+        nltk.download('punkt_tab')
+    except Exception as e:
+        print(f"Warning: Failed to download NLTK data: {e}")
+        # Consider whether to raise or continue with degraded functionality
📝 Committable suggestion


Suggested change
import nltk
try:
    nltk.data.find('corpora/stopwords')
    nltk.data.find('corpora/punkt_tab')
except LookupError:
    nltk.download('stopwords')
    nltk.download('punkt_tab')
import nltk
try:
    nltk.data.find('corpora/stopwords')
    nltk.data.find('corpora/punkt_tab')
except LookupError:
    try:
        nltk.download('stopwords')
        nltk.download('punkt_tab')
    except Exception as e:
        print(f"Warning: Failed to download NLTK data: {e}")
        # Consider whether to raise or continue with degraded functionality
🤖 Prompt for AI Agents
In new-backend/app/modules/scraper/cleaner.py around lines 2 to 10, the current
NLTK download logic does not handle errors during the download process, which
can cause silent failures or crashes. Wrap the nltk.download calls in try-except
blocks to catch exceptions, log or handle download failures gracefully, and
ensure the application can respond appropriately to network or permission
issues.

def store_and_send(state):
    # to store data in vector db
    try:
        print(state)

🛠️ Refactor suggestion

Replace print with proper logging.

Using print statements for debugging in production code is not recommended. Use the logging module instead.

+import logging

-        print(state)
+        logging.info(f"Processing state: {state}")
📝 Committable suggestion


Suggested change
print(state)
# At the top of new-backend/app/modules/langgraph_nodes/store_and_send.py
import logging
# … other code …
# Around line 7, replace:
- print(state)
+ logging.info(f"Processing state: {state}")
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/store_and_send.py at line 7, replace
the print statement with a logging call. Import the logging module if not
already imported, configure a logger for the module, and use logger.debug or
logger.info to output the state instead of print, ensuring consistent and
configurable logging.

Comment on lines 4 to 22
def store_and_send(state):
    # to store data in vector db
    try:
        print(state)
        # save_to_vector_db({
        #     **state
        # })
    except Exception as e:
        print(f"some error occured in store_and_send:{e}")
        return {
            "status": "error",
            "error_from": "store_and_send",
            "message": f"{e}",
        }
    # sending to frontend
    return {
        **state,
        "status": "success"
    }

🛠️ Refactor suggestion

Improve function documentation and type hints.

The function lacks type hints and documentation, making it unclear what the expected input and output formats are.

+from typing import Dict, Any

-def store_and_send(state):
+def store_and_send(state: Dict[str, Any]) -> Dict[str, Any]:
+    """
+    Store the processing state and prepare response for frontend.
+    
+    Args:
+        state: Dictionary containing the processing state
+        
+    Returns:
+        Dictionary with the original state plus status indicator
+    """
📝 Committable suggestion


Suggested change
def store_and_send(state):
    # to store data in vector db
    try:
        print(state)
        # save_to_vector_db({
        #     **state
        # })
    except Exception as e:
        print(f"some error occured in store_and_send:{e}")
        return {
            "status": "error",
            "error_from": "store_and_send",
            "message": f"{e}",
        }
    # sending to frontend
    return {
        **state,
        "status": "success"
    }
from typing import Dict, Any


def store_and_send(state: Dict[str, Any]) -> Dict[str, Any]:
    """
    Store the processing state and prepare response for frontend.

    Args:
        state: Dictionary containing the processing state

    Returns:
        Dictionary with the original state plus status indicator
    """
    # to store data in vector db
    try:
        print(state)
        # save_to_vector_db({
        #     **state
        # })
    except Exception as e:
        print(f"some error occurred in store_and_send: {e}")
        return {
            "status": "error",
            "error_from": "store_and_send",
            "message": f"{e}",
        }
    # sending to frontend
    return {
        **state,
        "status": "success"
    }
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/store_and_send.py around lines 4 to
22, the function store_and_send lacks type hints and documentation. Add a
docstring explaining the purpose of the function, the expected type and
structure of the input parameter 'state', and the format of the returned
dictionary. Also, include appropriate type hints for the input parameter and the
return type to improve code clarity and maintainability.

            max_tokens=10,
        )

        sentiment = chat_completion.choices[0].message.content.strip()

🛠️ Refactor suggestion

Validate sentiment response format.

The API response should be validated to ensure it matches the expected format before returning.

        sentiment = chat_completion.choices[0].message.content.strip()
+        
+        # Validate sentiment response
+        valid_sentiments = ["Positive", "Negative", "Neutral"]
+        if sentiment not in valid_sentiments:
+            logging.warning(f"Unexpected sentiment response: {sentiment}")
+            # Could either raise an error or default to "Neutral"
+            sentiment = "Neutral"
📝 Committable suggestion


Suggested change
        sentiment = chat_completion.choices[0].message.content.strip()
        sentiment = chat_completion.choices[0].message.content.strip()

        # Validate sentiment response
        valid_sentiments = ["Positive", "Negative", "Neutral"]
        if sentiment not in valid_sentiments:
            logging.warning(f"Unexpected sentiment response: {sentiment}")
            # Could either raise an error or default to "Neutral"
            sentiment = "Neutral"
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/sentiment.py at line 38, the
sentiment response from the API is used directly without validation. Add code to
check that chat_completion.choices is a non-empty list, that the first choice
has a message attribute with a content string, and that the content matches the
expected sentiment format before assigning it to the sentiment variable. If the
validation fails, handle the error appropriately, such as raising an exception
or returning a default value.

        }

    except Exception as e:
        print(f"Error in sentiment_analysis: {e}")

🛠️ Refactor suggestion

Replace print with proper logging.

Use logging instead of print for error messages in production code.

-        print(f"Error in sentiment_analysis: {e}")
+        logging.error(f"Error in sentiment_analysis: {e}", exc_info=True)
📝 Committable suggestion


Suggested change
print(f"Error in sentiment_analysis: {e}")
logging.error(f"Error in sentiment_analysis: {e}", exc_info=True)
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/sentiment.py at line 47, replace the
print statement used for error output with a proper logging call. Import the
logging module if not already imported, configure a logger for the module, and
use logger.error to log the error message instead of print. This ensures error
messages are handled appropriately in production environments.

        if not perspective:
            raise ValueError("Missing or empty 'perspective' in state")

        score = 85 if "reasoning" in perspective else 40

🛠️ Refactor suggestion

Replace dummy scoring with proper evaluation logic.

The current scoring mechanism based on keyword presence ("reasoning") is overly simplistic and fragile. Consider implementing a more robust evaluation system that assesses the quality, coherence, and logical structure of the perspective.

-        score = 85 if "reasoning" in perspective else 40
+        # TODO: Implement proper perspective evaluation logic
+        # Consider factors like: logical coherence, factual accuracy, 
+        # argument structure, and counter-perspective quality
+        score = evaluate_perspective_quality(perspective, state.get("facts", []))

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/judge.py at line 9, replace the
simplistic keyword-based scoring with a more robust evaluation method that
analyzes the quality, coherence, and logical structure of the perspective.
Implement logic that goes beyond keyword checks, such as using NLP techniques or
scoring criteria that assess argument strength and clarity, to generate a
meaningful score.
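
For instance, the evaluate_perspective_quality helper referenced in the diff above could start as a simple rubric like this (thresholds and weights are arbitrary illustrations; a production judge might instead use an LLM-as-judge prompt):

def evaluate_perspective_quality(perspective: dict, facts: list) -> int:
    # Toy rubric scorer (0-100); weights here are illustrative only.
    reasoning = perspective.get("reasoning", "")
    text = perspective.get("perspective", "")
    score = 0
    if reasoning.count("Step") >= 3:    # explicit multi-step reasoning
        score += 40
    if len(text.split()) >= 80:         # substantive counter-perspective
        score += 30
    shaky = [f for f in facts if f.get("verdict") != "True"]
    if not shaky or "unverif" in (reasoning + " " + text).lower():
        score += 30                     # engages with unverified claims, if any
    return score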

Comment on lines 1 to 7
def judge_perspective(state):
    # Dummy scoring
    try:
        perspective = state.get("perspective")

        if not perspective:
            raise ValueError("Missing or empty 'perspective' in state")

🛠️ Refactor suggestion

Add input validation for state parameter.

The function should validate that state is a dictionary before attempting to access its properties.

 def judge_perspective(state):
     # Dummy scoring
     try:
+        if not isinstance(state, dict):
+            raise TypeError("State must be a dictionary")
+            
         perspective = state.get("perspective")
📝 Committable suggestion


Suggested change
def judge_perspective(state):
    # Dummy scoring
    try:
        perspective = state.get("perspective")

        if not perspective:
            raise ValueError("Missing or empty 'perspective' in state")
def judge_perspective(state):
    # Dummy scoring
    try:
        if not isinstance(state, dict):
            raise TypeError("State must be a dictionary")

        perspective = state.get("perspective")

        if not perspective:
            raise ValueError("Missing or empty 'perspective' in state")
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/judge.py around lines 1 to 7, the
function judge_perspective assumes the input parameter state is a dictionary
without validation. Add a check at the start of the function to verify that
state is an instance of dict, and if not, raise a TypeError with a clear message
indicating that state must be a dictionary. This will prevent runtime errors
when accessing state properties.

Comment on lines 26 to 27
            except Exception as e:
                print(f"❌ Search failed for: {claim} -> {e}")

🛠️ Refactor suggestion

Improve error handling for search failures.

Currently, search failures are logged but silently ignored, which could lead to incomplete fact-checking results. Consider collecting and reporting these failures.

+        failed_claims = []
         for claim in claims:
             print(f"\n🔍Searching for claim...: {claim}")
             try:
                 val = search_duckduckgo(claim)
                 val[0]["claim"] = claim
                 search_results.append(val[0])
             except Exception as e:
                 print(f"❌ Search failed for: {claim} -> {e}")
+                failed_claims.append({"claim": claim, "error": str(e)})
             time.sleep(4)  # Add 4 second delay to prevent rate-limit
+        
+        if failed_claims:
+            print(f"⚠️  Failed to search {len(failed_claims)} claims out of {len(claims)}")
📝 Committable suggestion


Suggested change
            except Exception as e:
                print(f"❌ Search failed for: {claim} -> {e}")
        # Collect any claims that fail during search
        failed_claims = []
        for claim in claims:
            print(f"\n🔍Searching for claim...: {claim}")
            try:
                val = search_duckduckgo(claim)
                val[0]["claim"] = claim
                search_results.append(val[0])
            except Exception as e:
                print(f"❌ Search failed for: {claim} -> {e}")
                failed_claims.append({"claim": claim, "error": str(e)})
            time.sleep(4)  # Add 4 second delay to prevent rate-limit

        # Report a summary of any failures
        if failed_claims:
            print(f"⚠️ Failed to search {len(failed_claims)} claims out of {len(claims)}")
🤖 Prompt for AI Agents
In new-backend/app/utils/fact_check_utils.py around lines 26 to 27, the current
error handling for search failures only prints the error and silently ignores
it, which may cause incomplete fact-checking results. Modify the code to collect
these exceptions in a list or log structure and ensure they are reported or
returned so that the calling function can handle or display the failures
appropriately.

Comment on lines +40 to +42
        facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
                               "Verdict: {f['verdict']}\nExplanation: "
                               "{f['explanation']}" for f in state["facts"]])

⚠️ Potential issue

Fix critical string formatting bug in facts_str.

Only the first string fragment carries the f prefix, so the {f['verdict']} and {f['explanation']} placeholders in the following fragments are never interpolated and end up in the prompt as literal text.

-        facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
-                               "Verdict: {f['verdict']}\nExplanation: "
-                               "{f['explanation']}" for f in state["facts"]])
+        facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
+                               f"Verdict: {f['verdict']}\nExplanation: "
+                               f"{f['explanation']}" for f in state["facts"]])
📝 Committable suggestion


Suggested change
        facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
                               "Verdict: {f['verdict']}\nExplanation: "
                               "{f['explanation']}" for f in state["facts"]])
        facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
                               f"Verdict: {f['verdict']}\nExplanation: "
                               f"{f['explanation']}" for f in state["facts"]])
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/generate_perspective.py around lines
40 to 42, the string construction for facts_str incorrectly uses curly braces
without an f-string prefix, causing the dictionary keys not to be interpolated and
leaving literal placeholders in the prompt. Fix this by converting the multiline string inside the
join to a proper f-string so that the dictionary keys are correctly formatted
into the string.

Comment on lines 56 to 102
    graph.add_conditional_edges(
        "sentiment_analysis",
        lambda x: (
            "error_handler" if x.get("status") == "error" else "fact_checking"
        )
    )

    graph.add_conditional_edges(
        "fact_checking",
        lambda x: (
            "error_handler"
            if x.get("status") == "error"
            else "generate_perspective"
        )
    )

    graph.add_conditional_edges(
        "generate_perspective",
        lambda x: (
            "error_handler"
            if x.get("status") == "error"
            else "judge_perspective"
        )
    )

    graph.add_conditional_edges(
        "judge_perspective",
        lambda state: (
            "error_handler"
            if state.get("status") == "error"
            else (
                "store_and_send"
                if state.get("retries", 0) >= 3
                else "generate_perspective"
            )
            if state.get("score", 0) < 70
            else "store_and_send"
        )
    )
    graph.add_conditional_edges(
        "store_and_send",
        lambda x: (
            "error_handler"
            if x.get("status") == "error"
            else "__end__"
        )
    )

🛠️ Refactor suggestion

Refactor complex conditional logic for better readability.

The conditional edges handle error cases well, but the logic in lines 82-94 for judge_perspective is complex and hard to follow. Consider refactoring for clarity:

+def judge_perspective_router(state):
+    """Route based on judgment results with retry logic."""
+    if state.get("status") == "error":
+        return "error_handler"
+    
+    score = state.get("score", 0)
+    retries = state.get("retries", 0)
+    
+    if score >= 70:
+        return "store_and_send"
+    elif retries >= 3:
+        return "store_and_send"
+    else:
+        return "generate_perspective"

    graph.add_conditional_edges(
        "judge_perspective",
-        lambda state: (
-            "error_handler"
-            if state.get("status") == "error"
-            else (
-                "store_and_send"
-                if state.get("retries", 0) >= 3
-                else "generate_perspective"
-            )
-            if state.get("score", 0) < 70
-            else "store_and_send"
-            )
+        judge_perspective_router
    )

This makes the retry and scoring logic much clearer and easier to maintain.

📝 Committable suggestion


Suggested change
    graph.add_conditional_edges(
        "sentiment_analysis",
        lambda x: (
            "error_handler" if x.get("status") == "error" else "fact_checking"
        )
    )

    graph.add_conditional_edges(
        "fact_checking",
        lambda x: (
            "error_handler"
            if x.get("status") == "error"
            else "generate_perspective"
        )
    )

    graph.add_conditional_edges(
        "generate_perspective",
        lambda x: (
            "error_handler"
            if x.get("status") == "error"
            else "judge_perspective"
        )
    )

    graph.add_conditional_edges(
        "judge_perspective",
        lambda state: (
            "error_handler"
            if state.get("status") == "error"
            else (
                "store_and_send"
                if state.get("retries", 0) >= 3
                else "generate_perspective"
            )
            if state.get("score", 0) < 70
            else "store_and_send"
        )
    )
    graph.add_conditional_edges(
        "store_and_send",
        lambda x: (
            "error_handler"
            if x.get("status") == "error"
            else "__end__"
        )
    )
def judge_perspective_router(state):
    """Route based on judgment results with retry logic."""
    if state.get("status") == "error":
        return "error_handler"

    score = state.get("score", 0)
    retries = state.get("retries", 0)

    if score >= 70:
        return "store_and_send"
    elif retries >= 3:
        return "store_and_send"
    else:
        return "generate_perspective"


    graph.add_conditional_edges(
        "sentiment_analysis",
        lambda x: (
            "error_handler" if x.get("status") == "error" else "fact_checking"
        )
    )

    graph.add_conditional_edges(
        "fact_checking",
        lambda x: (
            "error_handler"
            if x.get("status") == "error"
            else "generate_perspective"
        )
    )

    graph.add_conditional_edges(
        "generate_perspective",
        lambda x: (
            "error_handler"
            if x.get("status") == "error"
            else "judge_perspective"
        )
    )

    graph.add_conditional_edges(
        "judge_perspective",
        judge_perspective_router
    )

    graph.add_conditional_edges(
        "store_and_send",
        lambda x: (
            "error_handler"
            if x.get("status") == "error"
            else "__end__"
        )
    )
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_builder.py between lines 56 and 102, the
conditional logic for the "judge_perspective" edge is complex and nested, making
it hard to read. Refactor the lambda function to use clearer if-else statements
or intermediate variables to separately handle the error status, retry count,
and score checks. This will improve readability and maintainability by
simplifying the decision flow.

@ManavSarkar merged commit 0484f21 into main on Jul 18, 2025 (1 check passed)