
Conversation


@ParagGhatage ParagGhatage commented Jun 10, 2025

Tasks done

  • Base LangGraph setup.
  • Nodes finalized.
  • Dummy node files added.
  • LangGraph workflow finalized.
  • Added an error_handler node to handle errors across the entire graph.

Here's the structure of LangGraph:

```mermaid
flowchart TD
    Start([Start]) --> SentimentAnalysis

    SentimentAnalysis["🧠 Sentiment Analysis"]
    FactCheck["🔍 Fact Checking"]
    GeneratePerspective["🧾 Generate Perspective"]
    JudgePerspective["⚖️ Judge Perspective"]
    StoreAndSend["📦 Store and Send"]
    ErrorHandler["❌ Error Handler"]
    Frontend([Frontend])
    Frontend1([Frontend1])

    SentimentAnalysis -->|status=success| FactCheck
    SentimentAnalysis -->|status=error| ErrorHandler

    FactCheck -->|status=success| GeneratePerspective
    FactCheck -->|status=error| ErrorHandler

    GeneratePerspective -->|status=success| JudgePerspective
    GeneratePerspective -->|status=error| ErrorHandler

    JudgePerspective -->|status=error| ErrorHandler
    JudgePerspective -->|score<70| GeneratePerspective
    JudgePerspective -->|score>=70| StoreAndSend

    StoreAndSend -->|status=error| ErrorHandler
    StoreAndSend -->|status=success| Frontend

    ErrorHandler -->|error| Frontend1
```
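The branching after the judge node can be sketched as a plain routing function (a hypothetical helper that mirrors the flowchart's edges; the actual graph wires this via LangGraph conditional edges):

```python
def route_after_judge(state: dict) -> str:
    """Pick the next node after judge_perspective, mirroring the flowchart edges."""
    if state.get("status") == "error":
        return "error_handler"
    # Retry perspective generation while the score is below the threshold.
    if state.get("score", 0) < 70:
        return "generate_perspective"
    return "store_and_send"
```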



Summary by CodeRabbit

  • New Features

    • Introduced an automated workflow that processes articles through sentiment analysis, fact checking, perspective generation, judgment, and result storage.
    • Added sentiment analysis using advanced language models.
    • Implemented automated fact checking and opposing perspective generation for articles.
    • Included a scoring system to evaluate generated perspectives and conditional workflow branching.
    • Results are now stored in a vector database for future reference.
  • Improvements

    • Enhanced article text cleaning with expanded removal of common boilerplate and promotional phrases.
    • Improved content extraction by disabling fallback mechanisms for more consistent results.
  • Chores

    • Added new dependencies: langchain, langgraph, and transformers.
  • UI Updates

    • Added a note below the "Get Started" button on the landing page indicating no sign-in is required and the service is free.


coderabbitai bot commented Jun 10, 2025

Walkthrough

A new modular state-based processing pipeline is introduced, orchestrated via a LangGraph workflow. It sequentially applies sentiment analysis, fact-checking, perspective generation, judgment, and result storage to article text. Supporting modules implement each processing step with error handling, and the /process API endpoint is updated to execute the complete workflow.

Changes

File(s) Change Summary
app/modules/langgraph_builder.py Adds builder for a state-based LangGraph workflow with conditional and sequential nodes.
app/modules/langgraph_nodes/fact_check.py Adds stubbed web search and fact-checking functions with error handling.
app/modules/langgraph_nodes/generate_perspective.py Adds function for generating an opposing perspective using an LLM chain and prompt template with error handling.
app/modules/langgraph_nodes/judge.py Adds dummy scoring function for judging generated perspectives with error handling.
app/modules/langgraph_nodes/sentiment.py Adds sentiment analysis function using Hugging Face Transformers with error handling.
app/modules/langgraph_nodes/store_and_send.py Adds function to store results in a vector database and return success status with error handling.
app/modules/langgraph_nodes/error_handler.py Adds error handler function that logs and returns error status with details.
app/modules/pipeline.py Adds function to run the LangGraph workflow and integrates the builder with pre-compilation.
app/routes/routes.py Updates /process endpoint to run the new workflow instead of returning raw article text.
pyproject.toml Adds dependencies: langchain, langgraph, and transformers.
app/modules/scraper/cleaner.py Expands boilerplate phrases removed during text cleaning with many additional patterns.
app/modules/scraper/extractor.py Adds no_fallback=True argument to trafilatura extraction call to disable fallback extraction.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant API
    participant Pipeline
    participant LangGraph
    participant Sentiment
    participant FactCheck
    participant PerspectiveGen
    participant Judge
    participant StoreSend

    Client->>API: POST /process (with URL)
    API->>Pipeline: run_scraper_pipeline(url)
    Pipeline-->>API: article_text
    API->>Pipeline: run_langgraph_workflow({'text': article_text})
    Pipeline->>LangGraph: build_langgraph()
    Pipeline->>LangGraph: langgraph_workflow({'text': article_text})
    LangGraph->>Sentiment: run_sentiment(state)
    Sentiment-->>LangGraph: sentiment result
    LangGraph->>FactCheck: run_fact_check(state)
    FactCheck-->>LangGraph: facts
    LangGraph->>PerspectiveGen: generate_perspective(state)
    PerspectiveGen-->>LangGraph: perspective
    LangGraph->>Judge: judge_perspective(state)
    Judge-->>LangGraph: score
    alt score < 70 and retries < 3
        LangGraph->>PerspectiveGen: generate_perspective(state)
        PerspectiveGen-->>LangGraph: perspective
        LangGraph->>Judge: judge_perspective(state)
        Judge-->>LangGraph: score
        %% (loop continues until score >= 70 or retries >= 3)
    end
    LangGraph->>StoreSend: store_and_send(state)
    StoreSend-->>LangGraph: status
    LangGraph-->>Pipeline: final result
    Pipeline-->>API: result
    API-->>Client: result
```

Poem

In the warren of code, a new path appears,
Where sentiment and facts chase away fears.
Perspectives are judged, then safely stored,
All in a workflow, perfectly scored.
With hops and with hops, this rabbit delights—
The pipeline now sparkles with AI-powered insights!
🐇✨


coderabbitai bot added a commit that referenced this pull request Jun 10, 2025
Docstrings generation was requested by @ParagGhatage.

* #99 (comment)

The following files were modified:

* `new-backend/app/modules/langgraph_builder.py`
* `new-backend/app/modules/langgraph_nodes/fact_check.py`
* `new-backend/app/modules/langgraph_nodes/generate_perspective.py`
* `new-backend/app/modules/langgraph_nodes/judge.py`
* `new-backend/app/modules/langgraph_nodes/sentiment.py`
* `new-backend/app/modules/langgraph_nodes/store_and_send.py`
* `new-backend/app/modules/pipeline.py`
* `new-backend/app/routes/routes.py`

coderabbitai bot commented Jun 10, 2025

Note

Generated docstrings for this pull request at #100

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

♻️ Duplicate comments (1)
new-backend/app/modules/langgraph_nodes/sentiment.py (1)

6-9: Input key "text" must align with upstream pipeline output.

Given the current scraper output (which provides "cleaned_text"), this line will fail. Either change here or adapt the caller as suggested in routes.py.

🧹 Nitpick comments (3)
new-backend/app/modules/langgraph_nodes/sentiment.py (1)

1-4: Eager model download at import hurts cold-start performance.

Initialising the transformers pipeline at module import time forces the model download on every cold start. Prefer lazy, cached initialisation:

-from transformers import pipeline
-
-sentiment_pipeline = pipeline("sentiment-analysis")
+from functools import lru_cache
+from transformers import pipeline
+
+@lru_cache(maxsize=1)
+def _pipeline():
+    return pipeline("sentiment-analysis")

Call _pipeline() inside run_sentiment.
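As a minimal illustration of the lazy-initialisation pattern suggested here (using a stand-in loader instead of the real transformers pipeline, so no model download is involved):

```python
from functools import lru_cache

load_count = 0

@lru_cache(maxsize=1)
def _pipeline():
    # Stand-in for pipeline("sentiment-analysis"); the real call downloads a model.
    global load_count
    load_count += 1
    return lambda text: [{"label": "POSITIVE", "score": 0.99}]

# The expensive load runs once, on first use, not at import time.
first = _pipeline()("great article")
second = _pipeline()("another article")
```

With `lru_cache(maxsize=1)` the loader body executes only on the first call; every subsequent `_pipeline()` returns the cached object.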

new-backend/app/modules/langgraph_nodes/judge.py (1)

4-5: Hard-coded heuristic & threshold – make configurable or document clearly.

The magic numbers 85, 40, and the downstream threshold 70 are buried in code. Consider exposing them as constants or configuration to simplify future tuning and testing.
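A minimal sketch of that suggestion, keeping the PR's dummy heuristic but with named constants instead of magic numbers:

```python
# Named constants instead of magic numbers buried in node code.
REASONED_SCORE = 85   # score given when the perspective looks reasoned
DEFAULT_SCORE = 40    # fallback score
PASS_THRESHOLD = 70   # downstream threshold used by the graph's conditional edge

def judge_perspective(perspective: str) -> int:
    """Dummy heuristic from the PR, expressed via the constants above."""
    return REASONED_SCORE if "reasoned" in perspective else DEFAULT_SCORE
```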

new-backend/app/modules/langgraph_builder.py (1)

51-57: Use a constant for the threshold & avoid the reserved keyword "pass"

Hard-coding 70 and the "pass" label makes the graph brittle and harder to unit-test.

 THRESHOLD = 70
 ...
 graph.add_conditional_edges(
     "judge_perspective",
-    lambda state: "rerun" if state.get("score", 0) < 70 else "pass",
+    lambda state: "rerun" if state.get("score", 0) < THRESHOLD else "done",
     {
         "rerun": "generate_perspective",
-        "pass": "store_and_send"
+        "done": "store_and_send"
     }
 )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9232290 and 92ace01.

⛔ Files ignored due to path filters (1)
  • new-backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (9)
  • new-backend/app/modules/langgraph_builder.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/fact_check.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/generate_perspective.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/judge.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/sentiment.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/store_and_send.py (1 hunks)
  • new-backend/app/modules/pipeline.py (2 hunks)
  • new-backend/app/routes/routes.py (2 hunks)
  • new-backend/pyproject.toml (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
new-backend/app/modules/pipeline.py (1)
new-backend/app/modules/langgraph_builder.py (1)
  • build_langgraph (11-62)
new-backend/app/routes/routes.py (1)
new-backend/app/modules/pipeline.py (1)
  • run_langgraph_workflow (27-30)
new-backend/app/modules/langgraph_builder.py (5)
new-backend/app/modules/langgraph_nodes/generate_perspective.py (1)
  • generate_perspective (21-25)
new-backend/app/modules/langgraph_nodes/store_and_send.py (1)
  • store_and_send (4-12)
new-backend/app/modules/langgraph_nodes/sentiment.py (1)
  • run_sentiment (6-9)
new-backend/app/modules/langgraph_nodes/fact_check.py (1)
  • run_fact_check (7-12)
new-backend/app/modules/langgraph_nodes/judge.py (1)
  • judge_perspective (1-5)
🪛 Pylint (3.3.7)
new-backend/app/modules/langgraph_nodes/fact_check.py

[error] 10-10: Assigning result of a function call, where the function returns None

(E1128)


[error] 10-10: Too many positional arguments for function call

(E1121)


[error] 11-11: Non-iterable value results is used in an iterating context

(E1133)

🔇 Additional comments (2)
new-backend/pyproject.toml (1)

10-11: Verify that the specified minimum versions actually exist on PyPI.

langchain>=0.3.25, langgraph>=0.4.8, and especially transformers>=4.52.4 refer to versions that may not yet be published. A non-existent version will break the build at install time. Please confirm availability (or pin to the latest released versions) before merging.

Also applies to: 18-18

new-backend/app/modules/langgraph_builder.py (1)

12-33: Missing state_schema / validation for StateGraph

Recent langgraph versions require a schema (pydantic / dataclass) to describe the state. Omitting it will raise TypeError: StateGraph.__init__() missing required positional argument. Verify the current library version and update accordingly.
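A minimal state schema for this workflow might be a TypedDict (a sketch; the field names follow the node code shown in this PR, and recent langgraph versions accept such a class as the `StateGraph` schema argument):

```python
from typing import List, TypedDict

class ArticleState(TypedDict, total=False):
    text: str
    keywords: List[str]
    sentiment: str
    sentiment_score: float
    facts: List[dict]
    perspective: str
    score: int
    status: str

# Would then be passed as: graph = StateGraph(ArticleState)
```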

Comment on lines +23 to +24
data = run_langgraph_workflow(article_text)
return data

⚠️ Potential issue

State payload is incompatible with the sentiment node – will raise KeyError.

run_scraper_pipeline() returns {"cleaned_text": ..., "keywords": ...}, whereas run_sentiment() expects state["text"]. Calling the workflow with the current dict will crash on the first node.

-    data = run_langgraph_workflow(article_text)
-    return data
+    # Map scraper output to the keys expected by the LangGraph workflow
+    langgraph_state = {
+        "text": article_text["cleaned_text"],
+        "keywords": article_text["keywords"],
+    }
+    data = run_langgraph_workflow(langgraph_state)
+    return data

@@ -0,0 +1,12 @@
from utils.vector_store import save_to_vector_db

💡 Verification agent

🧩 Analysis chain

Check import path correctness

utils.vector_store is referenced as a top-level package, but all LangGraph code lives under new-backend/app. If vector_store.py sits in the same package hierarchy (e.g. app.utils.vector_store), the import will fail at runtime.

Run the quick search below to confirm the real module path and adjust accordingly:


🏁 Script executed:

#!/bin/bash
fd vector_store.py

Length of output: 58


Update the import to include the app package

The module vector_store.py lives at new-backend/app/utils/vector_store.py, so the import in
new-backend/app/modules/langgraph_nodes/store_and_send.py should reference app.utils, not utils.

• File: new-backend/app/modules/langgraph_nodes/store_and_send.py

- from utils.vector_store import save_to_vector_db
+ from app.utils.vector_store import save_to_vector_db

Comment on lines 7 to 12
def run_fact_check(state):
text = state["text"]
keywords = state["keywords"]
results = search_web(text + " " + " ".join(keywords))
sources = [{"snippet": r.text, "url": r.link} for r in results]
return {"facts": sources}

🛠️ Refactor suggestion

Multiple runtime failures & missing fallbacks

  1. state["keywords"] – keywords is never produced by any previous node. A KeyError is inevitable.
  2. Iteration over results without verifying its type (see comment above).
  3. Assuming result objects expose .text / .link attributes.

Proposed resilient version:

 def run_fact_check(state):
     text = state["text"]
-    keywords = state["keywords"]
-    results = search_web(text + " " + " ".join(keywords))
-    sources = [{"snippet": r.text, "url": r.link} for r in results]
-    return {"facts": sources}
+    keywords = state.get("keywords", [])
+    query = " ".join([text, *keywords])
+    results = search_web(query)
+
+    sources = []
+    for r in results:
+        snippet = getattr(r, "text", None) or getattr(r, "snippet", "")
+        url = getattr(r, "link", None) or getattr(r, "url", "")
+        sources.append({"snippet": snippet, "url": url})
+
+    return {"facts": sources}

Comment on lines +16 to +19
my_llm = "groq llm"

chain = LLMChain(prompt=prompt, llm=my_llm)


⚠️ Potential issue

my_llm is an invalid placeholder – LLMChain expects a BaseLLM instance

Passing a plain string will raise during graph compilation. Either inject a real LLM or keep the node disabled behind a feature flag.

Example fix with LangChain’s OpenAI wrapper:

-from langchain.chains import LLMChain
+from langchain.chains import LLMChain
+from langchain_openai import ChatOpenAI   # or any provider

-my_llm = "groq llm"
+my_llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0.7)

Comment on lines 21 to 25
def generate_perspective(state):
text = state["text"]
facts = "\n".join([f["snippet"] for f in state["facts"]])
result = chain.run({"text": text, "facts": facts})
return {"perspective": result}

🛠️ Refactor suggestion

Guard against empty fact list & propagate full state

If state["facts"] is empty the join will raise. Also, downstream nodes might want both the newly generated perspective and existing keys.

 def generate_perspective(state):
     text = state["text"]
-    facts = "\n".join([f["snippet"] for f in state["facts"]])
-    result = chain.run({"text": text, "facts": facts})
-    return {"perspective": result}
+    fact_snippets = "\n".join(f.get("snippet", "") for f in state.get("facts", []))
+    perspective = chain.run({"text": text, "facts": fact_snippets})
+    return {**state, "perspective": perspective}

@@ -0,0 +1,9 @@
from transformers import pipeline

sentiment_pipeline = pipeline("sentiment-analysis")


Instead of using the default model for sentiment analysis, consider exploring more robust models for improved results.

Collaborator Author

This is just the base setup; I am going to build out all the modules in detail.
I'm going to integrate robust LLMs through Groq LLM APIs for each LLM-related task.

def run_sentiment(state):
text = state["text"]
result = sentiment_pipeline(text)[0]
return {"sentiment": result["label"], "sentiment_score": result["score"]}
@Ms-Error Ms-Error Jun 11, 2025

You're returning a new dictionary. This will overwrite the original state and remove all other keys.

Collaborator Author

You are right.
I should return the state plus the new data.
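The fix can be sketched as returning a merged dict, so upstream keys survive (a stand-in result is used here instead of the real transformers pipeline call):

```python
def run_sentiment(state: dict) -> dict:
    # Stand-in result; the real node calls the sentiment pipeline here.
    result = {"label": "NEGATIVE", "score": 0.87}
    # Merge instead of replace: all upstream keys are preserved.
    return {**state, "sentiment": result["label"], "sentiment_score": result["score"]}

updated = run_sentiment({"text": "some article", "keywords": ["econ"]})
```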

keywords = state["keywords"]
results = search_web(text + " " + " ".join(keywords))
sources = [{"snippet": r.text, "url": r.link} for r in results]
return {"facts": sources}


This function will overwrite the state by returning a new dictionary containing only "facts".

text = state["text"]
facts = "\n".join([f["snippet"] for f in state["facts"]])
result = chain.run({"text": text, "facts": facts})
return {"perspective": result}


Similarly, this will overwrite the state with only "perspective". Fix this as well.

score = 85 if "reasoned" in perspective else 40
return {"score": score}

# llm based score assignment


LLM should consider factors like logic, factual alignment, coherence, and tone and return a numeric score (0–100) which can be parsed and added to the state

Collaborator Author

Exactly.
We will have prompt templates instructing the LLM to evaluate on the basis of logic, factual alignment, and coherence, and to give a score.
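One way to parse such a 0–100 score out of free-form LLM output (a hypothetical helper; the real prompt and model are not part of this PR yet):

```python
import re

def parse_score(llm_output: str, default: int = 0) -> int:
    """Extract the first integer from the LLM's reply and clamp it to 0-100."""
    match = re.search(r"\d+", llm_output)
    if not match:
        return default
    return max(0, min(100, int(match.group())))
```

Clamping guards against the model returning an out-of-range number, and the default covers replies with no digits at all.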

save_to_vector_db({
"text": state["text"],
"perspective": state["perspective"],
"facts": state["facts"]


Also store the score.

Collaborator Author

Okay, I will store the score with the generated perspective.

"facts": state["facts"]
})
# sending to frontend
return {"status": "success"}


Add a try/except block to handle the failure case.

Collaborator Author

Sure. I will add error handling to all the nodes.

- added more boilerplate phrases for better text cleaning.
- added no_fallback=True in trafilatura.extract() method
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🔭 Outside diff range comments (1)
new-backend/app/modules/scraper/extractor.py (1)

26-28: ⚠️ Potential issue

headers passed as positional arg – request parameters are being sent instead of headers

requests.get() expects headers= as a keyword argument.
Passing self.headers positionally is interpreted as params, so the User-Agent never reaches the server and query-string garbage may be appended to the URL.

-res = requests.get(self.url, self.headers, timeout=10)
+res = requests.get(self.url, headers=self.headers, timeout=10)
♻️ Duplicate comments (4)
new-backend/app/modules/langgraph_nodes/store_and_send.py (2)

1-1: ⚠️ Potential issue

Incorrect package path – import will fail at runtime
vector_store.py lives under app.utils, not at the project root. The import will raise ModuleNotFoundError when the graph runs.

-from utils.vector_store import save_to_vector_db
+from app.utils.vector_store import save_to_vector_db

4-12: ⚠️ Potential issue

Still no defensive error handling despite prior review
Any exception from save_to_vector_db will crash the whole LangGraph execution. A minimal guard keeps the workflow alive and surfaces the failure upstream.

 def store_and_send(state):
-    # to store data in vector db
-    save_to_vector_db({
-        "text": state["text"],
-        "perspective": state["perspective"],
-        "facts": state["facts"],
-        "sentiment": state["label"],
-        "sentiment_score": state["score"]
-    })
+    try:
+        save_to_vector_db(
+            {
+                "text": state["text"],
+                "perspective": state["perspective"],
+                "facts": state["facts"],
+                "sentiment": state["label"],
+                "judge_score": state["score"],  # clarify metric name
+            }
+        )
+    except Exception as exc:
+        # TODO: replace print with structured logger
+        print(f"[store_and_send] DB write failed: {exc}")
+        return {**state, "status": "error", "detail": str(exc)}
new-backend/app/modules/langgraph_nodes/fact_check.py (2)

3-5: ⚠️ Potential issue

Stub signature & return type guarantee a crash
search_web() is defined without parameters and returns None, yet later it’s invoked with an argument and iterated over. Fix the signature and return an iterable placeholder.

-def search_web():
-    return None
+def search_web(query: str) -> list:
+    # TODO: integrate real search client
+    return []  # keeps pipeline alive

8-11: ⚠️ Potential issue

Multiple KeyError / TypeError risks in run_fact_check

  1. state["keywords"] is never set by previous nodes.
  2. Iterating over results without confirming iterability.
  3. Blind attribute access (r.text, r.link).

Robust version:

-    text = state["text"]
-    keywords = state["keywords"]
-    results = search_web(text + " " + " ".join(keywords))
-    sources = [{"snippet": r.text, "url": r.link} for r in results]
+    text = state["text"]
+    keywords = state.get("keywords", [])
+    query = " ".join([text, *keywords])
+
+    results = search_web(query) or []
+    sources = []
+    for r in results:
+        snippet = getattr(r, "text", "") or getattr(r, "snippet", "")
+        url = getattr(r, "link", "") or getattr(r, "url", "")
+        sources.append({"snippet": snippet, "url": url})
🧰 Tools
🪛 Pylint (3.3.7)

[error] 10-10: Assigning result of a function call, where the function returns None

(E1128)


[error] 10-10: Too many positional arguments for function call

(E1121)


[error] 11-11: Non-iterable value results is used in an iterating context

(E1133)

🧹 Nitpick comments (2)
new-backend/app/modules/langgraph_nodes/fact_check.py (1)

12-15: State overwriting concern
Merging with **state is good, but ensure no upstream keys are clobbered (e.g., an existing "facts" entry). Consider guarding or namespacing if duplication is possible.
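A small helper along these lines (the name merge_without_clobber is illustrative, not part of the codebase) makes clobbering explicit instead of silent:

```python
def merge_without_clobber(state: dict, updates: dict) -> dict:
    """Merge updates into a copy of state, refusing to silently
    overwrite keys that upstream nodes already produced."""
    clashes = set(state) & set(updates)
    if clashes:
        raise ValueError(f"refusing to overwrite keys: {sorted(clashes)}")
    return {**state, **updates}
```

A node would then return merge_without_clobber(state, {"facts": sources}) and fail loudly if an upstream "facts" entry already exists.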

new-backend/app/modules/scraper/cleaner.py (1)

23-63: Regex list keeps growing – consider pre-compiling and single‐pass substitution

The 40-line loop runs re.sub once per pattern, O(N × P).
Two quick wins:

  1. Merge the patterns with | and use a single re.sub (with re.I and re.M), or
  2. Pre-compile the list once at module import to avoid re-compilation on every call.

Example single-pass approach:

-boilerplate_phrases = [
-    r"read more at.*",
-    ...
-]
-for pattern in boilerplate_phrases:
-    text = re.sub(pattern, '', text, flags=re.IGNORECASE)
+BOILERPLATE_RE = re.compile(
+    r"(?:read more at.*|subscribe to.*|click here to.*|follow us on.*|"
+    r"advertisement|sponsored content|promoted by.*|recommended for you|"
+    r"© \d{4}.*|all rights reserved|terms of service|privacy policy|"
+    r"...etc...)",
+    flags=re.IGNORECASE,
+)
+
+text = BOILERPLATE_RE.sub('', text)

This keeps runtime predictable as the phrase list inevitably expands.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 92ace01 and 611ebd8.

📒 Files selected for processing (7)
  • new-backend/app/modules/langgraph_nodes/fact_check.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/generate_perspective.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/judge.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/sentiment.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/store_and_send.py (1 hunks)
  • new-backend/app/modules/scraper/cleaner.py (1 hunks)
  • new-backend/app/modules/scraper/extractor.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • new-backend/app/modules/langgraph_nodes/sentiment.py
  • new-backend/app/modules/langgraph_nodes/generate_perspective.py
  • new-backend/app/modules/langgraph_nodes/judge.py
🧰 Additional context used
🪛 Pylint (3.3.7)
new-backend/app/modules/langgraph_nodes/fact_check.py

[error] 10-10: Assigning result of a function call, where the function returns None

(E1128)


[error] 10-10: Too many positional arguments for function call

(E1121)


[error] 11-11: Non-iterable value results is used in an iterating context

(E1133)

🔇 Additional comments (1)
new-backend/app/modules/scraper/extractor.py (1)

38-45:

Details

❓ Verification inconclusive

no_fallback=True may break older Trafilatura versions – confirm compatibility

no_fallback was introduced in trafilatura ≥ 1.6. Using it against an older runtime silently raises TypeError, collapsing extraction.

Please verify the deployed image/pip constraints include a recent trafilatura build (or guard with feature detection).


🏁 Script executed:

#!/bin/bash
# Confirm trafilatura version and presence of `no_fallback` in its signature
python - <<'PY'
import inspect, trafilatura, pkg_resources, sys, textwrap
print("trafilatura", pkg_resources.get_distribution("trafilatura").version)
sig = inspect.signature(trafilatura.extract)
print("no_fallback" in sig.parameters)
PY

Length of output: 134


Let’s check how trafilatura is specified in your dependency files to confirm the minimum required version.


🏁 Script executed:

#!/bin/bash
# Search for trafilatura in common dependency manifests
grep -R "trafilatura" -n requirements*.txt setup.py pyproject.toml || true

Length of output: 216


Ensure Trafilatura ≥1.6 or guard no_fallback

We weren’t able to find any pinned version for trafilatura in requirements, setup.py or pyproject.toml. Since no_fallback was only added in 1.6, running on an older version will raise a TypeError at runtime.

Please verify that your deployed image or CI pipeline installs trafilatura ≥1.6, for example in your Dockerfile or pip install command. Alternatively, add a small feature‐detection guard around no_fallback:

import inspect
kw = {}
if 'no_fallback' in inspect.signature(trafilatura.extract).parameters:
    kw['no_fallback'] = True

result = trafilatura.extract(
    downloaded,
    include_comments=False,
    include_tables=False,
    favor_recall=True,
    output_format='json',
    **kw,
)

Comment on lines 10 to 11
"sentiment": state["label"],
"sentiment_score": state["score"]

🛠️ Refactor suggestion

Metric mis-labeling – score is from judge, not sentiment
You’re persisting the judgement score under the key sentiment_score, which is misleading and will cause confusion downstream. Rename to something explicit like judge_score.

🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/store_and_send.py around lines 10 to
11, rename the key "sentiment_score" to "judge_score" when assigning
state["score"] to accurately reflect that the score is from the judge, not
sentiment, to avoid confusion in downstream processing.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (5)
new-backend/app/modules/langgraph_nodes/fact_check.py (3)

9-12: Unsafe access to keywords and unchecked result structure

These exact problems were raised in the previous review and remain unresolved.
Use state.get("keywords", []), ensure results is iterable, and access attributes defensively.

🧰 Tools
🪛 Pylint (3.3.7)

[error] 11-11: Assigning result of a function call, where the function returns None

(E1128)


[error] 11-11: Too many positional arguments for function call

(E1121)


[error] 12-12: Non-iterable value results is used in an iterating context

(E1133)


3-5: ⚠️ Potential issue

search_web() signature & stub guarantee a runtime crash

The function is declared with zero parameters yet is invoked with one on line 11. This will raise TypeError: search_web() takes 0 positional arguments but 1 was given on the first execution.
While here, the stub returning None also breaks the comprehension on line 12.

-def search_web():
-    return None
+def search_web(query: str) -> list:
+    """
+    TODO: Replace with a real web-search implementation.
+    Returning an empty list keeps the graph alive until then.
+    """
+    return []

14-17: 🛠️ Refactor suggestion

Error path drops the entire upstream state

Returning only {"error": …} loses every field produced by earlier nodes, breaking downstream steps that expect them.

-        return {
-            "error": str(e)
-        }
+        return {
+            **state,
+            "error": str(e)
+        }
new-backend/app/modules/langgraph_nodes/sentiment.py (1)

10-14: ⚠️ Potential issue

Error path drops the whole state and uses print instead of structured logging
Returning a fresh dict removes previously accumulated data and breaks downstream nodes—exactly the issue flagged in the earlier review. Also, print is unsuitable for production; use the logging module with exception(...) to retain stack-trace.

+import logging  # at top of file
+logger = logging.getLogger(__name__)
+
-    except Exception as e:
-        print(f"some error occured:{e}")
-        return {
-            "error": str(e)
-        }
+    except Exception as e:
+        logger.exception("Sentiment analysis failed")
+        return {
+            **state,
+            "error": str(e)
+        }
new-backend/app/modules/langgraph_nodes/store_and_send.py (1)

1-1: ⚠️ Potential issue

Import path is still wrong – should include app. prefix
Previous reviews already pointed out that vector_store.py lives under app.utils. Keeping the old path will raise ModuleNotFoundError at runtime.

-from utils.vector_store import save_to_vector_db
+from app.utils.vector_store import save_to_vector_db
🧹 Nitpick comments (2)
new-backend/app/modules/langgraph_nodes/store_and_send.py (2)

10-11: Use structured logging and fix typo in message
print(f"some error occured:{e}")
• “occurred” is misspelled.
print makes debugging in prod painful—use the project logger.

-        print(f"some error occured:{e}")
+        logger.error("store_and_send failed: %s", e)

(Assumes logger is configured at module level.)


4-19: Optional: add type hints & docstring for clarity
This node is central to the graph; documenting the expected shape of state prevents misuse and aids static analysis.

-def store_and_send(state):
+def store_and_send(state: dict[str, object]) -> dict[str, object]:
+    """
+    Persist the final state to the vector store and return an
+    augmented state with `status` set to either `"success"` or `"error"`.
+    Expected keys in `state`: `text`, `perspective`, `facts`, `score`, …
+    """
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b3d3df1 and 5c76b93.

📒 Files selected for processing (6)
  • new-backend/app/modules/langgraph_nodes/fact_check.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/generate_perspective.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/judge.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/sentiment.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/store_and_send.py (1 hunks)
  • new-backend/app/modules/scraper/cleaner.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • new-backend/app/modules/langgraph_nodes/judge.py
  • new-backend/app/modules/langgraph_nodes/generate_perspective.py
  • new-backend/app/modules/scraper/cleaner.py
🧰 Additional context used
🪛 Pylint (3.3.7)
new-backend/app/modules/langgraph_nodes/fact_check.py

[error] 11-11: Assigning result of a function call, where the function returns None

(E1128)


[error] 11-11: Too many positional arguments for function call

(E1121)


[error] 12-12: Non-iterable value results is used in an iterating context

(E1133)

Comment on lines +1 to +3
from transformers import pipeline

sentiment_pipeline = pipeline("sentiment-analysis")

🛠️ Refactor suggestion

Eager pipeline initialisation will slow cold-starts and consumes ~1-2 GB RAM
Loading the HF model at import time means every worker instantiates it even if the node is never executed. Move the initialisation behind a lazy singleton or load it inside run_sentiment the first time it is called to cut cold-start latency and memory.

-from transformers import pipeline
-
-sentiment_pipeline = pipeline("sentiment-analysis")
+from transformers import pipeline
+_sentiment_pipeline = None
+
+
+def _get_pipeline():
+    global _sentiment_pipeline
+    if _sentiment_pipeline is None:          # lazy, thread-safe under GIL
+        _sentiment_pipeline = pipeline("sentiment-analysis")
+    return _sentiment_pipeline
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/sentiment.py at lines 1 to 3, the
sentiment-analysis pipeline is eagerly initialized at import time, causing high
memory usage and slow cold-starts. Refactor the code to initialize the pipeline
lazily by moving the pipeline creation inside the run_sentiment function or
implement a lazy singleton pattern so the model loads only on the first call,
reducing unnecessary resource consumption.
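For multi-worker servers, a lock-guarded variant of the lazy singleton avoids two threads loading the model simultaneously. This sketch uses a stand-in loader so it does not depend on transformers; swap _load_pipeline for the real pipeline("sentiment-analysis") call:

```python
import threading

_pipeline = None
_pipeline_lock = threading.Lock()


def _load_pipeline():
    # Stand-in for pipeline("sentiment-analysis"); replace with the real loader.
    return object()


def get_pipeline():
    global _pipeline
    if _pipeline is None:              # fast path: no lock once initialised
        with _pipeline_lock:
            if _pipeline is None:      # double-checked inside the lock
                _pipeline = _load_pipeline()
    return _pipeline
```

Repeated calls return the same instance, so the model is loaded at most once per process.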

Comment on lines 10 to 14
except Exception as e:
print(f"some error occured:{e}")
return {
"error": str(e)
}

🛠️ Refactor suggestion

Error branch drops useful context and returns inconsistent schema
If persisting fails you return a dict that contains only "error", whereas the success branch returns the full state plus "status". Clients that rely on the rest of the payload will break.

-    except Exception as e:
-        print(f"some error occured:{e}")
-        return {
-            "error": str(e)
-        }
+    except Exception as exc:
+        # TODO: replace print with structured logger
+        return {
+            **state,
+            "status": "error",
+            "error": str(exc),
+        }
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/store_and_send.py around lines 10 to
14, the error handling returns a dictionary with only an "error" key, which is
inconsistent with the success response that includes the full state and a
"status" key. Modify the error return to include the same keys as the success
response, adding the "error" message and an appropriate "status" value, so
clients receive a consistent schema regardless of success or failure.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (4)
new-backend/app/modules/langgraph_nodes/fact_check.py (2)

3-4: Stub returns None – the very next line will crash
search_web() returns None, yet the caller iterates over its result. The problem has been raised in a previous review and is still unresolved.

-def search_web():
-    return None
+def search_web(query: str) -> list[dict]:
+    # TODO: integrate real search engine
+    return []          # keeps the pipeline functional

10-12: ⚠️ Potential issue

Unsafe state access, wrong call-signature, and blind iteration

  1. keywords = state["keywords"] → will raise KeyError unless every upstream node populates it.
  2. search_web(text + " " + " ".join(keywords)) passes an arg while search_web() takes none.
  3. Iterates over results without verifying iterability or attribute shape (r.text, r.link).

These three issues will surface on the very first real run.

-        keywords = state["keywords"]
-        results = search_web(text + " " + " ".join(keywords))
-        sources = [{"snippet": r.text, "url": r.link} for r in results]
+        keywords = state.get("keywords", [])
+        query = " ".join([text, *keywords])
+        results = search_web(query) or []
+
+        sources = []
+        for r in results:
+            snippet = getattr(r, "text", None) or getattr(r, "snippet", "")
+            url     = getattr(r, "link", None) or getattr(r, "url",     "")
+            sources.append({"snippet": snippet, "url": url})
🧰 Tools
🪛 Pylint (3.3.7)

[error] 11-11: Assigning result of a function call, where the function returns None

(E1128)


[error] 11-11: Too many positional arguments for function call

(E1121)


[error] 12-12: Non-iterable value results is used in an iterating context

(E1133)

new-backend/app/modules/langgraph_nodes/store_and_send.py (2)

10-11: Replace print with structured logger
Plain print will vanish in many deployments and cannot be aggregated or searched. Use the project’s logger (e.g. logging.getLogger(__name__)) or any structured logging helper already configured.
A past review raised the same point; it is still unresolved.


1-1: ⚠️ Potential issue

Fix incorrect import path – prepend app. package
The vector_store.py module lives under app.utils, not at the repository root. This import will raise ModuleNotFoundError in every runtime environment outside of the project root.

-from utils.vector_store import save_to_vector_db
+from app.utils.vector_store import save_to_vector_db
🧹 Nitpick comments (1)
new-backend/app/modules/langgraph_nodes/store_and_send.py (1)

6-9: Avoid redundant dict-copy – pass state directly
{ **state } allocates a shallow copy for no benefit. Hand the original object to save_to_vector_db unless immutability is required (it isn’t modified afterwards).

-        save_to_vector_db({
-            **state
-        })
+        save_to_vector_db(state)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b3d3df1 and 5c76b93.

📒 Files selected for processing (6)
  • new-backend/app/modules/langgraph_nodes/fact_check.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/generate_perspective.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/judge.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/sentiment.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/store_and_send.py (1 hunks)
  • new-backend/app/modules/scraper/cleaner.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • new-backend/app/modules/langgraph_nodes/sentiment.py
  • new-backend/app/modules/langgraph_nodes/generate_perspective.py
  • new-backend/app/modules/langgraph_nodes/judge.py
  • new-backend/app/modules/scraper/cleaner.py
🧰 Additional context used
🪛 Pylint (3.3.7)
new-backend/app/modules/langgraph_nodes/fact_check.py

[error] 11-11: Assigning result of a function call, where the function returns None

(E1128)


[error] 11-11: Too many positional arguments for function call

(E1121)


[error] 12-12: Non-iterable value results is used in an iterating context

(E1133)

Comment on lines 13 to 17
except Exception as e:
print(f"some error occured:{e}")
return {
"error": str(e)
}

🛠️ Refactor suggestion

Error path breaks the LangGraph contract
When an exception occurs you return {"error": ...} and drop the rest of the state.
Down-stream nodes will blow up because expected keys like "text" are gone.
At minimum, merge the error into the existing state or re-raise.

-        return {
-            "error": str(e)
-        }
+        state["error"] = str(e)
+        return state    # preserves contract
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/fact_check.py around lines 13 to 17,
the exception handler returns only {"error": ...}, which breaks the LangGraph
contract by dropping expected keys like "text". To fix this, modify the error
handling to merge the error message into the existing state dictionary instead
of replacing it, ensuring all expected keys remain present, or alternatively
re-raise the exception to avoid returning an incomplete state.

except Exception as e:
print(f"some error occured:{e}")
return {
"error": str(e)


Similar to the success case where status is "success", you can set status to "failed" for the error condition.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (2)
new-backend/app/modules/langgraph_nodes/fact_check.py (2)

3-5: Stub returns None: guaranteed crash

search_web() returns None, yet the caller iterates over its result. This issue has been flagged before.
Fix by returning an empty list until a real implementation lands.

-def search_web():
-    return None
+def search_web(query: str) -> list:
+    # TODO: integrate real search engine. Empty list keeps the graph alive.
+    return []

8-13: Multiple unsafe assumptions (keywords, result attributes, iterability)

  1. state["keywords"] will raise KeyError when the key is absent.
  2. You concatenate and iterate over results without checking its type.
  3. Each result is assumed to expose .text and .link.

These exact issues were pointed out in earlier reviews.

-        text = state["text"]
-        keywords = state["keywords"]
-        results = search_web(text + " " + " ".join(keywords))
-        sources = [{"snippet": r.text, "url": r.link} for r in results]
+        text = state["text"]
+        keywords = state.get("keywords", [])
+        query = " ".join([text, *keywords])
+        results = search_web(query)
+
+        sources = []
+        for r in results:
+            snippet = getattr(r, "text", "") or getattr(r, "snippet", "")
+            url = getattr(r, "link", "") or getattr(r, "url", "")
+            sources.append({"snippet": snippet, "url": url})
🧰 Tools
🪛 Pylint (3.3.7)

[error] 11-11: Assigning result of a function call, where the function returns None

(E1128)


[error] 11-11: Too many positional arguments for function call

(E1121)


[error] 12-12: Non-iterable value results is used in an iterating context

(E1133)

🧹 Nitpick comments (2)
new-backend/app/modules/langgraph_nodes/judge.py (1)

8-12: Return path drops original state on error

The dict you return on error omits the incoming state, so downstream diagnostics lose context (e.g., original text, sentiment, facts). Consider merging:

-        return {
-            "status": "error",
-            "error_from": "judge_perspective",
-            "message": f"{e}",
-            }
+        return {
+            **state,
+            "status": "error",
+            "error_from": "judge_perspective",
+            "message": str(e),
+        }
new-backend/app/modules/langgraph_nodes/fact_check.py (1)

15-19: Preserve incoming state on error

Similar to the judge node, the error path discards the original state, making debugging harder and breaking contract symmetry. Merge the state instead of replacing it.

-        return {
-            "status": "error",
-            "error_from": "fact_checking",
-            "message": f"{e}",
-            }
+        return {
+            **state,
+            "status": "error",
+            "error_from": "fact_checking",
+            "message": str(e),
+        }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5c76b93 and d9089c2.

📒 Files selected for processing (7)
  • new-backend/app/modules/langgraph_builder.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/error_handler.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/fact_check.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/generate_perspective.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/judge.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/sentiment.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/store_and_send.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • new-backend/app/modules/langgraph_nodes/sentiment.py
  • new-backend/app/modules/langgraph_nodes/store_and_send.py
  • new-backend/app/modules/langgraph_nodes/generate_perspective.py
  • new-backend/app/modules/langgraph_builder.py
🧰 Additional context used
🪛 Pylint (3.3.7)
new-backend/app/modules/langgraph_nodes/fact_check.py

[error] 11-11: Assigning result of a function call, where the function returns None

(E1128)


[error] 11-11: Too many positional arguments for function call

(E1121)


[error] 12-12: Non-iterable value results is used in an iterating context

(E1133)

Comment on lines +3 to +6
def error_handler(input):
    print("Error detected!")
    print(f"From: {input.get('error_from')}")
    print(f"Message: {input.get('message')}")

🛠️ Refactor suggestion

Avoid shadowing built-ins and switch to proper logging

Using the parameter name input shadows Python’s built-in input() function, which can be confusing.
Additionally, print() statements are not suitable for production logging; prefer the project’s configured logger.

-def error_handler(input):
-    print("Error detected!")
-    print(f"From: {input.get('error_from')}")
-    print(f"Message: {input.get('message')}")
+import logging
+logger = logging.getLogger(__name__)
+
+def error_handler(error_payload: dict):
+    logger.error("Error detected!")
+    logger.error("From: %s", error_payload.get("error_from"))
+    logger.error("Message: %s", error_payload.get("message"))
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/error_handler.py around lines 3 to 6,
rename the parameter from 'input' to a non-built-in name like 'error_info' to
avoid shadowing Python's built-in input() function. Replace the print statements
with calls to the project's configured logger (e.g., logger.error or
logger.info) to properly log the error messages instead of printing them to
standard output.

"error_handler"
if state.get("status") == "error"
else ("generate_perspective"
if state.get("score", 0) < 70


If the score stays below 70, the graph will keep looping back to generate_perspective until a higher score is achieved, which can repeat indefinitely. To prevent this infinite loop, a retry limit should be added.
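One way to cap the loop (the key retry_count and the constant MAX_RETRIES are hypothetical names) is a pure routing function that reads an attempt counter which generate_perspective is assumed to increment:

```python
MAX_RETRIES = 3  # hypothetical cap; tune as needed


def route_after_judge(state: dict) -> str:
    """Pick the next node after judge_perspective.

    Assumes generate_perspective increments state['retry_count']
    each time it regenerates a perspective."""
    if state.get("status") == "error":
        return "error_handler"
    if state.get("score", 0) >= 70:
        return "store_and_send"
    if state.get("retry_count", 0) >= MAX_RETRIES:
        return "error_handler"  # give up instead of looping forever
    return "generate_perspective"
```

This would be wired in the builder via something like graph.add_conditional_edges("judge_perspective", route_after_judge), assuming that builder API shape.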

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
new-backend/app/modules/langgraph_nodes/fact_check.py (2)

9-12: Still crashes when keywords missing & assumes result shape
(This repeats an earlier review point)
state["keywords"] will raise KeyError when upstream nodes don’t supply the key, and the list-comprehension still expects .text / .link attributes.

-        keywords = state["keywords"]
-        results = search_web(text + " " + " ".join(keywords))
-        sources = [{"snippet": r.text, "url": r.link} for r in results]
+        keywords = state.get("keywords", [])
+        query = " ".join([text, *keywords])
+        results = search_web(query)
+
+        sources = []
+        for r in results:                         # tolerate dicts or objects
+            snippet = (r.get("text") if isinstance(r, dict)
+                       else getattr(r, "text", "")) or ""
+            url = (r.get("link") if isinstance(r, dict)
+                   else getattr(r, "link", "")) or ""
+            sources.append({"snippet": snippet, "url": url})
🧰 Tools
🪛 Pylint (3.3.7)

[error] 11-11: Too many positional arguments for function call

(E1121)


15-19: Error path breaks LangGraph contract (again)
Returning a brand-new dict strips all existing state keys, so downstream nodes won’t find mandatory fields like "text". Merge the error into state instead:

-        return {
-            "status": "error",
-            "error_from": "fact_checking",
-            "message": f"{e}",
-            }
+        state["status"] = "error"
+        state["error_from"] = "fact_checking"
+        state["message"] = str(e)
+        return state
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d9089c2 and 273f368.

📒 Files selected for processing (1)
  • new-backend/app/modules/langgraph_nodes/fact_check.py (1 hunks)
🧰 Additional context used
🪛 Pylint (3.3.7)
new-backend/app/modules/langgraph_nodes/fact_check.py

[error] 11-11: Too many positional arguments for function call

(E1121)

Comment on lines +3 to +4
def search_web():
return []

⚠️ Potential issue

Runtime‐blocking: search_web signature mismatch

search_web is declared with no parameters, yet an argument is supplied at the call-site (line 11). Python will raise
TypeError: search_web() takes 0 positional arguments but 1 was given the first time the node runs.

Diff to align the signature with its usage and add a minimal contract:

-def search_web():
-    return []
+def search_web(query: str) -> list[dict]:
+    """
+    Placeholder web-search. Keeps the pipeline alive until a real
+    implementation is plugged in.
+    """
+    # TODO: integrate actual search provider
+    return []
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-def search_web():
-    return []
+def search_web(query: str) -> list[dict]:
+    """
+    Placeholder web-search. Keeps the pipeline alive until a real
+    implementation is plugged in.
+    """
+    # TODO: integrate actual search provider
+    return []
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/fact_check.py around lines 3 to 4,
the function search_web is defined without parameters but is called with one
argument at line 11, causing a TypeError. Update the search_web function
signature to accept one parameter matching the call-site argument, and add a
minimal type hint or docstring to clarify the expected input and output.
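A runnable sketch of both the mismatch and the aligned stub (the query string is a placeholder, and `search_web_broken` is a stand-in for the current definition):

```python
def search_web_broken():
    # Current shape: no parameters, so any call-site argument crashes.
    return []

def search_web(query: str) -> list[dict]:
    """Placeholder web search; returns no results until a real
    provider is integrated."""
    # TODO: integrate actual search provider
    return []

try:
    search_web_broken("climate policy")  # mirrors the existing call-site
except TypeError as e:
    print(e)  # search_web_broken() takes 0 positional arguments but 1 was given

results = search_web("climate policy")  # fixed signature accepts the query
```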



def run_sentiment(state):
    text = state["text"]

Add a try-except block to handle cases where state["text"] is missing or empty. Similarly, implement such error handling for state["facts"] , state["perspective"] and others to ensure the keys exist and contain valid content
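One way to implement that suggestion is a small validation helper each node calls up front; a sketch assuming the state keys named above (the node body is illustrative):

```python
def require_state_keys(state: dict, *keys: str) -> None:
    """Raise a descriptive ValueError if any required key is
    missing or empty in the shared graph state."""
    for key in keys:
        if not state.get(key):
            raise ValueError(f"Missing or empty '{key}' in state")

def run_sentiment(state: dict) -> dict:
    try:
        require_state_keys(state, "text")
        text = state["text"]
        # ... sentiment analysis on `text` would go here ...
        state["status"] = "success"
    except Exception as e:
        state["status"] = "error"
        state["error_from"] = "sentiment"
        state["message"] = str(e)
    return state
```

The same `require_state_keys(state, "facts")` or `require_state_keys(state, "perspective")` call covers the other nodes.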


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (6)
new-backend/app/modules/langgraph_nodes/fact_check.py (3)

3-4: ⚠️ Potential issue

search_web signature still mismatched with its call; a runtime crash is guaranteed
search_web is defined without parameters but invoked with one on line 17, raising
TypeError: search_web() takes 0 positional arguments but 1 was given.

-def search_web():
-    return []
+from typing import List, Dict
+
+def search_web(query: str) -> List[Dict[str, str]]:
+    """
+    Stubbed web-search function.  Replace with a real implementation later.
+    """
+    return []

Also applies to: 17-17


20-25: 🛠️ Refactor suggestion

Error path drops the rest of the state & breaks downstream nodes
Returning only status, error_from, and message removes keys expected by later nodes (cleaned_text, etc.). Merge the error into the existing state instead.

-        return {
-            "status": "error",
-            "error_from": "fact_checking",
-            "message": f"{e}",
-            }
+        state["status"] = "error"
+        state["error_from"] = "fact_checking"
+        state["message"] = str(e)
+        return state

18-18: 🛠️ Refactor suggestion

Assumes .text / .link attributes on search results
Unless search_web guarantees these attributes, this comprehension will AttributeError. Use getattr with sensible fallbacks:

-        sources = [{"snippet": r.text, "url": r.link} for r in results]
+        sources = [
+            {"snippet": getattr(r, "text", "") or getattr(r, "snippet", ""),
+             "url": getattr(r, "link", "") or getattr(r, "url", "")}
+            for r in results
+        ]
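The fallback comprehension behaves like this against heterogeneous result objects (the result shapes below are invented for illustration):

```python
from types import SimpleNamespace

def normalize_sources(results):
    """Map arbitrary search-result objects to {snippet, url} dicts,
    tolerating either text/link or snippet/url attribute names."""
    return [
        {"snippet": getattr(r, "text", "") or getattr(r, "snippet", ""),
         "url": getattr(r, "link", "") or getattr(r, "url", "")}
        for r in results
    ]

results = [
    SimpleNamespace(text="claim A", link="https://a.example"),
    SimpleNamespace(snippet="claim B", url="https://b.example"),
]
sources = normalize_sources(results)
```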
new-backend/app/modules/langgraph_nodes/generate_perspective.py (3)

16-18: ⚠️ Potential issue

my_llm must be a BaseLLM instance, not a string
Passing a plain string makes LLMChain fail validation during graph compilation. Instantiate a real LLM or gate the node behind a feature flag.

-from langchain.chains import LLMChain
+from langchain.chains import LLMChain
+from langchain_openai import ChatOpenAI  # or any provider

-my_llm = "groq llm"
+my_llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0.7)

37-42: 🛠️ Refactor suggestion

Error path again discards original state
Same issue as in fact_check.py. Merge the error into state or re-raise.

-        return {
-            "status": "error",
-            "error_from": "generate_perspective",
-            "message": f"{e}",
-        }
+        state["status"] = "error"
+        state["error_from"] = "generate_perspective"
+        state["message"] = str(e)
+        return state

34-35: 🛠️ Refactor suggestion

Uses state["facts"] after already retrieving facts
You computed facts = state.get("facts") but ignore it in the join, risking a KeyError you tried to guard against.

-        facts = "\n".join([f["snippet"] for f in state["facts"]])
+        fact_snippets = "\n".join(f.get("snippet", "") for f in facts)

Then pass fact_snippets to chain.run.
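Concretely, the corrected join works off the local `facts` variable and tolerates entries without a snippet (the sample fact dicts are illustrative):

```python
facts = [
    {"snippet": "GDP grew 2% in Q1", "url": "https://example.org/1"},
    {"snippet": "Unemployment fell to 4%", "url": "https://example.org/2"},
    {"url": "https://example.org/3"},  # missing snippet must not crash the node
]

# .get() with a default skips entries lacking a "snippet" key
fact_snippets = "\n".join(f.get("snippet", "") for f in facts)
```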

🧹 Nitpick comments (1)
new-backend/app/modules/langgraph_nodes/generate_perspective.py (1)

29-32: Redundant elif after raise
As above, once a raise executes the function exits. Replace elif with a second if for cleaner flow.

🧰 Tools
🪛 Pylint (3.3.7)

[refactor] 29-32: Unnecessary "elif" after "raise", remove the leading "el" from "elif"

(R1720)

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 273f368 and 23db210.

📒 Files selected for processing (7)
  • frontend/app/page.tsx (1 hunks)
  • new-backend/app/modules/langgraph_builder.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/error_handler.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/fact_check.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/generate_perspective.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/judge.py (1 hunks)
  • new-backend/app/modules/langgraph_nodes/sentiment.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • frontend/app/page.tsx
🚧 Files skipped from review as they are similar to previous changes (4)
  • new-backend/app/modules/langgraph_nodes/error_handler.py
  • new-backend/app/modules/langgraph_nodes/sentiment.py
  • new-backend/app/modules/langgraph_nodes/judge.py
  • new-backend/app/modules/langgraph_builder.py
🧰 Additional context used
🪛 Pylint (3.3.7)
new-backend/app/modules/langgraph_nodes/fact_check.py

[refactor] 12-15: Unnecessary "elif" after "raise", remove the leading "el" from "elif"

(R1720)


[error] 17-17: Too many positional arguments for function call

(E1121)

new-backend/app/modules/langgraph_nodes/generate_perspective.py

[refactor] 29-32: Unnecessary "elif" after "raise", remove the leading "el" from "elif"

(R1720)

🔇 Additional comments (1)
new-backend/app/modules/langgraph_nodes/fact_check.py (1)

27-30: status key silently overwrites any pre-existing status
If an earlier node already set status, this unconditional assignment will erase it. Consider a namespaced key (fact_check_status) or check before overwrite.
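A namespaced variant that avoids the overwrite might look like this (the key-naming scheme is a suggestion, not the PR's API):

```python
def set_node_status(state: dict, node: str, status: str) -> dict:
    """Record per-node status under a namespaced key so earlier
    nodes' results are never clobbered."""
    state[f"{node}_status"] = status
    # Track the latest overall status too, without erasing history.
    state["status"] = status
    return state

state = {"status": "success", "sentiment_status": "success"}
set_node_status(state, "fact_check", "success")
```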

Comment on lines +10 to +15
        keywords = state["keywords"]

        if not text:
            raise ValueError("Missing or empty 'cleaned_text' in state")
        elif not keywords:
            raise ValueError("Missing or empty 'keywords' in state")

🛠️ Refactor suggestion

state["keywords"] may raise KeyError and the elif after a raise is redundant

-        keywords = state["keywords"]
+        keywords = state.get("keywords", [])

After the first raise, the elif branch is unreachable; switch to a plain if for clarity.

-        if not text:
-            raise ValueError(...)
-        elif not keywords:
+        if not text:
+            raise ValueError(...)
+        if not keywords:
             raise ValueError(...)
📝 Committable suggestion


Suggested change

-        keywords = state["keywords"]
-        if not text:
-            raise ValueError("Missing or empty 'cleaned_text' in state")
-        elif not keywords:
-            raise ValueError("Missing or empty 'keywords' in state")
+        keywords = state.get("keywords", [])
+        if not text:
+            raise ValueError("Missing or empty 'cleaned_text' in state")
+        if not keywords:
+            raise ValueError("Missing or empty 'keywords' in state")
🧰 Tools
🪛 Pylint (3.3.7)

[refactor] 12-15: Unnecessary "elif" after "raise", remove the leading "el" from "elif"

(R1720)

🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/fact_check.py around lines 10 to 15,
the code accesses state["keywords"] directly which may raise a KeyError if the
key is missing, and the elif after a raise is redundant since the raise exits
the function. Fix this by using state.get("keywords") to safely access the
keywords key and replace the elif with a separate if statement for clarity.

@ManavSarkar ManavSarkar merged commit 8872d5a into main Jun 26, 2025
1 check passed