Walkthrough

This change removes the legacy backend and test files, replacing them with a new modular backend architecture. It introduces new FastAPI app setup, vector store integration, fact-checking and perspective generation modules, embedding utilities, and a Docker-based deployment workflow. The frontend is updated for analysis result handling, and dependency management is migrated to pyproject.toml.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Frontend
    participant Backend API
    participant FactCheck
    participant PerspectiveGen
    participant VectorStore
    User->>Frontend: Submit article URL
    Frontend->>Backend API: POST /api/analyze {url}
    Backend API->>FactCheck: Extract and verify claims
    FactCheck->>FactCheck: Web search & LLM verification
    FactCheck-->>Backend API: Verified facts
    Backend API->>PerspectiveGen: Generate counter-perspective
    PerspectiveGen-->>Backend API: Perspective result
    Backend API->>VectorStore: Store embeddings/chunks
    VectorStore-->>Backend API: Store confirmation
    Backend API-->>Frontend: Analysis results
    Frontend-->>User: Display results
```
Actionable comments posted: 20
🔭 Outside diff range comments (2)
new-backend/pyproject.toml (1)
6-6: Relax Python version requirement to 3.11+

Most of Python 3.13's new capabilities (enhanced REPL, experimental free-threaded mode, preliminary JIT, richer tracebacks, stdlib import optimizations, etc.) are either developer-convenience features or still experimental. All core AI/ML and web-search libraries you're using are fully compatible with Python 3.11+, so you can broaden your deployment options by targeting 3.11 instead:

```diff
-requires-python = ">=3.13"
+requires-python = ">=3.11"
```

new-backend/app/modules/langgraph_nodes/generate_perspective.py (1)

49-55: Fix typo and improve error handling.

There's a typo in the error message and the exception handling could be more specific.

```diff
     except Exception as e:
-        print(f"some error occured in generate_perspective:{e}")
+        print(f"Error occurred in generate_perspective: {e}")
         return {
             "status": "error",
             "error_from": "generate_perspective",
-            "message": f"{e}",
+            "message": str(e),
         }
```

Consider catching specific exceptions (e.g., `ValueError`, `KeyError`) for better error handling.
🧹 Nitpick comments (20)
frontend/app/analyze/results/page.tsx (1)
23-23: Consider adding type safety for analysis data.

The `analysisData` state is typed as `null` but could benefit from proper TypeScript typing based on the expected API response structure.

```diff
-const [analysisData, setAnalysisData] = useState(null)
+const [analysisData, setAnalysisData] = useState<AnalysisResult | null>(null)
```

Consider defining an interface for the expected analysis result structure.
frontend/app/analyze/loading/page.tsx (1)
95-102: Optimize progress bar animation performance.

Updating progress every 100ms may cause unnecessary re-renders.

```diff
-const progressInterval = setInterval(() => {
-  setProgress((prev) => {
-    if (prev < 100) {
-      return prev + 1
-    }
-    return prev
-  })
-}, 100)
+const progressInterval = setInterval(() => {
+  setProgress((prev) => {
+    if (prev < 100) {
+      return Math.min(prev + 2, 100) // Increment by 2 every 200ms instead
+    }
+    return prev
+  })
+}, 200)
```

This reduces the update frequency while maintaining smooth animation.
new-backend/app/modules/scraper/cleaner.py (1)
2-10: Consider pre-downloading NLTK data during Docker build instead of runtime.

The current implementation downloads NLTK corpora at module import time, which can cause delays during application startup and potential network issues in production environments.

For containerized deployments, consider downloading NLTK data during the Docker build process instead:

```dockerfile
# In Dockerfile
RUN python -c "import nltk; nltk.download('stopwords'); nltk.download('punkt_tab')"
```

Then simplify the code to:

```diff
-try:
-    nltk.data.find('corpora/stopwords')
-    nltk.data.find('corpora/punkt_tab')
-
-except LookupError:
-    nltk.download('stopwords')
-    nltk.download('punkt_tab')
```

new-backend/app/modules/facts_check/web_search.py (1)
6-8: Consider using a more secure method for API key handling.

Storing API keys in environment variables is a good practice, but consider additional security measures for production deployments.
For enhanced security, consider:
- Using a secrets management service
- Implementing API key rotation
- Adding logging for security monitoring (without exposing the key)
```diff
 def search_with_serpapi(query, max_results=1):
     api_key = os.getenv("SERPAPI_KEY")
     if not api_key:
-        raise ValueError("SERPAPI_KEY not set in environment")
+        raise ValueError("SERPAPI_KEY not set in environment")
+
+    # Log API usage for monitoring (without exposing key)
+    print(f"Performing search with query: {query[:50]}...")
```

new-backend/.dockerignore (1)

1-2: Consider adding more comprehensive exclusions for production deployment.

The current exclusions are good, but consider adding common development and build artifacts:

```diff
 /.venv
 */.env
+*.pyc
+__pycache__/
+.git/
+.pytest_cache/
+*.log
+.DS_Store
+node_modules/
+.coverage
+htmlcov/
```

new-backend/app/modules/langgraph_nodes/sentiment.py (1)
34-34: Consider reducing temperature for more deterministic sentiment analysis.

A temperature of 0.2 might introduce unnecessary randomness for sentiment analysis, which should be deterministic.

```diff
-    temperature=0.2,
+    temperature=0.0,
```

new-backend/start.sh (2)
4-5: Consider optimizing the uv installation check.

The script installs `uv` unconditionally, which may be inefficient if it's already present. Consider checking if `uv` is available before installation.

```diff
 # Install uv if not present
-pip install uv
+if ! command -v uv &> /dev/null; then
+    echo "Installing uv..."
+    pip install uv
+fi
```

8-9: Add error handling for critical operations.

Consider adding validation to ensure the sync operation succeeds before attempting to run the application.

```diff
 # Sync environment and run app
-uv sync
-uv run main.py
+echo "Syncing dependencies..."
+uv sync || { echo "Failed to sync dependencies"; exit 1; }
+echo "Starting application..."
+uv run main.py
```

new-backend/app/utils/generate_chunk_id.py (1)
4-8: Consider collision risk with truncated hash.

The function truncates the SHA-256 hash to 15 characters, which reduces the collision resistance. While this is likely acceptable for article IDs, consider documenting this limitation or using a longer hash if uniqueness is critical.
For better collision resistance, consider using a longer hash:
- return f"article-{hashed_text[:15]}" + return f"article-{hashed_text[:32]}" # Use 32 characters for better uniquenessAlternatively, add documentation about the collision risk:
def generate_id(text: str) -> str: + """Generate a unique ID for article text using SHA-256 hash. + + Note: Hash is truncated to 15 characters. While collision risk is low, + consider using full hash for critical applications. + """new-backend/app/utils/prompt_templates.py (1)
new-backend/app/utils/prompt_templates.py (1)

3-32: Well-structured prompt template with minor enhancement suggestions.

The prompt template is well-designed with clear sections and structured output format. Consider adding guidance for edge cases where facts might be contradictory or insufficient.
Consider adding instructions for handling edge cases:
```diff
 Generate a logical and respectful *opposite perspective* to the article.
+If the verified facts contradict the article's claims, acknowledge this in your reasoning.
+If insufficient facts are available, clearly state this limitation.
 Use *step-by-step reasoning* and return your output in this JSON format:
```

new-backend/main.py (2)
28-30: Good deployment configuration with minor improvement suggestion.

The dynamic port configuration and host binding to `0.0.0.0` are appropriate for container deployment. Consider adding validation for the port value.

```diff
-    port = int(os.environ.get("PORT", 7860))
+    port = int(os.environ.get("PORT", 7860))
+    if not 1 <= port <= 65535:
+        raise ValueError(f"Invalid port number: {port}")
```
26-27: Consider adding environment validation.

While the import placement is fine, consider validating required environment variables at startup to fail fast if configuration is missing.

```diff
 if __name__ == "__main__":
     import uvicorn
     import os
+
+    # Validate required environment variables
+    required_env_vars = ["GROQ_API_KEY", "PINECONE_API_KEY"]  # Adjust based on actual requirements
+    missing_vars = [var for var in required_env_vars if not os.getenv(var)]
+    if missing_vars:
+        raise EnvironmentError(f"Missing required environment variables: {missing_vars}")
```

new-backend/Dockerfile (1)
12-24: Consider cache configuration consistency.

The cache directory is set up but the `--no-cache` flag is used during installation. This might be redundant or conflicting.

Consider either using the cache or removing the cache directory setup:

```diff
 # Option 1: Use cache
-RUN uv sync --locked --no-cache
+RUN uv sync --locked

 # Option 2: Remove cache directory if not using it
-ENV UV_CACHE_DIR=/app/.uv-cache
-RUN mkdir -p /app/.uv-cache && \
-    adduser --disabled-password --gecos "" appuser && \
-    chown -R appuser:appuser /app
+RUN adduser --disabled-password --gecos "" appuser && \
+    chown -R appuser:appuser /app
```

new-backend/app/modules/langgraph_nodes/fact_check.py (1)
14-14: Fix spelling errors in error messages.

There are typos in the error logging statements: "occured" should be "occurred".
Apply this diff to fix the spelling:
- print(f"some error occured in fact_checking:{error_message}") + print(f"some error occurred in fact_checking:{error_message}")- print(f"some error occured in fact_checking:{e}") + print(f"some error occurred in fact_checking:{e}")Also applies to: 22-22
new-backend/app/utils/fact_check_utils.py (2)
46-47: Consider explicit error handling for the final verification step.

The function returns a tuple where the second element can be `None` on success. Consider making the return type more explicit or handle potential failures in the verification step.

```diff
-    final = run_fact_verifier_sdk(search_results)
-    return final.get("verifications", []), None
+    final = run_fact_verifier_sdk(search_results)
+    if final.get("status") != "success":
+        return [], "Fact verification failed."
+    return final.get("verifications", []), None
```

40-40: Consider making the rate limiting delay configurable.

The hardcoded 5-second delay works for avoiding rate limits but could be made configurable for different environments or API providers.

```diff
-        time.sleep(5)  # ⏱️ Gentle delay to avoid DuckDuckGo ratelimit
+        time.sleep(5)  # ⏱️ Gentle delay to avoid SerpAPI ratelimit
```

Note: The comment mentions DuckDuckGo but the code uses SerpAPI.
new-backend/app/modules/langgraph_nodes/judge.py (1)
6-10: Consider increasing max_tokens for more reliable scoring.

The `max_tokens=10` limit might be too restrictive for the LLM to provide consistent scoring responses, especially if the model occasionally includes explanatory text before the score.

```diff
 groq_llm = ChatGroq(
     model="gemma2-9b-it",
     temperature=0.0,
-    max_tokens=10,
+    max_tokens=50,
 )
```

new-backend/app/modules/vector_store/chunk_rag_data.py (1)
4-4: Add type hints for better code documentation.

The function lacks type hints which would improve code maintainability and IDE support. Consider adding them based on the expected input/output types.

```diff
+from typing import List, Dict, Any, Union
+
-def chunk_rag_data(data):
+def chunk_rag_data(data: Dict[str, Any]) -> List[Dict[str, Any]]:
```

new-backend/app/modules/langgraph_nodes/generate_perspective.py (1)
14-19: Consider using environment variables for model configuration.

The model name and temperature are hardcoded. For better flexibility across different environments, consider loading these from environment variables.

```diff
+import os
+
-my_llm = "llama-3.3-70b-versatile"
+my_llm = os.getenv("GROQ_MODEL_NAME", "llama-3.3-70b-versatile")
 llm = ChatGroq(
     model=my_llm,
-    temperature=0.7
+    temperature=float(os.getenv("GROQ_TEMPERATURE", "0.7"))
 )
```

new-backend/app/modules/facts_check/llm_processing.py (1)
22-28: Fix spacing in prompt content.

There's a missing space in the prompt text that could affect the LLM's understanding.

```diff
                 "content": (
                     "You are an assistant that extracts "
                     "verifiable factual claims from articles. "
-                    "Each claim must be short, fact-based, and"
-                    " independently verifiable through internet search. "
+                    "Each claim must be short, fact-based, and "
+                    "independently verifiable through internet search. "
                     "Only return a list of 3 clear bullet-point claims."
                 ),
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- `frontend/package-lock.json` is excluded by `!**/package-lock.json`
- `new-backend/uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (39)
- `backend/app/main.py` (0 hunks)
- `backend/app/prompts/opposite_perspective.py` (0 hunks)
- `backend/app/prompts/related_topics.py` (0 hunks)
- `backend/app/routes.py` (0 hunks)
- `backend/app/scrapers/article_scraper.py` (0 hunks)
- `backend/app/scrapers/clean_data.py` (0 hunks)
- `backend/app/services/ai_service.py` (0 hunks)
- `backend/app/services/analysis_service.py` (0 hunks)
- `backend/app/services/counter_service.py` (0 hunks)
- `backend/app/services/related_topics.py` (0 hunks)
- `backend/app/services/summarization_service.py` (0 hunks)
- `backend/app/test_perspective.py` (0 hunks)
- `backend/requirements.txt` (0 hunks)
- `frontend/app/analyze/loading/page.tsx` (2 hunks)
- `frontend/app/analyze/results/page.tsx` (3 hunks)
- `frontend/components/bias-meter.tsx` (3 hunks)
- `frontend/package.json` (2 hunks)
- `new-backend/.dockerignore` (1 hunks)
- `new-backend/Dockerfile` (1 hunks)
- `new-backend/README.md` (1 hunks)
- `new-backend/app/db/vector_store.py` (1 hunks)
- `new-backend/app/modules/facts_check/llm_processing.py` (1 hunks)
- `new-backend/app/modules/facts_check/web_search.py` (1 hunks)
- `new-backend/app/modules/langgraph_builder.py` (5 hunks)
- `new-backend/app/modules/langgraph_nodes/fact_check.py` (2 hunks)
- `new-backend/app/modules/langgraph_nodes/generate_perspective.py` (2 hunks)
- `new-backend/app/modules/langgraph_nodes/judge.py` (1 hunks)
- `new-backend/app/modules/langgraph_nodes/sentiment.py` (1 hunks)
- `new-backend/app/modules/langgraph_nodes/store_and_send.py` (1 hunks)
- `new-backend/app/modules/scraper/cleaner.py` (1 hunks)
- `new-backend/app/modules/vector_store/chunk_rag_data.py` (1 hunks)
- `new-backend/app/modules/vector_store/embed.py` (1 hunks)
- `new-backend/app/utils/fact_check_utils.py` (1 hunks)
- `new-backend/app/utils/generate_chunk_id.py` (1 hunks)
- `new-backend/app/utils/prompt_templates.py` (1 hunks)
- `new-backend/app/utils/store_vectors.py` (1 hunks)
- `new-backend/main.py` (1 hunks)
- `new-backend/pyproject.toml` (1 hunks)
- `new-backend/start.sh` (1 hunks)
💤 Files with no reviewable changes (13)
- backend/requirements.txt
- backend/app/main.py
- backend/app/prompts/related_topics.py
- backend/app/prompts/opposite_perspective.py
- backend/app/scrapers/clean_data.py
- backend/app/services/analysis_service.py
- backend/app/services/summarization_service.py
- backend/app/scrapers/article_scraper.py
- backend/app/services/related_topics.py
- backend/app/test_perspective.py
- backend/app/services/counter_service.py
- backend/app/services/ai_service.py
- backend/app/routes.py
🧰 Additional context used
🧬 Code Graph Analysis (4)
new-backend/app/modules/langgraph_nodes/fact_check.py (1)
new-backend/app/utils/fact_check_utils.py (1)
run_fact_check_pipeline(10-47)
new-backend/app/modules/langgraph_nodes/store_and_send.py (3)
new-backend/app/modules/vector_store/chunk_rag_data.py (1)
chunk_rag_data(4-73)new-backend/app/modules/vector_store/embed.py (1)
embed_chunks(7-30)new-backend/app/utils/store_vectors.py (1)
store(10-32)
new-backend/app/modules/vector_store/chunk_rag_data.py (1)
new-backend/app/utils/generate_chunk_id.py (1)
generate_id(4-8)
new-backend/app/modules/langgraph_builder.py (2)
new-backend/app/modules/langgraph_nodes/sentiment.py (1)
run_sentiment_sdk(10-53)new-backend/app/modules/langgraph_nodes/error_handler.py (1)
error_handler(3-11)
🪛 Ruff (0.11.9)
new-backend/app/utils/store_vectors.py
32-32: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
new-backend/app/modules/langgraph_nodes/store_and_send.py
13-13: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
15-15: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
21-21: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
new-backend/app/db/vector_store.py
14-14: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
40-41: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🔇 Additional comments (22)
frontend/package.json (1)
41-41: Axios dependency is up to date and secure

Version `^1.10.0` is the latest stable release (June 14, 2025) and has no known security advisories. No further action is needed.

frontend/components/bias-meter.tsx (1)

1-79: LGTM! Excellent formatting improvements.

The addition of semicolons and improved JSX formatting enhances code readability and aligns with TypeScript best practices.
new-backend/README.md (1)
1-10: LGTM! Proper Hugging Face Spaces configuration.

The YAML front matter is correctly configured for Hugging Face Spaces deployment with Docker SDK, which aligns with the PR objectives for backend deployment.
new-backend/app/modules/langgraph_nodes/sentiment.py (2)
35-35: The reduced token limit is appropriate for sentiment analysis.

Reducing max_tokens to 3 makes sense since the expected output is a single word (positive/negative/neutral), which helps ensure concise responses and reduces API costs.

39-39: Good practice to normalize sentiment output.

Converting sentiment to lowercase ensures consistent output format for downstream processing.
new-backend/app/utils/generate_chunk_id.py (1)
5-6: LGTM: Good input validation.

The input validation properly checks for both empty strings and correct type, which prevents common errors.
new-backend/app/utils/prompt_templates.py (1)
21-31: Good JSON structure specification.

The JSON format specification is clear and will help ensure consistent output parsing. The reasoning steps format encourages structured thinking.
new-backend/app/utils/store_vectors.py (1)
10-28: Well-structured function with good validation and logging.

The function properly validates input, handles the Pinecone upsert operation, and provides informative logging. The structure and error handling approach are solid.
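For illustration, here is a minimal hedged sketch of a validated Pinecone upsert helper; the function and field names are assumptions and may not match the repository's actual `store()` implementation.

```python
# Hypothetical sketch of a validated upsert helper (not the repo's exact code).
def store(index, vectors: list[dict]) -> int:
    if not vectors:
        raise ValueError("No vectors provided for upsert")
    for i, v in enumerate(vectors):
        if "id" not in v or "values" not in v:
            raise ValueError(f"Vector at index {i} is missing 'id' or 'values'")
    index.upsert(vectors=vectors)  # Pinecone Index.upsert accepts a list of dicts
    print(f"Upserted {len(vectors)} vectors")  # informative logging
    return len(vectors)
```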
new-backend/Dockerfile (1)
1-31: Good security practices with non-root user and proper structure.

The Dockerfile follows security best practices by using a non-root user and properly sets up the working directory. The port configuration for Hugging Face deployment is appropriate.
new-backend/app/modules/langgraph_nodes/fact_check.py (1)
11-20: Well-integrated pipeline with proper error handling.

The integration with the fact-checking pipeline is clean and maintains proper error handling. The function correctly handles both pipeline errors and exceptions while preserving the state structure.
new-backend/app/db/vector_store.py (2)
17-34: Well-designed index management with proper constants.

The index creation logic is sound with appropriate constants and conditional creation. The serverless specification for AWS US East 1 is properly configured.

5-7: Good practice for environment variable validation.

Proper validation of required environment variables with clear error messages.
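For reference, a condensed sketch of the create-if-missing pattern described above, using the Pinecone v3+ client; the index name, dimension, and metric here are illustrative assumptions rather than the repository's actual constants.

```python
import os
from pinecone import Pinecone, ServerlessSpec

INDEX_NAME = "articles"   # assumed name
DIMENSION = 384           # matches all-MiniLM-L6-v2 embeddings

api_key = os.getenv("PINECONE_API_KEY")
if not api_key:
    raise ValueError("PINECONE_API_KEY not set in environment")

pc = Pinecone(api_key=api_key)
if INDEX_NAME not in pc.list_indexes().names():
    pc.create_index(
        name=INDEX_NAME,
        dimension=DIMENSION,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
index = pc.Index(INDEX_NAME)
```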
new-backend/app/modules/vector_store/embed.py (3)
13-18: Excellent input validation with clear error messages.

The validation logic properly checks chunk structure and provides detailed error messages including the problematic index, which aids in debugging.

20-30: Well-structured embedding and vector creation process.

The function efficiently processes text embeddings and creates properly formatted vectors for Pinecone storage. The data structure aligns well with the expected format.

4-4: Appropriate model choice for general text embeddings.

The "all-MiniLM-L6-v2" model is a good choice for general text embeddings, providing a good balance between performance and accuracy. The 384-dimensional output aligns with the vector store configuration.
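To make the batch-embedding flow concrete, here is a minimal sketch assuming the `sentence-transformers` package; the chunk and vector field names are assumptions, not the module's exact schema.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # produces 384-dimensional embeddings

def embed_chunks(chunks: list[dict]) -> list[dict]:
    if not chunks:
        return []
    texts = [c["text"] for c in chunks]  # assumed to be validated upstream
    embeddings = model.encode(texts)     # one batched pass over all texts
    return [
        {"id": c["id"], "values": emb.tolist(), "metadata": c.get("metadata", {})}
        for c, emb in zip(chunks, embeddings)
    ]
```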
new-backend/app/modules/langgraph_nodes/judge.py (1)
29-44: Excellent robust response parsing and score validation.

The implementation handles multiple response formats gracefully and includes proper bounds checking for the extracted score. The regex pattern effectively extracts integer values from the response.
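The general pattern being praised looks roughly like the following sketch (regex extraction plus clamping); the function name and score range are illustrative assumptions, not the judge node's exact code.

```python
import re

def parse_score(response_text: str, low: int = 0, high: int = 10) -> int:
    match = re.search(r"\d+", response_text)  # first integer anywhere in the reply
    if not match:
        raise ValueError(f"No score found in response: {response_text!r}")
    score = int(match.group())
    return max(low, min(high, score))         # clamp to the allowed range

# e.g. parse_score("Score: 8 out of 10") -> 8
```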
new-backend/app/modules/langgraph_builder.py (2)
14-22: Excellent addition of typed state management.

The `MyState` TypedDict provides clear type definitions for all state variables, improving code maintainability and IDE support. The type annotations are comprehensive and match the expected data flow.
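For readers unfamiliar with the pattern, a typed LangGraph state is simply a `TypedDict`; the field names below are hypothetical and may not match `MyState` exactly.

```python
from typing import Any, Dict, List, TypedDict

class MyState(TypedDict, total=False):
    url: str
    article: str
    sentiment: str
    facts: List[Dict[str, Any]]
    perspective: Dict[str, Any]
    status: str
    error_from: str
    message: str
```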
56-102: Verify LangGraph dependency for conditional edges and terminal marker

We couldn't locate `StateGraph` or its `add_conditional_edges` implementation in the repo, nor import `langgraph` in the sandbox. Please confirm that your installed LangGraph version's `StateGraph` API supports:

- The `add_conditional_edges(source: str, condition: Callable)` method
- The `"__end__"` terminal marker

Typical checks:

```bash
pip show langgraph
python - <<EOF
import inspect
from langgraph.graph import StateGraph
print(inspect.signature(StateGraph.add_conditional_edges))
EOF
```

If unsupported, either revert to `set_conditional_edges` or bump your LangGraph dependency accordingly.

[new-backend/app/modules/langgraph_builder.py:56–102]

new-backend/app/modules/vector_store/chunk_rag_data.py (2)
6-32: Excellent comprehensive validation of input data.

The validation logic properly checks for required fields, validates data types, and handles both dictionary and object-based perspective data. The safety checks for perspective object attributes are particularly well-implemented.

44-68: Robust fact validation and chunk generation.

The implementation properly validates all required fact fields and generates well-structured chunks with comprehensive metadata. The enumeration approach for fact indexing is clean and maintainable.
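A rough sketch of that fact-chunking pattern follows; the field names, metadata keys, and helper signature are assumptions for illustration and may differ from `chunk_rag_data()`.

```python
import hashlib

def generate_id(text: str, prefix: str = "article") -> str:
    # SHA-256 of the text, truncated, as a stable chunk ID
    return f"{prefix}-{hashlib.sha256(text.encode('utf-8')).hexdigest()[:15]}"

def chunk_facts(facts: list[dict], article_url: str) -> list[dict]:
    chunks = []
    for i, fact in enumerate(facts):
        required = ("original_claim", "verdict", "explanation")
        if not all(k in fact for k in required):
            raise ValueError(f"Fact at index {i} is missing required fields")
        text = f"{fact['original_claim']} ({fact['verdict']}): {fact['explanation']}"
        chunks.append({
            "id": generate_id(text),
            "text": text,
            "metadata": {"type": "fact", "index": i, "source_url": article_url},
        })
    return chunks
```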
new-backend/app/modules/langgraph_nodes/generate_perspective.py (1)
35-38: Good input validation!

The validation of required state fields is well implemented with clear error messages.
new-backend/app/modules/facts_check/llm_processing.py (1)
108-110: Good markdown stripping implementation!

The regex pattern correctly handles both opening and closing markdown code blocks.
```tsx
const storedData = sessionStorage.getItem("analysisResult")
if (storedData) {
  setAnalysisData(JSON.parse(storedData))
} else {
  // fallback if user visits results page directly
  // maybe redirect or show error
  console.warn("No analysis result found")
}
```
🛠️ Refactor suggestion
Add error handling for sessionStorage parsing.
The JSON.parse operation could throw an error if the stored data is invalid.
```diff
 const storedData = sessionStorage.getItem("analysisResult")
if (storedData) {
- setAnalysisData(JSON.parse(storedData))
+ try {
+ setAnalysisData(JSON.parse(storedData))
+ } catch (error) {
+ console.error("Failed to parse analysis data:", error)
+ // Consider redirecting to analyze page or showing error message
+ }
} else {
// fallback if user visits results page directly
// maybe redirect or show error
console.warn("No analysis result found")
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```tsx
const storedData = sessionStorage.getItem("analysisResult")
if (storedData) {
  try {
    setAnalysisData(JSON.parse(storedData))
  } catch (error) {
    console.error("Failed to parse analysis data:", error)
    // Consider redirecting to analyze page or showing an error message
  }
} else {
  // fallback if user visits results page directly
  // maybe redirect or show error
  console.warn("No analysis result found")
}
```
🤖 Prompt for AI Agents
In frontend/app/analyze/results/page.tsx around lines 42 to 49, the JSON.parse
call on sessionStorage data can throw an error if the stored data is invalid.
Wrap the JSON.parse call in a try-catch block to handle any parsing errors
gracefully. In the catch block, log the error and handle the failure case, such
as clearing the invalid data or showing an error message to the user.
| <div className="p-4"> | ||
| <h1 className="text-2xl font-bold mb-4">Analysis Results</h1> | ||
| <pre className="bg-black p-4 rounded text-sm overflow-x-auto whitespace-pre-wrap"> | ||
| {JSON.stringify(analysisData, null, 2)} | ||
| </pre> | ||
| </div> |
🛠️ Refactor suggestion
Improve JSON display styling and user experience.
The hardcoded black background doesn't respect theme preferences and raw JSON display might not be user-friendly.
-<div className="p-4">
- <h1 className="text-2xl font-bold mb-4">Analysis Results</h1>
- <pre className="bg-black p-4 rounded text-sm overflow-x-auto whitespace-pre-wrap">
- {JSON.stringify(analysisData, null, 2)}
- </pre>
-</div>
+{analysisData && (
+ <div className="p-4">
+ <h1 className="text-2xl font-bold mb-4">Analysis Results</h1>
+ <pre className="bg-muted p-4 rounded text-sm overflow-x-auto whitespace-pre-wrap border">
+ {JSON.stringify(analysisData, null, 2)}
+ </pre>
+ </div>
+)}
```

Consider replacing the raw JSON display with a structured, user-friendly presentation of the analysis results.
🤖 Prompt for AI Agents
In frontend/app/analyze/results/page.tsx around lines 78 to 83, the JSON display
uses a hardcoded black background and raw JSON stringifying, which ignores theme
preferences and is not user-friendly. Replace the raw JSON <pre> block with a
structured, styled component that formats the analysisData into readable
sections or tables, and use theme-aware styling instead of a fixed black
background to improve user experience and accessibility.
```tsx
const res = await axios.post("https://Thunder1245-perspective1.hf.space/api/process", {
  url: storedUrl,
})
```
Extract hardcoded API URL to environment configuration.
The hardcoded API endpoint should be configurable and not embedded in the code.
```diff
-const res = await axios.post("https://Thunder1245-perspective1.hf.space/api/process", {
+const res = await axios.post(process.env.NEXT_PUBLIC_API_URL || "https://Thunder1245-perspective1.hf.space/api/process", {
   url: storedUrl,
 })
```

Add the API URL to your environment variables in .env.local:

```
NEXT_PUBLIC_API_URL=https://Thunder1245-perspective1.hf.space/api/process
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```tsx
const res = await axios.post(
  process.env.NEXT_PUBLIC_API_URL || "https://Thunder1245-perspective1.hf.space/api/process",
  {
    url: storedUrl,
  }
)
```
🤖 Prompt for AI Agents
In frontend/app/analyze/loading/page.tsx around lines 64 to 66, the API URL is
hardcoded in the axios.post call. To fix this, move the URL to an environment
variable by adding
NEXT_PUBLIC_API_URL=https://Thunder1245-perspective1.hf.space/api/process in
.env.local, then replace the hardcoded string with
process.env.NEXT_PUBLIC_API_URL in the axios.post call to make the endpoint
configurable.
```tsx
try {
  const res = await axios.post("https://Thunder1245-perspective1.hf.space/api/process", {
    url: storedUrl,
  })

  // Save response to sessionStorage
  sessionStorage.setItem("analysisResult", JSON.stringify(res.data))

  // optional logging
  console.log("Analysis result saved")
  console.log(res)
} catch (err) {
  console.error("Failed to process article:", err)
  router.push("/analyze") // fallback in case of error
  return
}
```
🛠️ Refactor suggestion
Add user feedback during API processing.
The API call happens silently, leaving users unaware of the actual processing status.
Consider adding a loading state and user feedback:
```diff
 const [currentStep, setCurrentStep] = useState(0)
const [progress, setProgress] = useState(0)
const [articleUrl, setArticleUrl] = useState("")
+const [isProcessing, setIsProcessing] = useState(false)
+const [apiError, setApiError] = useState<string | null>(null)
const router = useRouter()
// In the runAnalysis function:
try {
+ setIsProcessing(true)
const res = await axios.post(process.env.NEXT_PUBLIC_API_URL, {
url: storedUrl,
})
+ setIsProcessing(false)
// Save response to sessionStorage
sessionStorage.setItem("analysisResult", JSON.stringify(res.data))
} catch (err) {
+ setIsProcessing(false)
+ setApiError("Failed to process article. Please try again.")
console.error("Failed to process article:", err)
 }
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In frontend/app/analyze/loading/page.tsx around lines 63 to 78, the API call to
process the article happens without any user feedback, leaving users unaware of
the processing status. Introduce a loading state variable to track when the API
call is in progress. Before the try block, set the loading state to true, and in
both the success and catch blocks, set it back to false. Use this loading state
to conditionally render a loading indicator or message in the UI to inform users
that processing is underway.
```tsx
    const runAnalysis = async () => {
      const storedUrl = sessionStorage.getItem("articleUrl")
      if (storedUrl) {
        setArticleUrl(storedUrl)

        try {
          const res = await axios.post("https://Thunder1245-perspective1.hf.space/api/process", {
            url: storedUrl,
          })

          // Save response to sessionStorage
          sessionStorage.setItem("analysisResult", JSON.stringify(res.data))

          // optional logging
          console.log("Analysis result saved")
          console.log(res)
        } catch (err) {
          console.error("Failed to process article:", err)
          router.push("/analyze") // fallback in case of error
          return
        }

        // Progress and step simulation
        const stepInterval = setInterval(() => {
          setCurrentStep((prev) => {
            if (prev < steps.length - 1) {
              return prev + 1
            } else {
              clearInterval(stepInterval)
              setTimeout(() => {
                router.push("/analyze/results")
              }, 2000)
              return prev
            }
          })
        }, 2000)

        const progressInterval = setInterval(() => {
          setProgress((prev) => {
            if (prev < 100) {
              return prev + 1
            }
            return prev
          })
        }, 100)

        return () => {
          clearInterval(stepInterval)
          clearInterval(progressInterval)
        }
      } else {
        // Redirect back if no URL found
        router.push("/analyze")
        return
      }
    }

    runAnalysis()
  }, [router])
```
Fix memory leak and improve async handling.
The current implementation has several issues that could cause memory leaks and race conditions.
```diff
 useEffect(() => {
- const runAnalysis = async () => {
+ let mounted = true;
+ let stepInterval: NodeJS.Timeout;
+ let progressInterval: NodeJS.Timeout;
+
+ const runAnalysis = async () => {
const storedUrl = sessionStorage.getItem("articleUrl")
if (storedUrl) {
+ if (!mounted) return;
setArticleUrl(storedUrl)
try {
const res = await axios.post(process.env.NEXT_PUBLIC_API_URL || "https://Thunder1245-perspective1.hf.space/api/process", {
url: storedUrl,
})
+ if (!mounted) return;
// Save response to sessionStorage
sessionStorage.setItem("analysisResult", JSON.stringify(res.data))
// optional logging
console.log("Analysis result saved")
console.log(res)
} catch (err) {
console.error("Failed to process article:", err)
+ if (mounted) {
router.push("/analyze") // fallback in case of error
+ }
return
}
// Progress and step simulation
- const stepInterval = setInterval(() => {
+ stepInterval = setInterval(() => {
setCurrentStep((prev) => {
if (prev < steps.length - 1) {
return prev + 1
} else {
clearInterval(stepInterval)
setTimeout(() => {
+ if (mounted) {
router.push("/analyze/results")
+ }
}, 2000)
return prev
}
})
}, 2000)
- const progressInterval = setInterval(() => {
+ progressInterval = setInterval(() => {
setProgress((prev) => {
if (prev < 100) {
return prev + 1
}
return prev
})
}, 100)
- return () => {
- clearInterval(stepInterval)
- clearInterval(progressInterval)
- }
} else {
+ if (mounted) {
router.push("/analyze")
+ }
}
}
runAnalysis()
+
+ return () => {
+ mounted = false;
+ if (stepInterval) clearInterval(stepInterval);
+ if (progressInterval) clearInterval(progressInterval);
+ }
 }, [router])
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```tsx
  useEffect(() => {
    let mounted = true
    let stepInterval: NodeJS.Timeout
    let progressInterval: NodeJS.Timeout

    const runAnalysis = async () => {
      const storedUrl = sessionStorage.getItem("articleUrl")
      if (storedUrl) {
        if (!mounted) return
        setArticleUrl(storedUrl)

        try {
          const res = await axios.post(
            process.env.NEXT_PUBLIC_API_URL || "https://Thunder1245-perspective1.hf.space/api/process",
            { url: storedUrl },
          )
          if (!mounted) return

          // Save response to sessionStorage
          sessionStorage.setItem("analysisResult", JSON.stringify(res.data))

          // optional logging
          console.log("Analysis result saved")
          console.log(res)
        } catch (err) {
          console.error("Failed to process article:", err)
          if (mounted) {
            router.push("/analyze") // fallback in case of error
          }
          return
        }

        // Progress and step simulation
        stepInterval = setInterval(() => {
          setCurrentStep((prev) => {
            if (prev < steps.length - 1) {
              return prev + 1
            } else {
              clearInterval(stepInterval)
              setTimeout(() => {
                if (mounted) {
                  router.push("/analyze/results")
                }
              }, 2000)
              return prev
            }
          })
        }, 2000)

        progressInterval = setInterval(() => {
          setProgress((prev) => {
            if (prev < 100) {
              return prev + 1
            }
            return prev
          })
        }, 100)
      } else {
        if (mounted) {
          router.push("/analyze")
        }
      }
    }

    runAnalysis()

    return () => {
      mounted = false
      if (stepInterval) clearInterval(stepInterval)
      if (progressInterval) clearInterval(progressInterval)
    }
  }, [router])
```
🤖 Prompt for AI Agents
In frontend/app/analyze/loading/page.tsx around lines 58 to 114, the runAnalysis
function sets up intervals inside an async function without proper cleanup,
causing potential memory leaks and race conditions. Refactor by moving the
interval setup outside the async function or use a React effect cleanup function
to clear intervals when the component unmounts or dependencies change. Also,
ensure async calls are properly awaited and handle component unmount scenarios
to avoid state updates on unmounted components.
```python
    try:
        chunks = chunk_rag_data(state)
    except KeyError as e:
        raise Exception(f"Missing required data field for chunking: {e}")
    except Exception as e:
        raise Exception(f"Failed to chunk data: {e}")

    try:
        vectors = embed_chunks(chunks)
        if vectors:
            print("embedding generated successfully!")
    except Exception as e:
        raise Exception(f"failed to embed chunks: {e}")
```
🛠️ Refactor suggestion
Improve exception handling to follow Python best practices.
The exception handling should use raise ... from err or raise ... from None to properly chain exceptions and provide better debugging information.
```diff
 try:
chunks = chunk_rag_data(state)
except KeyError as e:
- raise Exception(f"Missing required data field for chunking: {e}")
+ raise Exception(f"Missing required data field for chunking: {e}") from e
except Exception as e:
- raise Exception(f"Failed to chunk data: {e}")
+ raise Exception(f"Failed to chunk data: {e}") from e
try:
vectors = embed_chunks(chunks)
if vectors:
print("embedding generated successfully!")
except Exception as e:
- raise Exception(f"failed to embed chunks: {e}")
+ raise Exception(f"failed to embed chunks: {e}") from e📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
    try:
        chunks = chunk_rag_data(state)
    except KeyError as e:
        raise Exception(f"Missing required data field for chunking: {e}") from e
    except Exception as e:
        raise Exception(f"Failed to chunk data: {e}") from e

    try:
        vectors = embed_chunks(chunks)
        if vectors:
            print("embedding generated successfully!")
    except Exception as e:
        raise Exception(f"failed to embed chunks: {e}") from e
```
🧰 Tools
🪛 Ruff (0.11.9)
13-13: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
15-15: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
21-21: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/store_and_send.py around lines 10 to
22, the exception handling currently raises new exceptions without chaining the
original ones, which loses traceback context. Update the raise statements to use
"raise Exception(...) from e" to properly chain exceptions and preserve
debugging information.
```python
    facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
                           "Verdict: {f['verdict']}\nExplanation: "
                           "{f['explanation']}" for f in state["facts"]])
```
Fix f-string formatting bug.
The f-string formatting is broken. Only the first line has the f prefix, causing the variables in subsequent lines to not be interpolated.
```diff
-    facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
-                           "Verdict: {f['verdict']}\nExplanation: "
-                           "{f['explanation']}" for f in state["facts"]])
+    facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
+                           f"Verdict: {f['verdict']}\nExplanation: "
+                           f"{f['explanation']}" for f in state["facts"]])
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
    facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
                           f"Verdict: {f['verdict']}\nExplanation: "
                           f"{f['explanation']}" for f in state["facts"]])
```
🤖 Prompt for AI Agents
In new-backend/app/modules/langgraph_nodes/generate_perspective.py around lines
40 to 42, the f-string formatting is incorrect because only the first line has
the f prefix, so variables in the following lines are not interpolated. Fix this
by adding the f prefix to each line inside the join list comprehension to ensure
all variables are properly formatted.
```python
client = Groq(api_key=os.getenv("GROQ_API_KEY"))
```
🛠️ Refactor suggestion
Validate API key existence.
The API key is loaded without validation. Consider checking if it exists to provide better error messages.
```diff
-client = Groq(api_key=os.getenv("GROQ_API_KEY"))
+api_key = os.getenv("GROQ_API_KEY")
+if not api_key:
+    raise ValueError("GROQ_API_KEY environment variable is not set")
+client = Groq(api_key=api_key)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
api_key = os.getenv("GROQ_API_KEY")
if not api_key:
    raise ValueError("GROQ_API_KEY environment variable is not set")
client = Groq(api_key=api_key)
```
🤖 Prompt for AI Agents
In new-backend/app/modules/facts_check/llm_processing.py around lines 9 to 10,
the API key for Groq client is loaded without checking if it exists. Add a
validation step after retrieving the API key from the environment variable to
verify it is not None or empty. If the key is missing, raise a clear and
descriptive error or log a message indicating the missing API key before
initializing the Groq client.
| "You are a fact-checking assistant. " | ||
| "Your job is to determine whether the given" | ||
| " claim is True, False" | ||
| "based on the provided web search evidence." | ||
| " Keep it concise and structured." |
Fix incomplete prompt text.
The prompt has spacing issues and an incomplete sentence about claim determination.
"content": (
"You are a fact-checking assistant. "
- "Your job is to determine whether the given"
- " claim is True, False"
+ "Your job is to determine whether the given "
+ "claim is True or False "
"based on the provided web search evidence."
" Keep it concise and structured."
),📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "You are a fact-checking assistant. " | |
| "Your job is to determine whether the given" | |
| " claim is True, False" | |
| "based on the provided web search evidence." | |
| " Keep it concise and structured." | |
| "content": ( | |
| "You are a fact-checking assistant. " | |
| "Your job is to determine whether the given " | |
| "claim is True or False " | |
| "based on the provided web search evidence." | |
| " Keep it concise and structured." | |
| ), |
🤖 Prompt for AI Agents
In new-backend/app/modules/facts_check/llm_processing.py around lines 78 to 82,
the prompt string is missing spaces between concatenated parts and ends with an
incomplete sentence about claim determination. Fix this by adding necessary
spaces between the string segments and completing the sentence so it clearly
instructs the assistant to determine if the claim is True or False based on the
evidence, ensuring the prompt reads as a coherent, complete instruction.
```python
            try:
                parsed = json.loads(content)
            except Exception as parse_err:
                print(f"❌ LLM JSON parse error: {parse_err}")

            results_list.append(parsed)
```
Critical: Handle JSON parse errors properly.
The code continues to use parsed even if JSON parsing fails, which will cause an UnboundLocalError.
```diff
 # Try parsing the JSON response
try:
parsed = json.loads(content)
+ results_list.append(parsed)
except Exception as parse_err:
print(f"❌ LLM JSON parse error: {parse_err}")
-
- results_list.append(parsed)
+ # Skip this result or add a default error result
+ results_list.append({
+ "verdict": "Error",
+ "explanation": f"Failed to parse LLM response: {parse_err}",
+ "original_claim": claim,
+ "source_link": source
+            })
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
            # Try parsing the JSON response
            try:
                parsed = json.loads(content)
                results_list.append(parsed)
            except Exception as parse_err:
                print(f"❌ LLM JSON parse error: {parse_err}")
                # Skip this result or add a default error result
                results_list.append({
                    "verdict": "Error",
                    "explanation": f"Failed to parse LLM response: {parse_err}",
                    "original_claim": claim,
                    "source_link": source
                })
```
🤖 Prompt for AI Agents
In new-backend/app/modules/facts_check/llm_processing.py around lines 113 to
119, the variable 'parsed' is used after a JSON parsing attempt without ensuring
it was successfully assigned, leading to a potential UnboundLocalError if
parsing fails. To fix this, modify the code to handle the exception properly by
either initializing 'parsed' to a safe default value before the try block or by
skipping appending to results_list when parsing fails, ensuring 'parsed' is only
used if JSON parsing succeeds.
Actionable comments posted: 15
🧹 Nitpick comments (8)
backend/app/modules/scraper/cleaner.py (1)
2-2: Remove unused NLTK import or add justification comment.

The `nltk` library is imported but not used anywhere in the current implementation of `clean_extracted_text`. This adds unnecessary overhead to the module.

If NLTK functionality is planned for future use, consider adding a comment explaining this. Otherwise, remove the import:

```diff
-import nltk
```

backend/start.sh (1)
1-2: Prefer `set -euo pipefail` for safer Bash scripting

Adding `u` and `o pipefail` prevents silent failures from unset vars and broken pipelines.

```diff
-set -e
+set -euo pipefail
```

backend/README.md (3)

80-80: Fix comment typo: .loc → .lock

```diff
-├── uv.lock                # .loc file like package-lock.json
+├── uv.lock                # .lock file similar to npm's package-lock.json
```

59-61: Specify a language for the fenced code block to appease markdownlint

````diff
-```
-http://localhost:8000/api/
-```
+```text
+http://localhost:8000/api/
+```
````

24-24: Minor grammar: add "the"

```diff
-### 1. Clone the repo & jump into backend folder
+### 1. Clone the repo & jump into the backend folder
```

backend/app/utils/generate_chunk_id.py (1)
4-8: Consider increasing hash length to reduce collision risk.

Using only 15 characters of SHA-256 provides ~60 bits of entropy, which may lead to collisions at scale. Consider increasing the length or using the full hash.

```diff
-    return f"article-{hashed_text[:15]}"
+    return f"article-{hashed_text[:32]}"  # 128 bits of entropy
```

Additionally, consider making the prefix configurable for better reusability:

```diff
-def generate_id(text: str) -> str:
+def generate_id(text: str, prefix: str = "article") -> str:
     if not text or not isinstance(text, str):
         raise ValueError("Text must be non-empty string")
     hashed_text = hashlib.sha256(text.encode("utf-8")).hexdigest()
-    return f"article-{hashed_text[:15]}"
+    return f"{prefix}-{hashed_text[:32]}"
```

.github/workflows/deploy-backend-to-hf.yml (1)
34-37: Remove unused rsync installation.

The workflow installs rsync but doesn't use it in the subsequent steps. The file synchronization is handled through git operations instead.
Remove the unused rsync installation:
```diff
-      - name: 📦 Install rsync
-        run: |
-          sudo apt-get update
-          sudo apt-get install -y rsync
```

backend/app/modules/facts_check/llm_processing.py (1)
110-110: Replace print statement with proper logging.

Debug print statements should use a proper logging framework instead of print().

```diff
-    print(content)
+    # Consider using logging.debug(content) instead
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- `backend/uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (24)
- `.github/workflows/deploy-backend-to-hf.yml` (1 hunks)
- `.gitignore` (1 hunks)
- `backend/.dockerignore` (1 hunks)
- `backend/Dockerfile` (1 hunks)
- `backend/README.md` (1 hunks)
- `backend/app/db/vector_store.py` (1 hunks)
- `backend/app/modules/facts_check/llm_processing.py` (1 hunks)
- `backend/app/modules/facts_check/web_search.py` (1 hunks)
- `backend/app/modules/langgraph_builder.py` (5 hunks)
- `backend/app/modules/langgraph_nodes/fact_check.py` (2 hunks)
- `backend/app/modules/langgraph_nodes/generate_perspective.py` (1 hunks)
- `backend/app/modules/langgraph_nodes/judge.py` (1 hunks)
- `backend/app/modules/langgraph_nodes/sentiment.py` (1 hunks)
- `backend/app/modules/langgraph_nodes/store_and_send.py` (1 hunks)
- `backend/app/modules/scraper/cleaner.py` (1 hunks)
- `backend/app/modules/vector_store/chunk_rag_data.py` (1 hunks)
- `backend/app/modules/vector_store/embed.py` (1 hunks)
- `backend/app/utils/fact_check_utils.py` (1 hunks)
- `backend/app/utils/generate_chunk_id.py` (1 hunks)
- `backend/app/utils/prompt_templates.py` (1 hunks)
- `backend/app/utils/store_vectors.py` (1 hunks)
- `backend/main.py` (1 hunks)
- `backend/pyproject.toml` (1 hunks)
- `backend/start.sh` (1 hunks)
✅ Files skipped from review due to trivial changes (5)
- backend/.dockerignore
- .gitignore
- backend/app/utils/prompt_templates.py
- backend/pyproject.toml
- backend/Dockerfile
🧰 Additional context used
🧬 Code Graph Analysis (5)
backend/app/utils/fact_check_utils.py (2)
backend/app/modules/facts_check/web_search.py (1)
`search_with_serpapi` (5-28)

backend/app/modules/facts_check/llm_processing.py (2)

`run_claim_extractor_sdk` (12-57), `run_fact_verifier_sdk` (60-132)
backend/app/modules/vector_store/chunk_rag_data.py (1)
backend/app/utils/generate_chunk_id.py (1)
generate_id(4-8)
backend/app/modules/langgraph_nodes/store_and_send.py (3)
backend/app/modules/vector_store/chunk_rag_data.py (1)
`chunk_rag_data` (4-73)

backend/app/modules/vector_store/embed.py (1)

`embed_chunks` (7-30)

backend/app/utils/store_vectors.py (1)
store(10-32)
backend/app/modules/langgraph_nodes/fact_check.py (1)
backend/app/utils/fact_check_utils.py (1)
run_fact_check_pipeline(10-47)
backend/app/modules/langgraph_builder.py (2)
backend/app/modules/langgraph_nodes/sentiment.py (1)
`run_sentiment_sdk` (10-53)

backend/app/modules/langgraph_nodes/error_handler.py (1)
error_handler(3-11)
🪛 Ruff (0.11.9)
backend/app/db/vector_store.py
14-14: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
40-41: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
backend/app/modules/langgraph_nodes/store_and_send.py
13-13: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
15-15: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
21-21: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
backend/app/utils/store_vectors.py
32-32: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🪛 actionlint (1.7.7)
.github/workflows/deploy-backend-to-hf.yml
30-30: shellcheck reported issue in this script: SC2086:info:2:31: Double quote to prevent globbing and word splitting
(shellcheck)
🪛 LanguageTool
backend/README.md
[uncategorized] ~24-~24: You might be missing the article “the” here.
Context: ...rted ### 1. Clone the repo & jump into backend folder ```bash git clone https://githu...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
🪛 markdownlint-cli2 (0.17.2)
backend/README.md
59-59: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
76-76: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (27)
backend/app/modules/scraper/cleaner.py (1)
13-86: LGTM! Text cleaning implementation is well-structured.

The `clean_extracted_text` function is well-implemented with comprehensive boilerplate removal patterns and proper text processing logic. The function handles edge cases appropriately and maintains good readability.
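To illustrate the kind of pattern-based cleaning being described, here is a deliberately small sketch; the regexes and function shape are assumptions, and the real `clean_extracted_text` covers far more cases.

```python
import re

# A few illustrative boilerplate patterns; the actual module uses a larger set.
BOILERPLATE_PATTERNS = [
    r"(?i)subscribe to our newsletter.*",
    r"(?i)^advertisement$",
    r"(?i)cookie policy.*",
]

def clean_extracted_text(text: str) -> str:
    for pattern in BOILERPLATE_PATTERNS:
        text = re.sub(pattern, "", text, flags=re.MULTILINE)
    # collapse the extra whitespace left behind by the removals
    return re.sub(r"\s{2,}", " ", text).strip()
```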
backend/app/modules/langgraph_nodes/sentiment.py (2)

39-39: Good improvement for consistent output formatting.

Converting sentiment to lowercase ensures consistent output regardless of API response formatting.

35-35: Validate max_tokens adequacy for sentiment outputs

Reducing `max_tokens` to 3 is fine for single-word replies, but the Groq API may include punctuation or brief variations (e.g., "Positive." or "The sentiment is positive"), which could exceed that limit. Please test against these edge cases and consider increasing to 5 tokens if needed.

• File: backend/app/modules/langgraph_nodes/sentiment.py:35

```diff
-    max_tokens=3,
+    max_tokens=5,  # allow for punctuation or slight phrasing variations
```

backend/main.py (1)

28-30: Good port configuration with environment variable support.

Using environment variables for port configuration with a sensible default is a good practice for deployment flexibility.
backend/app/modules/facts_check/web_search.py (1)
5-28: Well-implemented search function with good error handling.
The function properly validates the API key, handles search parameters correctly, and processes results with graceful fallbacks for missing keys. The implementation is clean and follows good practices.
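For readers skimming the diff, the pattern being praised looks roughly like the sketch below; the endpoint, parameter names, and response shape are placeholders rather than the actual search provider used here.

```python
# Hypothetical illustration of API-key validation plus .get()-based fallbacks.
import os

import requests


def web_search(query: str, max_results: int = 5) -> list[dict]:
    api_key = os.getenv("SEARCH_API_KEY")  # assumed variable name
    if not api_key:
        raise ValueError("SEARCH_API_KEY is not set")

    resp = requests.get(
        "https://api.example.com/search",  # placeholder endpoint
        params={"q": query, "num": max_results},
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=10,
    )
    resp.raise_for_status()

    return [
        {
            "title": item.get("title", ""),
            "link": item.get("link", ""),
            "snippet": item.get("snippet", ""),
        }
        for item in resp.json().get("results", [])  # graceful fallbacks for missing keys
    ]
```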
backend/app/modules/langgraph_nodes/fact_check.py (3)
1-1: Good integration of the new fact-checking pipeline.
The import of the comprehensive fact-checking pipeline replaces the previous placeholder implementation, improving functionality significantly.
11-20: Improved error handling with structured responses.
The updated logic properly handles errors from the pipeline and returns structured error responses, which is better than the previous placeholder approach.
30-30: Correct integration of verification results.
The function now properly returns the verifications from the pipeline as "facts" in the state, maintaining the expected output format.
backend/app/db/vector_store.py (2)
5-7: Good API key validation.
Proper validation of the required environment variable with a clear error message.
22-34: Good index management logic.
The index creation logic properly checks for existence and creates the index with appropriate serverless configuration for AWS US East 1.
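For readers new to the serverless Pinecone client, the check-then-create flow looks roughly like this; the index name is an assumption, and the 384 dimension matches the all-MiniLM-L6-v2 embeddings used elsewhere in this PR:

```python
import os

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index_name = "rag-chunks"  # assumed name

if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=384,  # all-MiniLM-L6-v2 output size
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )

index = pc.Index(index_name)
```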
backend/app/modules/vector_store/chunk_rag_data.py (5)
4-13: Excellent field validation.
Comprehensive validation of required fields with clear error messages. The list type check for facts is particularly good.
15-18: Smart handling of perspective data normalization.
The check for the `.dict()` method allows for flexible input types (both a plain dict and an object with a dict method).
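In practice that normalization is a one-liner along these lines (variable names assumed):

```python
# Accept either a plain dict or a Pydantic-style object exposing .dict().
perspective_data = perspective.dict() if hasattr(perspective, "dict") else perspective
```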
28-32: Good safety validation for perspective object.
The validation ensures the perspective object has the required attributes before accessing them.
44-67: Thorough fact validation and processing.
The validation of each fact's required fields and the systematic chunk creation with unique IDs is well-implemented.
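A compressed sketch of that per-fact validation and chunk construction (field names are assumptions based on the surrounding review notes):

```python
import uuid


def build_fact_chunks(facts: list[dict]) -> list[dict]:
    chunks = []
    for i, fact in enumerate(facts):
        # Validate required fields before building the chunk.
        for field in ("original_claim", "verdict", "explanation"):
            if field not in fact:
                raise ValueError(f"Fact {i} is missing required field '{field}'")
        chunks.append({
            "id": f"fact-{uuid.uuid4()}",  # unique ID per chunk
            "text": f"{fact['original_claim']} ({fact['verdict']}): {fact['explanation']}",
            "metadata": {"type": "fact", "index": i},
        })
    return chunks
```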
71-73: Appropriate error handling.
The catch-all exception handling with logging and re-raising preserves the original error while providing debugging information.
backend/app/modules/vector_store/embed.py (4)
1-4: Good model choice and initialization.
The all-MiniLM-L6-v2 model is a solid choice for general text embeddings, providing good performance with reasonable computational requirements.
9-10: Proper handling of empty input.
Early return for empty chunks prevents unnecessary processing and potential errors.
12-18: Comprehensive chunk validation.
The validation ensures each chunk is a dictionary with the required 'text' field, providing clear error messages with indices for debugging.
20-30: Efficient embedding generation and vector construction.
The batch processing approach is efficient, and the vector construction properly maps each chunk to its embedding with preserved metadata.
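For context, batched encoding with sentence-transformers and mapping the results back onto the chunks looks roughly like this (the 'text', 'id', and 'metadata' keys are assumptions):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional embeddings


def embed_chunks(chunks: list[dict]) -> list[dict]:
    if not chunks:
        return []  # early return for empty input

    texts = [chunk["text"] for chunk in chunks]
    embeddings = model.encode(texts)  # one batched call instead of per-chunk requests

    return [
        {
            "id": chunk.get("id", str(i)),
            "values": embedding.tolist(),
            "metadata": chunk.get("metadata", {}),
        }
        for i, (chunk, embedding) in enumerate(zip(chunks, embeddings))
    ]
```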
.github/workflows/deploy-backend-to-hf.yml (1)
1-59: Well-structured CI/CD workflow for HF Space deployment.
The workflow correctly triggers on backend changes, handles authentication securely, and implements proper git operations for deployment.
backend/app/utils/fact_check_utils.py (1)
26-47: Excellent error handling and rate limiting implementation.
The search loop properly handles exceptions, logs outcomes, and includes appropriate delays to prevent rate limiting. The final verification step is well-integrated.
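The loop shape being described is roughly the following; the delay, result fields, and the injected search callable are assumptions:

```python
import time
from typing import Callable


def search_claims(claims: list[str], search_fn: Callable[[str], list[dict]]) -> list[dict]:
    results = []
    for claim in claims:
        try:
            hits = search_fn(claim)
            results.append({"claim": claim, "hits": hits, "status": "success"})
            print(f"✅ Search succeeded for: {claim}")
        except Exception as exc:
            results.append({"claim": claim, "hits": [], "status": "error", "error": str(exc)})
            print(f"❌ Search failed for: {claim} ({exc})")
        time.sleep(1)  # small delay between claims to avoid rate limiting
    return results
```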
backend/app/modules/langgraph_nodes/judge.py (2)
6-10: Appropriate configuration for scoring task.
The low max_tokens (10) is perfect for a simple scoring response, and zero temperature ensures consistent outputs.
31-43: Robust response parsing with proper error handling.
The code handles multiple response formats and includes proper score validation with clamping. The regex pattern correctly extracts numeric scores.
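A compact sketch of that extract-and-clamp pattern (the 0 to 10 scale is an assumption inferred from the clamping described):

```python
import re


def parse_score(raw: str, default: int = 0) -> int:
    """Pull the first integer out of an LLM reply and clamp it to the 0-10 range."""
    match = re.search(r"\d+", raw)
    if not match:
        return default
    return max(0, min(10, int(match.group())))


# parse_score("Score: 8/10") -> 8, parse_score("I'd rate this a 15") -> 10
```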
backend/app/modules/langgraph_nodes/generate_perspective.py (1)
9-24: Excellent use of structured output and proper LLM configuration.
The Pydantic model ensures type safety, and the temperature setting (0.7) is appropriate for creative perspective generation.
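Schematically, structured output against a Pydantic model looks like this; the field names, model name, and the `with_structured_output` wiring are illustrative assumptions, not the PR's exact code:

```python
from langchain_groq import ChatGroq
from pydantic import BaseModel, Field


class Perspective(BaseModel):
    title: str = Field(description="Short headline for the counter-perspective")
    body: str = Field(description="The counter-perspective argument itself")


llm = ChatGroq(model="llama-3.1-8b-instant", temperature=0.7)
structured_llm = llm.with_structured_output(Perspective)

result = structured_llm.invoke("Offer a counter-perspective to: remote work harms productivity.")
print(result.title)
print(result.body)
```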
backend/app/modules/langgraph_builder.py (2)
14-22: Excellent addition of typed state definition.
The TypedDict provides clear type hints for the state structure, improving code maintainability and IDE support.
56-102: Well-structured conditional logic with comprehensive error handling.
The graph correctly handles error propagation and implements a retry mechanism with scoring thresholds. The updated method calls align with the LangGraph API.
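To make the wiring concrete, here is a stripped-down LangGraph sketch with a typed state, error routing, and a score-based retry; the node names, state fields, and threshold are assumptions:

```python
from typing import TypedDict

from langgraph.graph import END, StateGraph


class GraphState(TypedDict, total=False):
    content: str
    facts: list[dict]
    perspective: dict
    score: int
    status: str


def route_after_judge(state: GraphState) -> str:
    if state.get("status") == "error":
        return "error_handler"
    # Retry perspective generation while the judge's score is below an assumed threshold.
    return "store_and_send" if state.get("score", 0) >= 7 else "generate_perspective"


graph = StateGraph(GraphState)
graph.add_node("generate_perspective", lambda state: state)  # placeholder nodes
graph.add_node("judge", lambda state: state)
graph.add_node("store_and_send", lambda state: state)
graph.add_node("error_handler", lambda state: state)

graph.set_entry_point("generate_perspective")
graph.add_edge("generate_perspective", "judge")
graph.add_conditional_edges(
    "judge",
    route_after_judge,
    {
        "generate_perspective": "generate_perspective",
        "store_and_send": "store_and_send",
        "error_handler": "error_handler",
    },
)
graph.add_edge("store_and_send", END)
graph.add_edge("error_handler", END)
app = graph.compile()
```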
backend/app/modules/facts_check/llm_processing.py (1)
12-58: Well-structured claim extraction function.
The function has proper error handling, input validation, and returns a consistent response structure. The implementation follows good practices.
```python
try:
    nltk.data.find('corpora/stopwords')
    nltk.data.find('corpora/punkt_tab')

except LookupError:
    nltk.download('stopwords')
    nltk.download('punkt_tab')
```
🛠️ Refactor suggestion
Avoid downloading NLTK data during module import.
Downloading NLTK data during module import can cause several issues:
- Slow import times
- Potential failures in deployment environments with limited network access
- Unnecessary downloads when NLTK functionality isn't used
Consider one of these alternatives:
Option 1: Remove the NLTK downloads entirely (if not needed):
```diff
-try:
-    nltk.data.find('corpora/stopwords')
-    nltk.data.find('corpora/punkt_tab')
-
-except LookupError:
-    nltk.download('stopwords')
-    nltk.download('punkt_tab')
```

Option 2: Lazy loading (if NLTK will be used):
```diff
-try:
-    nltk.data.find('corpora/stopwords')
-    nltk.data.find('corpora/punkt_tab')
-
-except LookupError:
-    nltk.download('stopwords')
-    nltk.download('punkt_tab')
+def _ensure_nltk_data():
+    """Ensure NLTK data is available when needed."""
+    try:
+        nltk.data.find('corpora/stopwords')
+        nltk.data.find('corpora/punkt_tab')
+    except LookupError:
+        nltk.download('stopwords')
+        nltk.download('punkt_tab')
```

Then call `_ensure_nltk_data()` only when NLTK functionality is actually needed.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
# Remove the top-level try/except block and instead define a helper:
def _ensure_nltk_data():
    """Ensure NLTK data is available when needed."""
    try:
        nltk.data.find('corpora/stopwords')
        nltk.data.find('corpora/punkt_tab')
    except LookupError:
        nltk.download('stopwords')
        nltk.download('punkt_tab')

# …later, in whatever function actually uses NLTK:
# _ensure_nltk_data()
# tokens = nltk.word_tokenize(text)
# filtered = [w for w in tokens if w not in nltk.corpus.stopwords.words('english')]
```
🤖 Prompt for AI Agents
In backend/app/modules/scraper/cleaner.py around lines 4 to 10, avoid
downloading NLTK data during module import as it slows imports and can fail in
restricted environments. Refactor by moving the NLTK data check and download
logic into a separate function, for example, _ensure_nltk_data(), and call this
function only when NLTK functionality is actually needed during runtime instead
of at import time.
```bash
uv sync
uv run main.py
```
🛠️ Refactor suggestion
Use exec so the app becomes PID 1 and receives signals
Without exec, the shell stays PID 1 in a container, preventing proper signal forwarding (e.g., SIGTERM on docker stop).
```diff
-uv run main.py
+exec uv run main.py
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```bash
uv sync
exec uv run main.py
```
🤖 Prompt for AI Agents
In backend/start.sh at lines 8 to 9, the commands launching the app should use
`exec` to replace the shell process with the app process. Modify the lines to
prefix the commands with `exec` so that the app becomes PID 1 and can properly
receive and handle signals like SIGTERM, ensuring correct behavior in container
environments.
```bash
set -e

# Install uv if not present
pip install uv
```
🛠️ Refactor suggestion
Install uv only when absent and pin the version
Unconditionally invoking pip install uv adds unnecessary overhead on every start and may pull a newer, untested version. Guard the install and pin a version for reproducibility.
```diff
-# Install uv if not present
-pip install uv
+# Install uv if not present
+if ! command -v uv >/dev/null 2>&1; then
+  pip install --no-cache-dir --upgrade 'uv>=0.100.0'
+fi
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In backend/start.sh at line 5, the script unconditionally runs 'pip install uv',
causing unnecessary overhead and potential version inconsistencies. Modify the
script to first check if 'uv' is already installed, and only run 'pip install
uv==<specific_version>' if it is absent. Replace <specific_version> with a fixed
version number to ensure reproducibility.
```bash
cd new-backend
```
Correct directory name in the quick-start step
The repo places this README under backend/, not new-backend/. The current command will fail.
```diff
-cd new-backend
+cd backend
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```bash
cd backend
```
🤖 Prompt for AI Agents
In backend/README.md around lines 28 to 29, the quick-start step uses the
incorrect directory name 'new-backend'. Change the directory name in the command
from 'new-backend' to 'backend' to match the actual folder where the README and
backend code reside.
```python
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
```
CORS configuration is too permissive for production.
Allowing all origins (allow_origins=["*"]) with credentials enabled poses security risks. Consider restricting origins based on environment.
```diff
+import os
+
+# Configure CORS origins based on environment
+origins = ["*"] if os.getenv("ENVIRONMENT") == "development" else [
+    "https://yourdomain.com",
+    "https://www.yourdomain.com"
+]
+
 app.add_middleware(
     CORSMiddleware,
-    allow_origins=["*"],
+    allow_origins=origins,
     allow_credentials=True,
     allow_methods=["*"],
     allow_headers=["*"],
 )
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
import os

# Configure CORS origins based on environment
origins = ["*"] if os.getenv("ENVIRONMENT") == "development" else [
    "https://yourdomain.com",
    "https://www.yourdomain.com"
]

app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
```
🤖 Prompt for AI Agents
In backend/main.py around lines 14 to 20, the CORS middleware configuration is
too permissive by allowing all origins with credentials enabled, which is a
security risk in production. Modify the allow_origins parameter to restrict it
to a specific list of trusted origins based on the environment (e.g.,
development vs production). Ensure that allow_credentials remains True only if
the origins are explicitly specified and trusted.
```python
if state.get("status") != "success":
    print("❌ Claim extraction failed.")
    return [], "Claim extraction failed."
```
Fix the status check logic.
The code checks state.get("status") but should check result.get("status") since result contains the response from the claim extraction SDK.
Apply this diff to fix the status check:
```diff
-    if state.get("status") != "success":
+    if result.get("status") != "success":
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
if result.get("status") != "success":
    print("❌ Claim extraction failed.")
    return [], "Claim extraction failed."
```
🤖 Prompt for AI Agents
In backend/app/utils/fact_check_utils.py around lines 13 to 15, the status check
incorrectly uses state.get("status") instead of result.get("status"). Update the
condition to check result.get("status") to correctly verify the claim extraction
SDK response status and handle failures properly.
```python
facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
                       "Verdict: {f['verdict']}\nExplanation: "
                       "{f['explanation']}" for f in state["facts"]])
```
Fix the string formatting syntax.
The string formatting is incorrect: only the first literal has the f prefix, so the placeholders in the continuation strings are emitted literally instead of being interpolated.
Apply this diff to fix the string formatting:
```diff
-    facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
-                           "Verdict: {f['verdict']}\nExplanation: "
-                           "{f['explanation']}" for f in state["facts"]])
+    facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
+                           f"Verdict: {f['verdict']}\nExplanation: "
+                           f"{f['explanation']}" for f in state["facts"]])
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
facts_str = "\n".join([f"Claim: {f['original_claim']}\n"
                       f"Verdict: {f['verdict']}\nExplanation: "
                       f"{f['explanation']}" for f in state["facts"]])
```
🤖 Prompt for AI Agents
In backend/app/modules/langgraph_nodes/generate_perspective.py around lines 40
to 42, the string formatting uses f-strings but the variable references inside
the strings are not properly enclosed in curly braces. Fix this by ensuring the
entire string is an f-string and all variable references are wrapped in curly
braces within the string literals.
```python
load_dotenv()

client = Groq(api_key=os.getenv("GROQ_API_KEY"))
```
Add API key validation before initializing Groq client.
The Groq client is initialized without checking if the API key exists. This could lead to runtime errors if the environment variable is not set.
Consider adding validation:
```diff
-client = Groq(api_key=os.getenv("GROQ_API_KEY"))
+api_key = os.getenv("GROQ_API_KEY")
+if not api_key:
+    raise ValueError("GROQ_API_KEY environment variable is not set")
+client = Groq(api_key=api_key)
```

🤖 Prompt for AI Agents
In backend/app/modules/facts_check/llm_processing.py at line 9, the Groq client
is initialized directly with the API key from the environment without checking
if the key exists. Add a validation step before initializing the client to check
if the GROQ_API_KEY environment variable is set; if it is missing, raise an
appropriate error or handle it gracefully to prevent runtime failures.
```python
try:
    parsed = json.loads(content)
except Exception as parse_err:
    print(f"❌ LLM JSON parse error: {parse_err}")

results_list.append(parsed)
```
Fix critical error: undefined variable usage after JSON parsing failure.
If JSON parsing fails, parsed remains undefined but is still appended to results_list on line 118, which will raise a NameError.
Apply this fix to handle parsing errors properly:
```diff
 # Try parsing the JSON response
 try:
     parsed = json.loads(content)
+    results_list.append(parsed)
 except Exception as parse_err:
     print(f"❌ LLM JSON parse error: {parse_err}")
-
-results_list.append(parsed)
+    # Skip this result or add error placeholder
+    results_list.append({
+        "verdict": "Error",
+        "explanation": f"Failed to parse LLM response: {parse_err}",
+        "original_claim": claim,
+        "source_link": source
+    })
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
# Try parsing the JSON response
try:
    parsed = json.loads(content)
    results_list.append(parsed)
except Exception as parse_err:
    print(f"❌ LLM JSON parse error: {parse_err}")
    # Skip this result or add error placeholder
    results_list.append({
        "verdict": "Error",
        "explanation": f"Failed to parse LLM response: {parse_err}",
        "original_claim": claim,
        "source_link": source
    })
```
🤖 Prompt for AI Agents
In backend/app/modules/facts_check/llm_processing.py around lines 113 to 119,
the variable 'parsed' is appended to 'results_list' even if JSON parsing fails,
causing a NameError. To fix this, ensure that 'parsed' is only appended if JSON
parsing succeeds by moving the append statement inside the try block or adding a
conditional check that 'parsed' is defined before appending.
```python
return {
    "claim": claim,
    "verifications": results_list,
    "status": "success",
}
```
Fix variable scope issue: claim is undefined in return statement.
The claim variable on line 121 references the last value from the loop iteration, not a function-level claim. This appears to be incorrect logic.
The return structure should probably not include a single claim when processing multiple claims. Consider:
```diff
 return {
-    "claim": claim,
     "verifications": results_list,
     "status": "success",
 }
```

Or if you need to return all claims, extract them properly:
```diff
 return {
-    "claim": claim,
+    "claims": [result.get("claim") for result in search_results],
     "verifications": results_list,
     "status": "success",
 }
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In backend/app/modules/facts_check/llm_processing.py around lines 120 to 124,
the variable 'claim' used in the return statement is undefined or incorrectly
scoped as it refers to the last loop iteration variable rather than a
function-level claim. To fix this, remove the single 'claim' from the return
dictionary or replace it with a properly collected list of all claims processed.
Ensure the return structure accurately reflects the data processed, either by
returning all claims as a list or omitting the claim field if not applicable.
to resolve merge conflicts
Tasks Done:
- Added Dockerfile and .dockerignore for backend deployment.
- Created a Hugging Face Space for backend deployment and configured it.
- Deployed the backend (URL: https://thunder1245-perspective-backend.hf.space/api/).
- GitHub Actions workflow to deploy the backend to the Hugging Face Space on each push to the main branch.
- Tested the GitHub Actions workflow locally using act.

Summary by CodeRabbit

New Features

Bug Fixes

Style

Chores