VeriFact

Verifying Claims with Transparency

A robust fact-checking system designed to combat misinformation by verifying user-submitted claims with evidence-backed explanations. In a world where false information spreads rapidly, this tool provides clear, transparent, and accessible verdicts to help users discern truth from fiction. The backend processes claims using web searches, machine learning models, and text polishing, while the frontend offers a user-friendly interface for interaction.

Note

This project is in an experimental development phase and may sometimes produce unsatisfactory results.


The Problem

Misinformation, including fake news and misleading claims, spreads quickly across digital platforms, often causing confusion, mistrust, or harm. Many existing fact-checking tools provide limited explanations or are not user-friendly for non-technical audiences. Additionally, the lack of accessible, evidence-based verification systems makes it challenging for individuals to validate claims they encounter online.


Our Solution

VeriFact addresses these challenges by offering a fact-checking system that:

  1. Verifies Claims: Users submit claims via a CLI, web interface, or Telegram bot, receiving verdicts such as "Likely True," "Likely False," or "Mixed/Uncertain."
  2. Provides Transparent Explanations: Instead of just a verdict, the system delivers detailed explanations backed by credible web sources.
  3. Ensures Accessibility: Multiple interfaces (CLI, Vue.js web app, Telegram bot) make fact-checking intuitive for all users.
  4. Caches Results: Stores results in a PostgreSQL database to improve efficiency for repeated queries.

Key Features

  • Explainable Verdicts: Generates clear, evidence-based explanations with references to credible sources
  • Machine Learning Integration: Uses Sentence Transformers for semantic similarity and BART-MNLI for natural language inference (NLI) to classify evidence
  • Text Polishing: Employs Pegasus-XSUM to rephrase explanations for clarity and readability
  • Efficient Caching: Stores results in PostgreSQL to avoid redundant processing
  • Multi-Channel Access: Web frontend, CLI, and Telegram bot integration
  • Scalable Architecture: FastAPI backend with async support for handling multiple requests
  • Web Search: Leverages Google Custom Search to retrieve relevant sources for analysis
  • Confidence Scoring: Provides numerical confidence in claim assessment
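The web-search feature talks to the Google Custom Search JSON API. A minimal sketch of that call, assuming a `search_claim`-style interface (the parameter names and return shape here are illustrative, not the project's exact signature):

```python
import requests

SEARCH_URL = "https://www.googleapis.com/customsearch/v1"

def build_search_params(claim, api_key, engine_id, num_results=10):
    """Assemble query parameters for the Custom Search JSON API.
    The API caps `num` at 10 results per request."""
    return {
        "key": api_key,
        "cx": engine_id,
        "q": claim,
        "num": min(num_results, 10),
    }

def search_claim(claim, api_key, engine_id, num_results=10):
    """Fetch candidate evidence pages as (title, snippet, link) tuples."""
    resp = requests.get(
        SEARCH_URL,
        params=build_search_params(claim, api_key, engine_id, num_results),
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json().get("items", [])
    return [(it.get("title", ""), it.get("snippet", ""), it.get("link", ""))
            for it in items]
```

To fetch more than 10 results, the API requires paginating with the `start` parameter, which is omitted here for brevity.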

Architecture Overview

System Components

The platform consists of several integrated layers:

  1. User Interfaces - Vue.js frontend, CLI, and Telegram bot
  2. API Layer - FastAPI backend for handling requests
  3. ML Pipeline - PyTorch-based models for claim analysis
  4. Data Layer - PostgreSQL database for caching and storage
  5. External Services - Google Search API and web scraping

How It Works


System Architecture

The system follows a modular, data-driven architecture that combines web scraping, machine learning, and database caching for robust fact-checking.

Data Flow

  1. Claim Input: Users submit a claim via the CLI (main.py), Vue frontend, or Telegram bot (bot_tele.py)
  2. Cache Check: The backend queries the PostgreSQL database (search_log table) for cached results, keyed by a normalized form of the claim
  3. Web Search: If no cache is found, the system uses Google Custom Search API (search_claim) to fetch up to 10 relevant URLs (can be increased)
  4. Heuristic Analysis: The analyze_verdicts function scores the titles and snippets of the search results using keyword-based heuristics to produce an initial verdict (Likely True, Likely False, or Uncertain)
  5. Deep ML Analysis: The select_evidence_from_urls function in ml_models.py:
    • Fetches and cleans web content using Trafilatura
    • Extracts sentences and ranks them by similarity to the claim using Sentence Transformers (all-MiniLM-L6-v2)
    • Classifies sentences as ENTAILMENT, CONTRADICTION, or NEUTRAL using BART-MNLI (facebook/bart-large-mnli)
  6. Verdict Fusion: The simple_fuse_verdict function combines the heuristic and ML results to produce a final verdict
  7. Explanation Generation: The build_explanation function constructs a factual explanation from supporting and contradicting evidence
  8. Text Polishing: The polish_text function in text_polisher.py rephrases the explanation using Pegasus-XSUM (google/pegasus-xsum) for fluency
  9. Response and Caching: The final verdict, explanation, and evidence are returned to the user and stored in PostgreSQL via upsert_result
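Steps 4–6 above can be sketched as a small fusion function. The name mirrors the project's `simple_fuse_verdict`, but the fusion rules and the confidence formula here are illustrative assumptions, not the actual implementation:

```python
def simple_fuse_verdict(heuristic_verdict, entail_count, contradict_count):
    """Fuse the keyword-heuristic verdict with NLI sentence counts.

    Illustrative logic only. Returns a (verdict, confidence) pair where
    confidence is the share of classified sentences agreeing with the
    verdict, nudged upward when heuristic and ML passes agree.
    """
    total = entail_count + contradict_count
    if total == 0:
        # No usable NLI evidence: fall back to the heuristic pass.
        return heuristic_verdict, 0.5
    if entail_count > contradict_count:
        ml_verdict, confidence = "Likely True", entail_count / total
    elif contradict_count > entail_count:
        ml_verdict, confidence = "Likely False", contradict_count / total
    else:
        return "Mixed/Uncertain", 0.5
    if heuristic_verdict == ml_verdict:
        confidence = min(1.0, confidence + 0.1)
    return ml_verdict, confidence
```

For example, three ENTAILMENT sentences against one CONTRADICTION, with an agreeing heuristic, would yield ("Likely True", 0.85) under these toy rules.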

Project Structure

VeriFact/
├── backend/
│   ├── app/
│   │   ├── __init__.py
│   │   ├── api/
│   │   └── models/
│   ├── app.py                 # FastAPI application
│   ├── main.py                # CLI interface
│   ├── bot_tele.py            # Telegram bot integration
│   ├── ml_models.py           # ML model implementations
│   ├── text_polisher.py       # Text polishing module
│   ├── test_modules.py        # Unit tests
│   ├── test_refactoring.py    # Refactoring tests
│   ├── PR_REVIEW_CHANGES.md
│   └── REFACTORING_SUMMARY.md
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   ├── App.vue
│   │   └── main.js
│   ├── public/
│   ├── package.json
│   ├── vite.config.js
│   ├── index.html
│   └── README.md
├── sample_data/
│   ├── fact_checks.json
│   └── sample_database.csv
├── .github/
│   └── workflows/
├── requirements.txt
├── readme.md
├── start.ps1
└── LICENSE

Tech Stack

Backend

| Component | Technology |
| --- | --- |
| Web Framework | FastAPI (async REST API) |
| Server | Uvicorn (ASGI server) |
| ML Framework | PyTorch & Transformers (Hugging Face) |
| Sentence Similarity | Sentence-Transformers (all-MiniLM-L6-v2) |
| NLI Model | BART-MNLI (facebook/bart-large-mnli) |
| Text Polisher | Pegasus-XSUM (google/pegasus-xsum) |
| Database Driver | psycopg2 (PostgreSQL adapter) |
| Web Scraping | Trafilatura (content extraction) |
| HTTP Client | Requests |
| Environment | python-dotenv |

Frontend

| Component | Technology |
| --- | --- |
| Framework | Vue.js 3 |
| Build Tool | Vite |
| Code Quality | ESLint |

Database

| Component | Technology |
| --- | --- |
| Database | PostgreSQL (relational database for caching) |
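The caching pattern (a normalized claim as the key, insert-or-update on repeat queries) can be sketched as follows. The real backend uses psycopg2 against the PostgreSQL `search_log` table; this sketch substitutes the stdlib `sqlite3` so it runs anywhere, and the column names are assumptions:

```python
import hashlib
import sqlite3

def normalize_claim(claim):
    """Collapse whitespace and case so near-identical claims share a key."""
    return " ".join(claim.lower().split())

def claim_key(claim):
    return hashlib.sha256(normalize_claim(claim).encode("utf-8")).hexdigest()

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE IF NOT EXISTS search_log (
        claim_key   TEXT PRIMARY KEY,
        verdict     TEXT,
        explanation TEXT
    )
""")

def upsert_result(claim, verdict, explanation):
    """Insert or update the cached result for a claim (upsert pattern)."""
    conn.execute(
        """INSERT INTO search_log (claim_key, verdict, explanation)
           VALUES (?, ?, ?)
           ON CONFLICT(claim_key) DO UPDATE SET
               verdict = excluded.verdict,
               explanation = excluded.explanation""",
        (claim_key(claim), verdict, explanation),
    )

def lookup_result(claim):
    """Return (verdict, explanation) for a cached claim, or None."""
    return conn.execute(
        "SELECT verdict, explanation FROM search_log WHERE claim_key = ?",
        (claim_key(claim),),
    ).fetchone()
```

PostgreSQL supports the same `INSERT ... ON CONFLICT ... DO UPDATE` syntax, so the pattern carries over directly to the psycopg2 version.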

Machine Learning Models

  • Sentence Embedder: sentence-transformers/all-MiniLM-L6-v2 for ranking sentences
  • NLI Model: facebook/bart-large-mnli for evidence classification
  • Polisher Model: google/pegasus-xsum for explanation rephrasing
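The sentence-ranking step can be illustrated without downloading any models. The real pipeline embeds sentences with all-MiniLM-L6-v2 and ranks them by similarity to the claim; the sketch below keeps the same cosine-ranking logic but swaps in a trivial bag-of-words "embedding" as a stand-in:

```python
import math
import re
from collections import Counter

def embed(text):
    """Stand-in embedding: a bag-of-words count vector.
    The real pipeline uses sentence-transformers/all-MiniLM-L6-v2."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_sentences(claim, sentences, top_k=3):
    """Return the top_k sentences most similar to the claim, best first."""
    cvec = embed(claim)
    ranked = sorted(sentences, key=lambda s: cosine(cvec, embed(s)), reverse=True)
    return ranked[:top_k]
```

In the actual pipeline, the top-ranked sentences are then passed to BART-MNLI, which labels each one ENTAILMENT, CONTRADICTION, or NEUTRAL with respect to the claim.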

Installation Guide

Prerequisites

  • Python 3.8+
  • Node.js (v16+)
  • npm
  • PostgreSQL (local or cloud-hosted)
  • Google Custom Search API key and Engine ID
  • Git

Backend Setup

  1. Clone the Repository:

    git clone https://github.com/DarkenStars/VeriFact.git
    cd VeriFact/backend
  2. Create Virtual Environment (recommended):

    python -m venv .venv
    .venv\Scripts\Activate.ps1  # Windows
    # or
    source .venv/bin/activate    # Linux/Mac
  3. Install Dependencies:

    pip install -r requirements.txt
  4. Set Up Environment Variables:

    Create a .env file in backend/:

    API_KEY=<your-google-api-key>
    SEARCH_ENGINE_ID=<your-google-custom-search-engine-id>
    DB_NAME=<database-name>
    DB_USER=<database-user>
    DB_PASSWORD=<database-password>
    DB_HOST=<database-host e.g., localhost>
    DB_PORT=<database-port e.g., 5432>
  5. Run the Backend:

    Option 1 - CLI Mode:

    python main.py

    Option 2 - FastAPI Server:

    python app.py
    # or
    python -m uvicorn app:app --host 0.0.0.0 --port 5000 --reload
  6. Run the Telegram Bot (optional):

    python bot_tele.py
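Once python-dotenv has loaded `.env` into the process environment, reading the variables might look like the sketch below; the `Settings` dataclass and `load_settings` helper are illustrative, not part of the codebase:

```python
import os
from dataclasses import dataclass

@dataclass
class Settings:
    api_key: str
    search_engine_id: str
    db_name: str
    db_user: str
    db_password: str
    db_host: str = "localhost"
    db_port: int = 5432

def load_settings(env=os.environ):
    """Read the variables defined in backend/.env (after python-dotenv
    has loaded them into the environment). Raises KeyError on missing
    required values so misconfiguration fails fast at startup."""
    return Settings(
        api_key=env["API_KEY"],
        search_engine_id=env["SEARCH_ENGINE_ID"],
        db_name=env["DB_NAME"],
        db_user=env["DB_USER"],
        db_password=env["DB_PASSWORD"],
        db_host=env.get("DB_HOST", "localhost"),
        db_port=int(env.get("DB_PORT", "5432")),
    )
```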

Frontend Setup

  1. Navigate to Frontend Directory:

    cd VeriFact/frontend
  2. Install Dependencies:

    npm install
  3. Run Development Server:

    npm run dev

    Access at http://localhost:5173 (or as shown in terminal)

  4. Build for Production:

    npm run build
  5. Lint Code:

    npm run lint

Quick Start with PowerShell Script

For Windows users, use the provided PowerShell script:

./start.ps1

Or run the equivalent commands manually:

Set-Location backend
.venv\Scripts\Activate.ps1
Start-Process powershell -ArgumentList "python -m uvicorn app:app --host 0.0.0.0 --port 5000 --reload"
python .\bot_tele.py

This script will:

  • Activate the virtual environment
  • Start the FastAPI server
  • Launch the Telegram bot

Usage

Web Interface

  1. Start the FastAPI backend and Vue frontend
  2. Navigate to http://localhost:5173
  3. Enter a claim in the input field
  4. Review the verdict, confidence score, and supporting evidence

CLI Interface

  1. Run python main.py
  2. Enter claims when prompted
  3. View results in the terminal

Telegram Bot

  1. Configure bot token in .env
  2. Run python bot_tele.py
  3. Interact with the bot on Telegram

Future Scope

  • Enhanced Frontend: Complete the Vue 3 frontend with features like claim history and source visualization
  • Multi-Source Search: Incorporate additional search APIs (e.g., X search, Bing) for broader evidence collection
  • Media Support: Add verification for images, videos, or audio using computer vision models
  • Multi-Language Support: Extend NLI and polishing models to handle non-English claims
  • Performance Optimization: Use lighter ML models or caching mechanisms to reduce latency
  • Real-time Analysis: Implement streaming results for long-running fact checks
  • User Accounts: Add authentication and personalized claim history

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

This project is licensed under the MIT License. See the LICENSE file for details.


Built with ❤️ to combat misinformation
