A robust fact-checking system designed to combat misinformation by verifying user-submitted claims with evidence-backed explanations. In a world where false information spreads rapidly, this tool provides clear, transparent, and accessible verdicts to help users discern truth from fiction. The backend processes claims using web searches, machine learning models, and text polishing, while the frontend offers a user-friendly interface for interaction.
Note
This project is currently in an experimental development phase and may sometimes produce unsatisfactory results.
Misinformation, including fake news and misleading claims, spreads quickly across digital platforms, often causing confusion, mistrust, or harm. Many existing fact-checking tools provide limited explanations or are not user-friendly for non-technical audiences. Additionally, the lack of accessible, evidence-based verification systems makes it challenging for individuals to validate claims they encounter online.
VeriFact addresses these challenges by offering a fact-checking system that:
- Verifies Claims: Users submit claims via a CLI, web interface, or Telegram bot, receiving verdicts such as "Likely True," "Likely False," or "Mixed/Uncertain."
- Provides Transparent Explanations: Instead of just a verdict, the system delivers detailed explanations backed by credible web sources.
- Ensures Accessibility: Multiple interfaces (CLI, Vue.js web app, Telegram bot) make fact-checking intuitive for all users.
- Caches Results: Stores results in a PostgreSQL database to improve efficiency for repeated queries.
- Explainable Verdicts: Generates clear, evidence-based explanations with references to credible sources
- Machine Learning Integration: Uses Sentence Transformers for semantic similarity and BART-MNLI for natural language inference (NLI) to classify evidence
- Text Polishing: Employs Pegasus-XSUM to rephrase explanations for clarity and readability
- Efficient Caching: Stores results in PostgreSQL to avoid redundant processing
- Multi-Channel Access: Web frontend, CLI, and Telegram bot integration
- Scalable Architecture: FastAPI backend with async support for handling multiple requests
- Web Search: Leverages Google Custom Search to retrieve relevant sources for analysis
- Confidence Scoring: Provides numerical confidence in claim assessment
The platform consists of several integrated layers:
- User Interfaces - Vue.js frontend, CLI, and Telegram bot
- API Layer - FastAPI backend for handling requests
- ML Pipeline - PyTorch-based models for claim analysis
- Data Layer - PostgreSQL database for caching and storage
- External Services - Google Search API and web scraping
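To make the API layer concrete, here is a minimal sketch of an async FastAPI endpoint for claim verification. It is illustrative only: the `/verify` route, request model, and response fields are assumptions, not the project's actual interface, which lives in `app.py`.

```python
# Minimal sketch of the API layer (illustrative; the real app.py may differ).
# The /verify route, ClaimRequest model, and response fields are assumptions.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="VeriFact API (sketch)")

class ClaimRequest(BaseModel):
    claim: str

@app.post("/verify")
async def verify(request: ClaimRequest) -> dict:
    # In the real pipeline this would run: cache check -> web search ->
    # heuristic analysis -> deep ML analysis -> verdict fusion ->
    # explanation generation -> text polishing -> caching.
    return {
        "claim": request.claim,
        "verdict": "Mixed/Uncertain",  # placeholder value
        "confidence": 0.0,             # placeholder value
        "explanation": "Pipeline not wired up in this sketch.",
    }
```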
The system follows a modular, data-driven architecture that combines web scraping, machine learning, and database caching for robust fact-checking.
- Claim Input: Users submit a claim via the CLI (`main.py`), the Vue frontend, or the Telegram bot (`bot_tele.py`)
- Cache Check: The backend queries the PostgreSQL database (`search_log` table) for cached results using a normalized form of the claim
- Web Search: If no cached result is found, the system uses the Google Custom Search API (`search_claim`) to fetch up to 10 relevant URLs (this limit can be increased)
- Heuristic Analysis: `analyze_verdicts` scores the search results' titles and snippets using keyword-based heuristics to produce an initial verdict (Likely True, Likely False, Uncertain)
- Deep ML Analysis: The `select_evidence_from_urls` function in `ml_models.py`:
  - Fetches and cleans web content using Trafilatura
  - Extracts sentences and ranks them by similarity to the claim using Sentence Transformers (`all-MiniLM-L6-v2`)
  - Classifies sentences as ENTAILMENT, CONTRADICTION, or NEUTRAL using BART-MNLI (`facebook/bart-large-mnli`)
- Verdict Fusion: `simple_fuse_verdict` combines the heuristic and ML results into a final verdict
- Explanation Generation: `build_explanation` constructs a factual explanation from supporting and contradicting evidence
- Text Polishing: `polish_text` in `text_polisher.py` rephrases the explanation using Pegasus-XSUM (`google/pegasus-xsum`) for fluency
- Response and Caching: The final verdict, explanation, and evidence are returned to the user and stored in PostgreSQL via `upsert_result`
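To illustrate the Web Search step, the sketch below fetches results from the Google Custom Search JSON API with `requests`. The helper name and returned fields are illustrative; the project's `search_claim` may be implemented differently, and `API_KEY` / `SEARCH_ENGINE_ID` come from the `.env` file described in the setup section.

```python
# Illustrative sketch of the web-search step (not the project's actual search_claim).
import os
import requests

def search_claim_sketch(claim: str, num_results: int = 10) -> list[dict]:
    """Return {title, link, snippet} dicts for a claim via Google Custom Search."""
    response = requests.get(
        "https://www.googleapis.com/customsearch/v1",
        params={
            "key": os.getenv("API_KEY"),
            "cx": os.getenv("SEARCH_ENGINE_ID"),
            "q": claim,
            "num": num_results,  # the API returns at most 10 results per request
        },
        timeout=10,
    )
    response.raise_for_status()
    items = response.json().get("items", [])
    return [
        {"title": item.get("title", ""),
         "link": item.get("link", ""),
         "snippet": item.get("snippet", "")}
        for item in items
    ]
```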
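The Deep ML Analysis step can be sketched as follows. This is a simplified stand-in for `select_evidence_from_urls`: it skips the Trafilatura fetching and cleaning and assumes the candidate sentences are already extracted, and the actual ranking and thresholding logic in `ml_models.py` may differ.

```python
# Simplified sketch of sentence ranking + NLI classification (not the real
# select_evidence_from_urls, which also fetches and cleans pages with Trafilatura).
import torch
from sentence_transformers import SentenceTransformer, util
from transformers import AutoModelForSequenceClassification, AutoTokenizer

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
nli_tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")
nli_model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-large-mnli")

def rank_and_classify(claim: str, sentences: list[str], top_k: int = 5) -> list[dict]:
    """Rank sentences by semantic similarity to the claim, then label each with NLI."""
    claim_emb = embedder.encode(claim, convert_to_tensor=True)
    sent_embs = embedder.encode(sentences, convert_to_tensor=True)
    similarities = util.cos_sim(claim_emb, sent_embs)[0]
    ranked = sorted(zip(sentences, similarities.tolist()),
                    key=lambda pair: pair[1], reverse=True)[:top_k]

    evidence = []
    for sentence, similarity in ranked:
        # Premise = evidence sentence, hypothesis = claim.
        inputs = nli_tokenizer(sentence, claim, return_tensors="pt", truncation=True)
        with torch.no_grad():
            probs = nli_model(**inputs).logits.softmax(dim=-1)[0]
        label = nli_model.config.id2label[int(probs.argmax())].upper()  # ENTAILMENT / NEUTRAL / CONTRADICTION
        evidence.append({"sentence": sentence, "similarity": similarity,
                         "label": label, "score": float(probs.max())})
    return evidence
```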
VeriFact/
├── backend/
│ ├── app/
│ │ ├── __init__.py
│ │ ├── api/
│ │ └── models/
│ ├── app.py # FastAPI application
│ ├── main.py # CLI interface
│ ├── bot_tele.py # Telegram bot integration
│ ├── ml_models.py # ML model implementations
│ ├── text_polisher.py # Text polishing module
│ ├── test_modules.py # Unit tests
│ ├── test_refactoring.py # Refactoring tests
│ ├── PR_REVIEW_CHANGES.md
│ └── REFACTORING_SUMMARY.md
├── frontend/
│ ├── src/
│ │ ├── components/
│ │ ├── App.vue
│ │ └── main.js
│ ├── public/
│ ├── package.json
│ ├── vite.config.js
│ ├── index.html
│ └── README.md
├── sample_data/
│ ├── fact_checks.json
│ └── sample_database.csv
├── .github/
│ └── workflows/
├── requirements.txt
├── readme.md
├── start.ps1
└── LICENSE
| Component | Technology |
|---|---|
| Web Framework | FastAPI (async REST API) |
| Server | Uvicorn (ASGI server) |
| ML Framework | PyTorch & Transformers (Hugging Face) |
| Sentence Similarity | Sentence-Transformers (all-MiniLM-L6-v2) |
| NLI Model | BART-MNLI (facebook/bart-large-mnli) |
| Text Polisher | Pegasus-XSUM (google/pegasus-xsum) |
| Database Driver | psycopg2 (PostgreSQL adapter) |
| Web Scraping | Trafilatura (content extraction) |
| HTTP Client | Requests |
| Environment | python-dotenv |
| Component | Technology |
|---|---|
| Framework | Vue.js 3 |
| Build Tool | Vite |
| Code Quality | ESLint |
| Component | Technology |
|---|---|
| Database | PostgreSQL (relational database for caching) |
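The README does not show the `search_log` schema, but the caching behaviour described above (check by normalized claim, store via `upsert_result`) maps naturally onto a PostgreSQL upsert. A minimal sketch with `psycopg2`, using hypothetical column names:

```python
# Sketch of the caching upsert (hypothetical column names; the real search_log
# schema and upsert_result implementation are not shown in this README).
import os
import psycopg2

def upsert_result_sketch(claim_norm: str, verdict: str, explanation: str) -> None:
    conn = psycopg2.connect(
        dbname=os.getenv("DB_NAME"),
        user=os.getenv("DB_USER"),
        password=os.getenv("DB_PASSWORD"),
        host=os.getenv("DB_HOST"),
        port=os.getenv("DB_PORT"),
    )
    try:
        with conn, conn.cursor() as cur:  # commits on success, rolls back on error
            cur.execute(
                """
                INSERT INTO search_log (claim_norm, verdict, explanation, updated_at)
                VALUES (%s, %s, %s, NOW())
                ON CONFLICT (claim_norm)
                DO UPDATE SET verdict = EXCLUDED.verdict,
                              explanation = EXCLUDED.explanation,
                              updated_at = NOW();
                """,
                (claim_norm, verdict, explanation),
            )
    finally:
        conn.close()
```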
- Sentence Embedder: `sentence-transformers/all-MiniLM-L6-v2` for ranking sentences
- NLI Model: `facebook/bart-large-mnli` for evidence classification
- Polisher Model: `google/pegasus-xsum` for explanation rephrasing
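As a rough illustration of the polishing step, the snippet below loads `google/pegasus-xsum` through the Hugging Face `summarization` pipeline and rephrases a draft explanation; the real `polish_text` in `text_polisher.py` may use different generation settings.

```python
# Illustrative polishing sketch (generation parameters are assumptions).
from transformers import pipeline

polisher = pipeline("summarization", model="google/pegasus-xsum")

def polish_text_sketch(explanation: str) -> str:
    """Rephrase a draft explanation into a shorter, more fluent statement."""
    result = polisher(explanation, max_length=64, min_length=16, do_sample=False)
    return result[0]["summary_text"]

print(polish_text_sketch(
    "The claim is contradicted by several sources, which report that the event "
    "described did not take place on the stated date."
))
```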
- Python 3.8+
- Node.js (v16+)
- npm
- PostgreSQL (local or cloud-hosted)
- Google Custom Search API key and Engine ID
- Git
- Clone the Repository: `git clone https://github.com/DarkenStars/VeriFact.git`, then `cd VeriFact/backend`
- Create a Virtual Environment (recommended): `python -m venv .venv`, then activate it with `.venv\Scripts\Activate.ps1` on Windows or `source .venv/bin/activate` on Linux/Mac
- Install Dependencies: `pip install -r requirements.txt`
- Set Up Environment Variables: create a `.env` file in `backend/` with the following keys (see the configuration-loading sketch after these steps):
  - `API_KEY=<your-google-api-key>`
  - `SEARCH_ENGINE_ID=<your-google-custom-search-engine-id>`
  - `DB_NAME=<database-name>`
  - `DB_USER=<database-user>`
  - `DB_PASSWORD=<database-password>`
  - `DB_HOST=<database-host, e.g., localhost>`
  - `DB_PORT=<database-port, e.g., 5432>`
- Run the Backend:
  - Option 1, CLI mode: `python main.py`
  - Option 2, FastAPI server: `python app.py` or `python -m uvicorn app:app --host 0.0.0.0 --port 5000 --reload`
- Run the Telegram Bot (optional): `python bot_tele.py`
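For reference, these settings are presumably read with `python-dotenv` (listed in the backend stack). A minimal sketch of the loading pattern, assuming the variable names shown above:

```python
# Sketch of configuration loading with python-dotenv (variable names match the
# .env example above; how the real modules consume them may differ).
import os
from dotenv import load_dotenv

load_dotenv()  # reads backend/.env into the process environment

GOOGLE_API_KEY = os.getenv("API_KEY")
SEARCH_ENGINE_ID = os.getenv("SEARCH_ENGINE_ID")
DB_SETTINGS = {
    "dbname": os.getenv("DB_NAME"),
    "user": os.getenv("DB_USER"),
    "password": os.getenv("DB_PASSWORD"),
    "host": os.getenv("DB_HOST", "localhost"),
    "port": os.getenv("DB_PORT", "5432"),
}
```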
- Navigate to the Frontend Directory: `cd VeriFact/frontend`
- Install Dependencies: `npm install`
- Run the Development Server: `npm run dev`, then access the app at `http://localhost:5173` (or the URL shown in the terminal)
- Build for Production: `npm run build`
- Lint Code: `npm run lint`
For Windows users, use the provided PowerShell script:

`./start.ps1`

Or run the equivalent commands manually:

`Set-Location backend`
`.venv\Scripts\Activate.ps1`
`Start-Process powershell -ArgumentList "python -m uvicorn app:app --host 0.0.0.0 --port 5000 --reload"`
`python .\bot_tele.py`

The script will:
- Activate the virtual environment
- Start the FastAPI server
- Launch the Telegram bot
Web interface:
- Start the FastAPI backend and Vue frontend
- Navigate to `http://localhost:5173`
- Enter a claim in the input field
- Review the verdict, confidence score, and supporting evidence

CLI:
- Run `python main.py`
- Enter claims when prompted
- View results in the terminal

Telegram bot:
- Configure the bot token in `.env`
- Run `python bot_tele.py`
- Interact with the bot on Telegram
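For orientation, a minimal echo-style bot is sketched below using `python-telegram-bot` v20+. Whether `bot_tele.py` actually uses this library is an assumption, and the token placeholder is hypothetical (the real bot reads its token from `.env`).

```python
# Minimal Telegram bot sketch (assumes python-telegram-bot >= 20; bot_tele.py
# may use a different library and wires messages into the VeriFact pipeline).
from telegram import Update
from telegram.ext import Application, ContextTypes, MessageHandler, filters

async def check_claim(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    claim = update.message.text
    # The real bot would run the fact-checking pipeline here and reply with
    # the verdict, confidence, and polished explanation.
    await update.message.reply_text(f"Received claim: {claim}")

def main() -> None:
    token = "YOUR_BOT_TOKEN"  # placeholder; the real token comes from .env
    app = Application.builder().token(token).build()
    app.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, check_claim))
    app.run_polling()

if __name__ == "__main__":
    main()
```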
- Enhanced Frontend: Complete the Vue 3 frontend with features like claim history and source visualization
- Multi-Source Search: Incorporate additional search APIs (e.g., X search, Bing) for broader evidence collection
- Media Support: Add verification for images, videos, or audio using computer vision models
- Multi-Language Support: Extend NLI and polishing models to handle non-English claims
- Performance Optimization: Use lighter ML models or caching mechanisms to reduce latency
- Real-time Analysis: Implement streaming results for long-running fact checks
- User Accounts: Add authentication and personalized claim history
- Saksham Pahariya - Frontend Lead, Deployment
- Mehul Batham - Frontend, UI/UX, Documentation
- Yathartha Jain - Backend, Machine Learning
- Vasant Kumar Mogia - Backend, Machine Learning, Debugging
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License. See the LICENSE file for details.
Built with ❤️ to combat misinformation