Bachelor's diploma project. News aggregator website with text summarization, question answering and topic modelling. Uses Airflow to schedule the aggregation process.
Features:
- Aggregates news from different user-selected sources on a schedule
- Provides a concise summary for each news article
- Groups aggregated news by topic keywords
- Generates answers to questions the user asks about an article
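
As an illustration of the grouping feature, here is a minimal sketch that assumes each article already carries a list of topic keywords. The `group_by_topic` function and the `title`/`keywords` fields are hypothetical names chosen for this example; the actual topic modelling code lives in "dags" and may work differently.

```python
from collections import defaultdict


def group_by_topic(articles):
    """Map each topic keyword to the titles of the articles tagged with it."""
    groups = defaultdict(list)
    for article in articles:
        for keyword in article["keywords"]:
            # Normalize case so "Banks" and "banks" land in the same group.
            groups[keyword.lower()].append(article["title"])
    return dict(groups)


# Hypothetical sample data, not taken from the project.
articles = [
    {"title": "Rates rise again", "keywords": ["Economy", "Banks"]},
    {"title": "New chip unveiled", "keywords": ["Technology"]},
    {"title": "Bank merger approved", "keywords": ["banks"]},
]
grouped = group_by_topic(articles)
```

An article with several keywords appears in several groups, which is probably what a reader browsing by topic expects.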
Todo:
- Read Airflow/PostgreSQL/Django passwords and secret keys from environment variables
- Use a proper web server
- Host this somewhere
- Don't show all of the aggregated articles at once; load more when the user scrolls to the end of the page
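
For the environment-variable item, a possible approach is sketched below. This is not the project's current code: `require_env` and `database_settings` are hypothetical helpers, and `POSTGRES_HOST`, `POSTGRES_PORT` and `DJANGO_SECRET_KEY` are assumed variable names (`POSTGRES_DB`, `POSTGRES_USER` and `POSTGRES_PASSWORD` follow the official PostgreSQL Docker image).

```python
import os


def require_env(name, environ=os.environ):
    """Return the value of a required environment variable, or fail loudly."""
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


def database_settings(environ=os.environ):
    """Build a Django-style DATABASES["default"] entry from the environment."""
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": environ.get("POSTGRES_DB", "postgres"),
        "USER": environ.get("POSTGRES_USER", "postgres"),
        # No default for the password: a missing secret should stop startup.
        "PASSWORD": require_env("POSTGRES_PASSWORD", environ),
        "HOST": environ.get("POSTGRES_HOST", "localhost"),
        "PORT": environ.get("POSTGRES_PORT", "5432"),
    }
```

In settings.py this could be used as `DATABASES = {"default": database_settings()}` and `SECRET_KEY = require_env("DJANGO_SECRET_KEY")`, so that no secret is committed to the repository.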
Project folders:
- "dags" - contains the files for the news aggregation DAG, including text summarization and topic modelling
- "django_website" - contains the files for the Django website
Container structure: Django website container <=> PostgreSQL container <=> Airflow containers for aggregation & NLP processes
How to launch and how to use: Unfortunately, the project currently has deployment issues related to database initialization.