Real-Time News Recommendation System

The Real-Time News Recommendation System is designed to deliver personalized, up-to-date news content tailored to individual user preferences and behaviors. With the rapid growth of digital platforms as primary news sources, users face an overwhelming flood of information, making it challenging to find content relevant to their interests. This project addresses this issue by leveraging machine learning models, real-time data streaming, and a scalable framework to create a seamless and curated news experience.

Key Features

  1. Personalized Recommendations: Uses advanced recommendation algorithms to match news content with individual user preferences.
  2. Real-Time Processing: Processes and analyzes news data in real time to ensure recommendations are always relevant and timely.
  3. User Behavior Analysis: Tracks user interactions, such as click-through rates and reading patterns, to gain insights into behavior.
  4. Dynamic Feedback Loop: Continuously updates and improves recommendations based on user behavior and preferences.
  5. Scalable Framework: Designed to handle high volumes of data and users, ensuring consistent performance.

Technical Overview

The system integrates several robust technologies:

  1. Machine Learning Models: To predict and recommend content based on user preferences and behavior.
  2. Real-Time Data Streaming: Ensures immediate processing and analysis of incoming data.
  3. Scalable Infrastructure: Built with technologies like Kafka, BigQuery, and Databricks to handle massive data volumes.
  4. User Interaction Monitoring: Tracks and analyzes click-through rates and engagement metrics to improve recommendations dynamically.
  5. Frontend & Backend Integration: Seamless flow from user interaction to backend processing and recommendation delivery.

Deployment Steps

1. Kafka Deployment on GKE

  1. Create a GKE cluster:
    gcloud container clusters create kafka-cluster \  
      --num-nodes=3 \  
      --zone=us-central1-a  
  3. Authenticate to the cluster and verify the nodes:
    gcloud container clusters get-credentials kafka-cluster --zone=us-central1-a  
    kubectl get nodes  
  3. Install Helm:
    curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash  
    helm repo add bitnami https://charts.bitnami.com/bitnami  
    helm repo update  
  4. Deploy Kafka with Zookeeper using Helm:
    helm install kafka bitnami/kafka \  
      --set replicaCount=3 \  
      --set zookeeper.replicaCount=3 \  
      --set externalAccess.enabled=true \  
      --set externalAccess.service.type=LoadBalancer \  
      --set externalAccess.autoDiscovery.enabled=true \  
      --set rbac.create=true \  
      --set controller.automountServiceAccountToken=true \  
      --set broker.automountServiceAccountToken=true  
  5. Check the Kafka logs and the exposed services:
    kubectl logs kafka-controller-0  
    kubectl get svc  
  6. Install metrics-server for resource monitoring:
    kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.1/components.yaml
    kubectl get deployment metrics-server -n kube-system  
    kubectl logs deployment/metrics-server -n kube-system  
    kubectl top pods  
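
Once the pods are ready, a quick smoke test can confirm end-to-end connectivity. This is a hedged sketch: the topic name clickstream and the message fields are illustrative assumptions, not fixed by this repository.

    # Start a throwaway client pod with the Kafka CLI tools
    kubectl run kafka-client --rm -ti --image bitnami/kafka:latest --command -- bash
    # Inside the client pod: create the topic the pipeline will use
    kafka-topics.sh --create --topic clickstream --partitions 3 --replication-factor 3 --bootstrap-server kafka:9092
    # Produce a test event and read it back
    echo '{"user_id": 1, "article_id": 42}' | kafka-console-producer.sh --topic clickstream --bootstrap-server kafka:9092
    kafka-console-consumer.sh --topic clickstream --from-beginning --max-messages 1 --bootstrap-server kafka:9092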

2. Cloud SQL Setup

  1. Create a Cloud SQL database instance.
  2. Set up the database and table to store user information.
  3. Ensure the backend is updated with the database connection details, such as instance name, database name, and table name.
  4. Open the required port (3306) in the firewall for database access (example commands below).
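
A minimal gcloud sketch of these steps; the instance, database, and firewall-rule names are placeholders chosen for illustration.

    # Create a MySQL instance (name, tier, and region are placeholders)
    gcloud sql instances create news-db \
      --database-version=MYSQL_8_0 \
      --tier=db-n1-standard-1 \
      --region=us-central1
    # Create the database that stores user information
    gcloud sql databases create users --instance=news-db
    # Open port 3306 (restrict source ranges in production)
    gcloud compute firewall-rules create allow-mysql --allow=tcp:3306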

3. BigQuery Setup

  1. Create BigQuery credentials to enable secure access.
  2. Set up a dataset and a table in BigQuery to store processed clickstream data.
  3. Update the backend with the credentials in the required configuration file (example CLI commands below).
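
One hedged way to do this with the gcloud and bq CLIs; the service-account, dataset, and table names, and the event schema, are assumptions.

    # Service account and key for the backend (names are placeholders)
    gcloud iam service-accounts create bq-writer
    gcloud projects add-iam-policy-binding [PROJECT_ID] \
      --member=serviceAccount:bq-writer@[PROJECT_ID].iam.gserviceaccount.com \
      --role=roles/bigquery.dataEditor
    gcloud iam service-accounts keys create bq-key.json \
      --iam-account=bq-writer@[PROJECT_ID].iam.gserviceaccount.com
    # Dataset and table for processed clickstream data
    bq mk --dataset [PROJECT_ID]:clickstream
    bq mk --table clickstream.events user_id:INTEGER,article_id:INTEGER,event_time:TIMESTAMP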

4. Databricks and ADF Setup

  1. Create a new Azure resource group and an Azure Data Factory (ADF) instance inside it (see the CLI sketch after this list).
  2. Create an Azure Databricks workspace and link it to ADF, for example through an Azure Databricks linked service.
  3. Connect BigQuery to Databricks:
    • Provide BigQuery credentials to Databricks for connectivity and data processing.
    • Configure the Databricks environment to read and write data to and from BigQuery.
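
An Azure CLI sketch of steps 1 and 2; the names and region are placeholders, and the datafactory and databricks commands each require their Azure CLI extension.

    # Resource group for all Azure resources (name and location are placeholders)
    az group create --name news-rec-rg --location eastus
    # Data factory instance (requires the datafactory extension)
    az datafactory create --resource-group news-rec-rg --factory-name news-rec-adf
    # Databricks workspace (requires the databricks extension)
    az databricks workspace create --resource-group news-rec-rg \
      --name news-rec-dbx --location eastus --sku standard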

5. Backend Deployment Using Cloud Run

  1. Authenticate and set up Google Cloud services:
    gcloud auth login  
    gcloud config set project [PROJECT_ID]  
    gcloud services enable artifactregistry.googleapis.com  
    gcloud services enable run.googleapis.com  
    gcloud services enable cloudbuild.googleapis.com  
  2. Add a Dockerfile to containerize the backend application (a sample Dockerfile is shown after this list).
  3. Test the Docker build locally:
    docker build -t flask-backend .  
    docker run -p 8080:8080 flask-backend  
  4. Create an Artifact Registry repository:
    gcloud artifacts repositories create my-repo --repository-format=docker --location=us-central1  
  5. Tag and push the Docker image:
    docker tag flask-backend us-central1-docker.pkg.dev/[PROJECT_ID]/my-repo/flask-backend:latest  
    gcloud auth configure-docker us-central1-docker.pkg.dev  
    docker push us-central1-docker.pkg.dev/[PROJECT_ID]/my-repo/flask-backend:latest  
  6. Deploy the backend to Cloud Run:
    gcloud run deploy flask-backend \  
      --image us-central1-docker.pkg.dev/[PROJECT_ID]/my-repo/flask-backend:latest \  
      --platform managed \  
      --region us-central1 \  
      --allow-unauthenticated \  
      --port 8080  
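
A minimal sample Dockerfile for step 2, assuming a Flask app served by gunicorn with an entry module app.py; both filenames and the gunicorn dependency are assumptions about this repository's layout.

    # Dockerfile (app.py, requirements.txt, and gunicorn are assumed)
    FROM python:3.10-slim
    WORKDIR /app
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt
    COPY . .
    # Cloud Run routes traffic to this port (matches --port 8080 above)
    EXPOSE 8080
    CMD ["gunicorn", "--bind", "0.0.0.0:8080", "app:app"]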

6. Frontend Deployment Using Cloud Storage

  1. Create a Cloud Storage bucket and upload all static frontend files.
  2. Update the JavaScript files with the correct backend API endpoint.
  3. Deploy the updated files to the bucket (example gsutil commands below).
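
For example, with gsutil (the bucket name is a placeholder):

    # Create the bucket and upload the static files
    gsutil mb -l us-central1 gs://news-rec-frontend
    gsutil -m cp -r ./frontend/* gs://news-rec-frontend
    # Make the contents publicly readable and set the default page
    gsutil iam ch allUsers:objectViewer gs://news-rec-frontend
    gsutil web set -m index.html gs://news-rec-frontend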

7. Configuration and Finalization

  1. Open firewall ports:
    • 3306 for MySQL
    • 8080 for Cloud Run
    • 9004 for Kafka
  2. Ensure all services and instances are deployed in the same zone to reduce latency.
  3. Configure IAM policies to allow Cloud Run to access Kafka, Cloud SQL, and BigQuery (example commands below).
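
Hedged examples of these commands; the rule names are placeholders, and [SERVICE_ACCOUNT_EMAIL] stands for the Cloud Run runtime service account.

    # 3306 was opened in the Cloud SQL step; open the remaining ports
    gcloud compute firewall-rules create allow-backend --allow=tcp:8080
    gcloud compute firewall-rules create allow-kafka --allow=tcp:9004
    # Grant the Cloud Run service account access to Cloud SQL and BigQuery
    gcloud projects add-iam-policy-binding [PROJECT_ID] \
      --member=serviceAccount:[SERVICE_ACCOUNT_EMAIL] \
      --role=roles/cloudsql.client
    gcloud projects add-iam-policy-binding [PROJECT_ID] \
      --member=serviceAccount:[SERVICE_ACCOUNT_EMAIL] \
      --role=roles/bigquery.dataEditor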

Following this sequence deploys the project end to end.
