brave-news-source-suggestion

Service that produces the source embedding representations and the similarity matrix needed for the source suggestion feature in Brave News.

Installation

pip install -r requirements.txt

Scripts

source-feed-accumulator.py: periodically parses the Brave News feed and collects the articles for each source in articles_history.csv. For each article, we store the publisher_id attribute.
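The accumulation step can be pictured with a minimal sketch like the one below. It is not the script's actual code: the title and url columns, the de-duplication by URL, and the accumulate helper are all assumptions made for illustration. Only publisher_id is named in this README.

```python
import csv
import io

# Hypothetical column layout: only publisher_id is named in this README,
# title and url are assumptions made for the example.
FIELDNAMES = ["publisher_id", "title", "url"]


def accumulate(history_file, new_articles):
    """Append articles that are not yet in the history (keyed by URL).

    history_file is any seekable text file object opened for reading and
    writing. Returns the number of rows that were added.
    """
    # Collect the URLs already stored so repeated feed polls add nothing twice.
    history_file.seek(0)
    seen = {row["url"] for row in csv.DictReader(history_file)}

    # Move to the end of the file and write a header if the file is empty.
    history_file.seek(0, io.SEEK_END)
    writer = csv.DictWriter(history_file, fieldnames=FIELDNAMES)
    if history_file.tell() == 0:
        writer.writeheader()

    added = 0
    for article in new_articles:
        if article["url"] not in seen:
            writer.writerow({name: article[name] for name in FIELDNAMES})
            seen.add(article["url"])
            added += 1
    return added
```

With a real file, this could be called as accumulate(open("articles_history.csv", "a+", newline=""), batch), where batch is a list of dictionaries parsed from the feed. Calling it twice with the same batch leaves the history unchanged the second time.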

sources-similarity-matrix.py: takes the article history as input and produces a 384-dimensional embedding for each source, using the sentence-transformers package. Specifically, it uses:

all-MiniLM-L6-v2 for English-language sources.
paraphrase-multilingual-MiniLM-L12-v2 for non-English-language sources.

Once all source embeddings are generated, a pairwise source similarity matrix is produced.
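The model choice and the pairwise matrix can be sketched as follows. This is an illustration under stated assumptions, not the script's implementation: the README does not name the similarity metric, so cosine similarity is assumed here, and the language-code check, the helper names, and the toy 3-dimensional vectors (standing in for the real 384-dimensional embeddings) are all hypothetical.

```python
import math

# Model names as listed in this README.
ENGLISH_MODEL = "all-MiniLM-L6-v2"
MULTILINGUAL_MODEL = "paraphrase-multilingual-MiniLM-L12-v2"


def model_for_language(lang):
    """Pick a model name from a language code such as "en" or "fr".

    Treating every code that starts with "en" as English is an assumption.
    """
    return ENGLISH_MODEL if lang.lower().startswith("en") else MULTILINGUAL_MODEL


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(embeddings):
    """Build a pairwise similarity matrix from {source_id: embedding}.

    Returns the sorted source ids and a square matrix where entry [i][j]
    is the similarity between source i and source j.
    """
    ids = sorted(embeddings)
    matrix = [[cosine(embeddings[a], embeddings[b]) for b in ids] for a in ids]
    return ids, matrix


# Toy embeddings standing in for the 384-dimensional source vectors.
toy = {"pub-a": [1.0, 0.0, 0.0], "pub-b": [0.0, 1.0, 0.0], "pub-c": [1.0, 1.0, 0.0]}
ids, matrix = similarity_matrix(toy)
```

In this toy example pub-a and pub-b are orthogonal, so their similarity is 0, while pub-c lies between them and is equally similar to both. For a given source, the most similar other sources in its row would be the natural suggestion candidates.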

Running locally

To collect and accumulate article history:

export NO_UPLOAD=1
export NO_DOWNLOAD=1
python source-feed-accumulator.py

To compute source embeddings and produce the source similarity matrix:

export NO_UPLOAD=1
export NO_DOWNLOAD=1
python sources-similarity-matrix.py

