CBOW and Skip-Gram Word Embeddings Notebook

In this notebook, I implement word embeddings for Arabic and English using the CBOW (Continuous Bag-of-Words) and Skip-Gram algorithms. Word embeddings represent words as dense vectors in a continuous vector space, where words that appear in similar contexts end up close together. This lets the vectors capture semantic relationships between words.

I demonstrate how to preprocess the text data, train the CBOW and Skip-Gram models on each corpus, and visualize the resulting embeddings using t-SNE (t-Distributed Stochastic Neighbor Embedding).
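To make the difference between the two algorithms concrete, here is a minimal sketch of how training examples can be built from a tokenized sentence. It is not the notebook's actual code: the helper names (`tokenize`, `context_window`, `cbow_pairs`, `skipgram_pairs`), the whitespace tokenizer, and the window size are all assumptions chosen for illustration. CBOW predicts a target word from its surrounding context, while Skip-Gram predicts each context word from the center word.

```python
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def context_window(tokens: List[str], i: int, window: int) -> List[str]:
    """Return up to `window` tokens on each side of position i."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


def cbow_pairs(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """CBOW examples: (context words, target word)."""
    pairs = []
    for i, target in enumerate(tokens):
        context = context_window(tokens, i, window)
        if context:  # skip tokens that have no neighbors
            pairs.append((context, target))
    return pairs


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Skip-Gram examples: (center word, one context word)."""
    return [
        (center, context_word)
        for i, center in enumerate(tokens)
        for context_word in context_window(tokens, i, window)
    ]


# Hypothetical example sentence, not taken from the notebook's corpus.
tokens = tokenize("The tree has grown bigger")
print(cbow_pairs(tokens, window=1))
print(skipgram_pairs(tokens, window=1))
```

With the same window, Skip-Gram produces one example per (center, context) pair, so it yields more training examples than CBOW, which produces one example per target word.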

Here are sample visualizations of the word embeddings generated by the Skip-Gram algorithm on the English corpus:

Fig. 1. Visualization of the word embeddings generated by the Skip-Gram algorithm on the English corpus, showing that the words "bigger" and "grown" lie close to each other, which suggests they are similar.

Fig. 2. Visualization of the word embeddings generated by the Skip-Gram algorithm on the English corpus, showing that the word "grown" lies close to the word "bigger" (see Fig. 1), which suggests they are similar.
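Closeness in a t-SNE plot is only suggestive, because t-SNE projects the vectors down to two dimensions. A more direct check is cosine similarity in the original embedding space. The sketch below shows the computation with made-up three-dimensional vectors. The values in `toy_vectors` are hypothetical and are not taken from the trained models.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v; 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings, for illustration only.
toy_vectors = {
    "bigger": [0.9, 0.1, 0.3],
    "grown": [0.8, 0.2, 0.4],
    "car": [-0.2, 0.9, -0.5],
}

print(cosine_similarity(toy_vectors["bigger"], toy_vectors["grown"]))
print(cosine_similarity(toy_vectors["bigger"], toy_vectors["car"]))
```

In this toy setup, "bigger" and "grown" score higher with each other than either does with "car", which is the pattern the figures suggest for the trained embeddings.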


Mahran-xo/CBOW-SKipGram
