This repo contains the code snippets used in the talk.
Link to the slide deck: https://prezi.com/view/9emf6rIvvWXcAkxb8ULO/
References:
- https://towardsdatascience.com/word-embedding-in-nlp-one-hot-encoding-and-skip-gram-neural-network-81b424da58f2
- https://machinelearningmastery.com/how-to-one-hot-encode-sequence-data-in-python/
- http://jalammar.github.io/illustrated-word2vec/
- https://gist.github.com/aparrish/2f562e3737544cf29aaf1af30362f469
- https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/
- https://towardsdatascience.com/an-implementation-guide-to-word2vec-using-numpy-and-google-sheets-13445eebd281
- https://towardsdatascience.com/nlp-101-negative-sampling-and-glove-936c88f3bc68
- https://towardsdatascience.com/word-embeddings-in-2020-review-with-code-examples-11eb39a1ee6d
- Gensim Word2Vec tutorial starter code (see the Word2Vec sketch after this list): https://kavita-ganesan.com/gensim-word2vec-tutorial-starter-code/
- https://cai.tools.sap/blog/glove-and-fasttext-two-popular-word-vector-models-in-nlp/
- https://towardsdatascience.com/nlp-extract-contextualized-word-embeddings-from-bert-keras-tf-67ef29f60a7b
- https://medium.com/@dhartidhami/understanding-bert-word-embeddings-7dc4d2ea54ca
- https://medium.com/@_init_/why-bert-has-3-embedding-layers-and-their-implementation-details-9c261108e28a
- https://colab.research.google.com/drive/1ZQvuAVwA3IjybezQOXnrXMGAnMyZRuPU#scrollTo=UeQNEFbUgMSf
- https://huggingface.co/bert-base-uncased (see the BERT embedding sketch after this list)
- Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
- Pennington, J., Socher, R., & Manning, C. D. (2014, October). GloVe: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1532-1543).
- Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5, 135-146.
- Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. arXiv preprint arXiv:1802.05365.
- Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
- Cross-lingual embeddings: https://ruder.io/cross-lingual-embeddings/
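
As a quick companion to the gensim Word2Vec starter-code tutorial linked above, here is a minimal sketch of training skip-gram vectors with gensim. It assumes gensim 4.x and uses a tiny hand-made toy corpus; the talk's actual snippets may differ.

```python
# Minimal Word2Vec training sketch (assumes gensim >= 4.0 and a toy corpus).
from gensim.models import Word2Vec

# Toy corpus: each sentence is a list of pre-tokenised, lower-cased words.
sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["cats", "and", "dogs", "are", "animals"],
]

# sg=1 selects the skip-gram architecture; sg=0 would use CBOW instead.
model = Word2Vec(
    sentences,
    vector_size=50,   # dimensionality of the word vectors
    window=2,         # context window size
    min_count=1,      # keep every word in this tiny corpus
    sg=1,             # skip-gram
    negative=5,       # negative sampling
    epochs=50,
)

print(model.wv["king"][:5])            # first 5 dimensions of the "king" vector
print(model.wv.most_similar("king"))   # nearest neighbours by cosine similarity
```

Switching `sg=0` trains the CBOW variant described in Mikolov et al. (2013); both produce one static vector per vocabulary word.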
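
For the contextualised-embedding references (the BERT embedding articles and the bert-base-uncased model card), this is a small sketch of extracting per-token vectors with the Hugging Face `transformers` library and a PyTorch backend. It is an illustrative assumption, not necessarily the exact approach used in the talk's notebook.

```python
# Minimal sketch: contextualised token embeddings from bert-base-uncased
# (assumes the `transformers` and `torch` packages are installed).
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

sentence = "The bank raised interest rates."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state has shape (batch, tokens, 768): one vector per WordPiece token.
token_embeddings = outputs.last_hidden_state[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
for tok, vec in zip(tokens, token_embeddings):
    print(tok, vec[:3].tolist())  # print the first 3 dimensions per token
```

Because the vectors come from `last_hidden_state`, the same surface word receives different embeddings in different sentence contexts, which is the key contrast with the static Word2Vec/GloVe/fastText vectors referenced above.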