-
Algorithms
- Bag of Words
- N-gram model
- TF-IDF
- HMMs (Hidden Markov Models)
- Part 1: https://medium.com/@ompramod9921/hidden-markov-models-the-secret-sauce-in-natural-language-processing-98cde0372721
- Part 2: https://medium.com/@ompramod9921/hidden-markov-models-the-secret-sauce-in-natural-language-processing-38fbcc010bda
- https://web.stanford.edu/~jurafsky/slp3/A.pdf
- https://wisdomml.in/hidden-markov-model-hmm-in-nlp-python/
- Word2Vec
- GloVe
- NA
- RNN [Many to One]
- LSTM [One to Many]
- GRU [Many to Many Synced]
- Seq2Seq + Attention [Many to Many]
- Transformer
- BERT
- GPT-2
-
Datasets
- Dataset Blog
- Making a Python Library
- https://snap.stanford.edu/data/web-Amazon-links.html
- https://cseweb.ucsd.edu/~jmcauley/datasets/amazon_v2/
- https://github.com/sstoikov/piki-music-dataset/tree/main/data
- https://www.upf.edu/web/mtg/mard
- https://www.statmt.org/wmt14/translation-task.html#download
- https://www.statmt.org/europarl/ [Seq2Seq]
- https://github.com/micbuffa/WasabiDataset [Text Generation]
- https://huggingface.co/datasets/brunokreiner/genius-lyrics/tree/main [Text Generation]
- https://www.kaggle.com/datasets/nbroad/wiki-20220301-en-sci [BERT Pretraining]
- https://paperswithcode.com/dataset/lambada [GPT-2 Training]
-
Coding style
-
PIP
- https://betterscientificsoftware.github.io/python-for-hpc/tutorials/python-pypi-packaging/
- https://packaging.python.org/en/latest/guides/writing-pyproject-toml/
- https://github.com/github/gitignore/blob/main/Python.gitignore
- https://medium.com/@blackary/publishing-a-python-package-from-github-to-pypi-in-2024-a6fb8635d45d / https://youtu.be/90PWQEc--6k?si=NK3byavwtsiFZyHK
-
Documentation
- https://testdriven.io/blog/documenting-python/
- https://medium.com/@peterkong/comparison-of-python-documentation-generators-660203ca3804
- https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html
- https://samnicholls.net/2016/06/15/how-to-sphinx-readthedocs/ ***
- https://stackoverflow.com/a/71489121
- https://www.sphinx-doc.org/en/master/man/sphinx-apidoc.html [Sphinx RST Files Generator]
- https://www.sphinx-doc.org/en/master/tutorial/deploying.html [Deploying]
- https://redandgreen.co.uk/sphinx-to-github-pages-via-github-actions/
- conda install pandoc (instead of pip install pandoc)
- pandoc -s <md path> -o <rst path> [Markdown to RST]
- Sample Documentation: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
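The pandoc conversion noted in this list can also be scripted when several Markdown files need converting for Sphinx. This is a minimal sketch, assuming the pandoc binary is already installed and on PATH; the helper names `build_pandoc_command` and `convert_md_to_rst` and the example file paths are hypothetical, not part of any library.

```python
import shutil
import subprocess


def build_pandoc_command(md_path, rst_path):
    """Return the argument list for: pandoc -s <md path> -o <rst path>."""
    # -s asks pandoc for a standalone document, -o sets the output file.
    return ["pandoc", "-s", str(md_path), "-o", str(rst_path)]


def convert_md_to_rst(md_path, rst_path):
    """Run pandoc on one file (hypothetical helper, assumes pandoc is on PATH)."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc binary not found; install it first")
    # check=True raises CalledProcessError if pandoc exits with an error.
    subprocess.run(build_pandoc_command(md_path, rst_path), check=True)


# Build the command for illustrative paths without running it.
command = build_pandoc_command("README.md", "docs/readme.rst")
print(" ".join(command))
```

Calling `convert_md_to_rst("README.md", "docs/readme.rst")` would run the same command; the output format is inferred by pandoc from the `.rst` extension.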
-
README
-
Logging