Pinned
-
All_things_attention (Public)
Comparison of different kinds of attention
Jupyter Notebook 1
-
OLMo (Public, forked from allenai/OLMo)
Modeling, training, eval, and inference code for OLMo
Python
-
huggingface/trl (Public)
Train transformer language models with reinforcement learning.
-
allenai/OLMo (Public)
Modeling, training, eval, and inference code for OLMo
-
deepseek-mla (Public)
Implementation of DeepSeek's Multi-head Latent Attention architecture
Python