- falcon-mamba-7b - tiiuae 🤗
- DCLM-7B - apple 🤗
- DCLM-1B - TRI-ML 🤗
- Minitron - a nvidia Collection · (huggingface)
- SmolLM - blazingly fast and remarkably powerful · (huggingface) · (huggingface) · (huggingface)
- H2O-Danube3 Technical Report, arXiv, 2407.09276, arxiv, pdf, citation: -1, Pascal Pfeiffer, Philipp Singer, Yauhen Babakhin, Gabor Fodor, Nischay Dhankhar, Sri Satish Ambati · (huggingface)
- gemini-nano - wave-on-discord 🤗
- GEB-1.3B: Open Lightweight Large Language Model, arXiv, 2406.09900, arxiv, pdf, citation: -1, Jie Wu, Yufeng Zhu, Lei Shen, Xuqing Lu
- Nemotron-4-340B-Instruct - nvidia 🤗 · (research.nvidia)
- WizardLM-2-8x22B - alpindale 🤗 · (wizardlm.github) · (WizardLM - victorsungo)
- MAP-NEO - multimodal-art-projection
- Snowflake Arctic: The Best LLM for Enterprise AI — Efficiently Intelligent, Truly Open · (snowflake-arctic - Snowflake-Labs) · (huggingface) · (twitter)
- OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework, arXiv, 2404.14619, arxiv, pdf, citation: -1, Sachin Mehta, Mohammad Hossein Sekhavat, Qingqing Cao, Maxwell Horton, Yanzi Jin, Chenfan Sun, Iman Mirzadeh, Mahyar Najibi, Dmitry Belenko, Peter Zatloukal · (huggingface) · (corenet - apple)
- Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models, arXiv, 2404.12387, arxiv, pdf, citation: -1, Aitor Ormazabal, Che Zheng, Cyprien de Masson d'Autume, Dani Yogatama, Deyu Fu, Donovan Ong, Eric Chen, Eugenie Lamprecht, Hai Pham, Isaac Ong
- Rho-1: Not All Tokens Are What You Need, arXiv, 2404.07965, arxiv, pdf, citation: -1, Zhenghao Lin, Zhibin Gou, Yeyun Gong, Xiao Liu, Yelong Shen, Ruochen Xu, Chen Lin, Yujiu Yang, Jian Jiao, Nan Duan · (rho - microsoft)
- MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies, arXiv, 2404.06395, arxiv, pdf, citation: -1, Shengding Hu, Yuge Tu, Xu Han, Chaoqun He, Ganqu Cui, Xiang Long, Zhi Zheng, Yewei Fang, Yuxiang Huang, Weilin Zhao · (MiniCPM - OpenBMB)
- Stable LM 2 1.6B Technical Report, arXiv, 2402.17834, arxiv, pdf, citation: -1, Marco Bellagente, Jonathan Tow, Dakota Mahan, Duy Phung, Maksym Zhuravinskyi, Reshinth Adithyan, James Baicoianu, Ben Brooks, Nathan Cooper, Ashish Datta
- stablelm-2-12b - stabilityai 🤗
- c4ai-command-r-plus - CohereForAI 🤗
- JetMoE: Reaching Llama2 Performance with 0.1M Dollars, arXiv, 2404.07413, arxiv, pdf, citation: -1, Yikang Shen, Zhen Guo, Tianle Cai, Zengyi Qin
- JetMoE - myshell-ai
Reaching LLaMA2 Performance with 0.1M Dollars · (research.myshell) · (huggingface) · (qbitai)
- Poro 34B and the Blessing of Multilinguality, arXiv, 2404.01856, arxiv, pdf, citation: -1, Risto Luukkonen, Jonathan Burdge, Elaine Zosa, Aarne Talman, Ville Komulainen, Väinö Hatanpää, Peter Sarlin, Sampo Pyysalo · (huggingface)
- MicroLlama - keeeeenw
MicroLlama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget
- grok-1 - xai-org
Grok open release · (x) · (huggingface) · (qbitai) · (huggingface) · (mp.weixin.qq)
- c4ai-command-r-v01 - CohereForAI 🤗 · (txt.cohere) · (huggingface)
- miqu-1-70b - miqudev 🤗
- H2O-Danube-1.8B Technical Report, arXiv, 2401.16818, arxiv, pdf, citation: -1, Philipp Singer, Pascal Pfeiffer, Yauhen Babakhin, Maximilian Jeblick, Nischay Dhankhar, Gabor Fodor, Sri Satish Ambati
- Smaug-72B-v0.1 - abacusai 🤗
- Smaug-34B-v0.1 - abacusai 🤗
- bagel-34b-v0.2 - jondurbin 🤗
- TinyLlama: An Open-Source Small Language Model, arXiv, 2401.02385, arxiv, pdf, citation: -1, Peiyuan Zhang, Guangtao Zeng, Tianduo Wang, Wei Lu · (TinyLlama - jzhang38)
- TigerBot: An Open Multilingual Multitask LLM, arXiv, 2312.08688, arxiv, pdf, citation: -1, Ye Chen, Wei Cai, Liangmin Wu, Xiaowei Li, Zhanxuan Xin, Cong Fu
- DeciLM-7B - Deci 🤗
- DeciLM-7B-instruct - Deci 🤗 · (huggingface)
- LLM360: Towards Fully Transparent Open-Source LLMs, arXiv, 2312.06550, arxiv, pdf, citation: -1, Zhengzhong Liu, Aurick Qiao, Willie Neiswanger, Hongyi Wang, Bowen Tan, Tianhua Tao, Junbo Li, Yuqi Wang, Suqi Sun, Omkar Pangarkar
- GPT4All: An Ecosystem of Open Source Compressed Language Models, arXiv, 2311.04931, arxiv, pdf, citation: -1, Yuvanesh Anand, Zach Nussbaum, Adam Treat, Aaron Miller, Richard Guo, Ben Schmidt, GPT4All Community, Brandon Duderstadt, Andriy Mulyar · (gpt4all - nomic-ai)
- OpenChat: Advancing Open-source Language Models with Mixed-Quality Data, arXiv, 2309.11235, arxiv, pdf, citation: -1, Guan Wang, Sijie Cheng, Xianyuan Zhan, Xiangang Li, Sen Song, Yang Liu · (openchat - imoneoi) · (huggingface) · (openchat)
- Zephyr: Direct Distillation of LM Alignment, arXiv, 2310.16944, arxiv, pdf, citation: 1, Lewis Tunstall, Edward Beeching, Nathan Lambert, Nazneen Rajani, Kashif Rasul, Younes Belkada, Shengyi Huang, Leandro von Werra, Clémentine Fourrier, Nathan Habib · (alignment-handbook - huggingface)
- H2O Open Ecosystem for State-of-the-art Large Language Models, arXiv, 2310.13012, arxiv, pdf, citation: -1, Arno Candel, Jon McKinney, Philipp Singer, Pascal Pfeiffer, Maximilian Jeblick, Chun Ming Lee, Marcos V. Conde
- BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model, arXiv, 2309.11568, arxiv, pdf, citation: -1, Nolan Dey, Daria Soboleva, Faisal Al-Khateeb, Bowen Yang, Ribhu Pathria, Hemant Khachane, Shaheer Muhammad, Zhiming Chen, Robert Myers
- OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch, arXiv, 2309.10706, arxiv, pdf, citation: -1, Juntao Li, Zecheng Tang, Yuyang Ding, Pinzheng Wang, Pei Guo, Wangjie You, Dan Qiao, Wenliang Chen, Guohong Fu, Qiaoming Zhu · (openba - opennlg)
- XGen-7B Technical Report, arXiv, 2309.03450, arxiv, pdf, citation: 3, Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause
- FLM-101B: An Open LLM and How to Train It with $100K Budget, arXiv, 2309.03852, arxiv, pdf, citation: 3, Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Xuying Meng, Siqi Fan, Peng Han, Jing Li, Li Du, Bowen Qin · (huggingface)
- adept-inference - persimmon-ai-labs
Inference code for Persimmon-8B
- WizardLM - nlpxucan
A family of instruction-following LLMs powered by Evol-Instruct: WizardLM, WizardCoder
- FreeWilly2 - stabilityai 🤗
- xgen - salesforce
Salesforce open-source LLMs with 8k sequence length
- PolyLM: An Open Source Polyglot Large Language Model, arXiv, 2307.06018, arxiv, pdf, citation: 5, Xiangpeng Wei, Haoran Wei, Huan Lin, Tianhao Li, Pei Zhang, Xingzhang Ren, Mei Li, Yu Wan, Zhiwei Cao, Binbin Xie
- A Technical Report for Polyglot-Ko: Open-Source Large-Scale Korean Language Models, arXiv, 2306.02254, arxiv, pdf, citation: -1, Hyunwoong Ko, Kichang Yang, Minho Ryu, Taekyoon Choi, Seungmu Yang, Jiwung Hyun, Sungho Park, Kyubyong Park
- Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models, arXiv, 2308.14149, arxiv, pdf, citation: -1, Kaiyuan Gao, Sunan He, Zhenyu He, Jiacheng Lin, QiZhi Pei, Jie Shao, Wei Zhang · (gpt_alternatives - GPT-Alternatives) · (jiqizhixin)
- huggingface-llama-recipes - huggingface
- Introducing Meta Llama 3: The most capable openly available LLM to date
- llama3 - meta-llama
The official Meta Llama 3 GitHub site
- Meta-Llama-3-8B-Instruct - meta-llama 🤗
- Meta-Llama-3-70B-Instruct - meta-llama 🤗
- Llama-3-Smaug-8B - abacusai 🤗
- Llama-3-8B-16K - mattshumer 🤗
- Llama-3-8B-Special-Tokens-Adjusted - astronomer 🤗
- Llama-3-8b-64k-PoSE - winglian 🤗
- dolphin-2.9-llama3-70b - cognitivecomputations 🤗
- Llama-3-8B-Instruct-262k - gradientai 🤗
- Meditron: An LLM suite especially suited for low-resource medical settings, leveraging Meta Llama
- llama-3-8b-256k-PoSE - winglian 🤗
- Llama-3-8B-Instruct-Gradient-1048k - gradientai 🤗
- Hermes-2-Pro-Llama-3-8B - NousResearch 🤗
- Llama3-ChatQA-1.5-8B - nvidia 🤗
- Meta-Llama-3-120B-Instruct - mlabonne 🤗
- Planning for Distillation of Llama 3 70b -> 4x8b / 25b : r/LocalLLaMA
- Llama-3-Refueled - refuelai 🤗
- Smaug-Llama-3-70B-Instruct - abacusai 🤗 · (reddit)
- Higgs-Llama-3-70B - bosonai 🤗
- Hermes-2-Theta-Llama-3-70B - NousResearch 🤗
- ArliAI-Llama-3-8B-Formax-v1.0 - ArliAI 🤗
- Llama-3-Groq-8B-Tool-Use - Groq 🤗
- Nexusflow.ai | Blog :: Athene-70B: Redefining the Boundaries of Post-Training for Open Models
- Athene-70B - Nexusflow 🤗
- Jamba: A Hybrid Transformer-Mamba Language Model, arXiv, 2403.19887, arxiv, pdf, citation: -1, Opher Lieber, Barak Lenz, Hofit Bata, Gal Cohen, Jhonathan Osin, Itay Dalmedigos, Erez Safahi, Shaked Meirom, Yonatan Belinkov, Shai Shalev-Shwartz · (huggingface)
- Jamba-v0.1-chat-multilingual - lightblue 🤗
- Jambatypus-v0.1 - mlabonne 🤗
- Introducing DBRX: A New State-of-the-Art Open LLM | Databricks
- dbrx - databricks
Code examples and resources for DBRX, a large language model developed by Databricks
- dbrx-instruct - databricks 🤗
- Gemma 2: Improving Open Language Models at a Practical Size, arXiv, 2408.00118, arxiv, pdf, citation: 2, Gemma Team, Morgane Riviere, Shreya Pathak, Pier Giuseppe Sessa, Cassidy Hardin, Surya Bhupatiraju, Léonard Hussenot, Thomas Mesnard, Bobak Shahriari, Alexandre Ramé
- Gemma 2 Release - a google Collection · (x)
- gemma-1.1-7b-it - google 🤗
- Gemma: Open Models Based on Gemini Research and Technology, arXiv, 2403.08295, arxiv, pdf, citation: -1, Gemma Team, Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love
- Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context, arXiv, 2403.05530, arxiv, pdf, citation: -1, Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry Lepikhin, Timothy Lillicrap, Jean-baptiste Alayrac, Radu Soricut, Angeliki Lazaridou, Orhan Firat, Julian Schrittwieser · (twitter)
- gemma_pytorch - google
The official PyTorch implementation of Google's Gemma models
- gemma-7b - google 🤗 · (ai.google) · (huggingface)
- gemma.cpp - google
A lightweight, standalone C++ inference engine for Google's Gemma models
- gemma-peft - 🤗
- Understanding, Using, and Finetuning Gemma - a Lightning Studio by sebastian · (jiqizhixin)
- gemma-7b-dolly-chatml - philschmid 🤗
- catch-me-if-you-can - cyzgab 🤗
- GemMoE-Beta-1 - Crystalcareai 🤗
- Hebrew-Gemma-11B - yam-peleg 🤗
- Gemma-10M Technical Overview. Motivation | by Akshgarg | May, 2024 | Medium · (gemma-2B-10M - mustafaaljadery)
- bge-multilingual-gemma2 - BAAI 🤗
- OLMo 1.7–7B: A 24 point improvement on MMLU | by AI2 | Apr, 2024 | AI2 Blog
- OLMo-1.7-7B - allenai 🤗
- OLMo: Accelerating the Science of Language Models, arXiv, 2402.00838, arxiv, pdf, citation: -1, Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang · (OLMo - allenai) · (allenai) · (allenai)
- OLMo-7B - allenai 🤗
- OLMo-7B-Instruct - allenai 🤗
- Phi-3-mini-4k-instruct - microsoft 🤗
- Phi-3-vision-128k-instruct - microsoft 🤗
- models - 🤗
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone, arXiv, 2404.14219, arxiv, pdf, citation: -1, Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat Behl
- phi-2 - microsoft 🤗
- phi-1_5 - microsoft 🤗
- phi-1 - microsoft 🤗
- phi-2 - randomblock1 🤗
- phixtral-4x2_8 - mlabonne 🤗
- Mistral-Nemo-Base-2407 - mistralai 🤗
- Mistral-7B-Instruct-v0.3 - mistralai 🤗
- Cheaper, Better, Faster, Stronger | Mistral AI | Frontier AI in your hands
- Mixtral-8x22B-v0.1 - mistralai 🤗
- mixtral-8x22b-instruct-oh - fireworks-ai 🤗
- Mistral-22B-v0.2 - Vezora 🤗
- Mistral-22B-v0.1 - Vezora 🤗
- zephyr-orpo-141b-A35b-v0.1 - HuggingFaceH4 🤗 · (huggingface)
- Mixtral-8x22B-v0.1 - mistral-community 🤗
- hackathon - mistralai-sf24 · (models.mistralcdn) · (jiqizhixin)
- Mistral-7B-Instruct-v0.2 - mistralai 🤗
- mistral-src - mistralai
Reference implementation of the Mistral AI 7B v0.1 model · (jiqizhixin)
- mixtral - 🤗
- llama-mistral - dzhulgakov
Inference code for Mistral and Mixtral hacked up into the original Llama implementation
- DiscoLM-mixtral-8x7b-v2 - DiscoResearch 🤗
- Mixtral-8x7B-Instruct-v0.1 - mistralai 🤗 · (mp.weixin.qq)
- mixtral-7b-8expert - DiscoResearch 🤗 · (huggingface)
- mixtral-8x7b-32kseqlen - someone13574 🤗
- mixtral-46.7b-chat - openskyml 🤗
- Mixtral-8x7B-v0.1-GPTQ - TheBloke 🤗
- MixtralKit - open-compass
A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI
- mistral-playground - marcofrodl 🤗
- Mixtral-8x7B-Instruct-v0.1-bnb-4bit - ybelkada 🤗
- notux-8x7b-v1 - argilla 🤗
- mixtral-offloading - dvmazur
Run Mixtral-8x7B models in Colab or on consumer desktops
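Offloading is needed here because of simple memory arithmetic: even aggressively quantized weights of a ~46.7B-parameter model (the commonly cited total for Mixtral-8x7B) exceed a typical consumer GPU, so inactive experts must live in CPU RAM. A rough sketch of that arithmetic; sizes are estimates, not measurements:

```python
# Rough weight-memory arithmetic for Mixtral-8x7B (~46.7B parameters):
# why full or even 4-bit weights overflow a consumer GPU, motivating
# expert offloading to CPU RAM.
def weight_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

fp16 = weight_gb(46.7e9, 16)  # half precision
q4 = weight_gb(46.7e9, 4)     # 4-bit quantized

print(f"fp16: ~{fp16:.0f} GB, 4-bit: ~{q4:.0f} GB")
# fp16: ~93 GB, 4-bit: ~23 GB -- both beyond a 16 GB card
```

The same function applies to any dense model in the list; activations and KV cache add further overhead on top of the weights.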
- mixtral-test-46.7b-chat - johann22 🤗
- Nous-Hermes-2-Mixtral-8x7B-SFT - NousResearch 🤗 · (jiqizhixin)
- Nous-Hermes-2-Mixtral-8x7B-DPO - NousResearch 🤗
- Nous-Hermes-2-Mixtral-8x7B-DPO-adapter - NousResearch 🤗
- miqu-1-70b - miqudev 🤗
- Hermes-2-Pro-Mistral-7B - NousResearch 🤗
- dolphin-2.8-mistral-7b-v02 - cognitivecomputations 🤗
- dolphin-2.9.1-mixtral-1x22b - cognitivecomputations 🤗
- StripedHyena-Hessian-7B - togethercomputer 🤗
- StripedHyena-Nous-7B - togethercomputer 🤗 · (together)
- BLOOMChat-176B-v1-GPTQ - TheBloke 🤗
- llm-foundry - mosaicml
LLM training code for MosaicML foundation models
- mpt-30b-chat - mosaicml 🤗
- h2oGPT: Democratizing Large Language Models, arXiv, 2306.08161, arxiv, pdf, citation: -1, Arno Candel, Jon McKinney, Philipp Singer, Pascal Pfeiffer, Maximilian Jeblick, Prithvi Prabhu, Jeff Gambera, Mark Landry, Shivam Bansal, Ryan Chesler
- LiteLlama-460M-1T - ahxt 🤗 · (jiqizhixin)
- Llama-2-7b-chat-mlx - mlx-llama 🤗
- TinyLlama - jzhang38
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
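The TinyLlama entry above is a handy anchor for back-of-the-envelope scaling arithmetic. A minimal sketch, assuming the stated 1.1B parameters and 3 trillion tokens, and taking ~20 tokens per parameter as the commonly cited Chinchilla-optimal ratio:

```python
# How far past the Chinchilla-optimal ~20 tokens/parameter is TinyLlama?
def tokens_per_param(n_tokens: float, n_params: float) -> float:
    """Ratio of training tokens to model parameters."""
    return n_tokens / n_params

ratio = tokens_per_param(3e12, 1.1e9)  # 3T tokens, 1.1B params
print(f"TinyLlama: ~{ratio:.0f} tokens/param, "
      f"about {ratio / 20:.0f}x the Chinchilla-optimal ~20")
# TinyLlama: ~2727 tokens/param, about 136x the Chinchilla-optimal ~20
```

Heavy over-training like this trades extra pretraining compute for a small model that is cheap to serve, which is the recurring theme of the small-model entries in this list.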
- llama-recipes - facebookresearch
Examples and recipes for the Llama 2 model · (mp.weixin.qq) · (jiqizhixin) · (mp.weixin.qq) · (d7mv45xi4m.feishu)
- llama2-13b-orca-8k-3319 - OpenAssistant 🤗
- pyllama - juncongmoo
LLaMA: Open and Efficient Foundation Language Models
- llama-gpt - getumbrel
A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device.
- LLongMA-2-13b-16k - conceptofmind 🤗
- LLongMA-2-13b - conceptofmind 🤗
- LLongMA-2-7b-16k - conceptofmind 🤗
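Much of the serving cost of the extended-context variants above (16k, 64k, 262k, and beyond) comes from the KV cache rather than the weights. A rough sketch of the arithmetic, assuming Llama-2-7B-style attention (32 layers, 32 KV heads, head dim 128, fp16 cache, no grouped-query attention):

```python
# KV-cache footprint vs. sequence length: why long-context Llama-2
# variants get memory-hungry even though the weights are unchanged.
def kv_cache_gb(seq_len: int, layers: int = 32, kv_heads: int = 32,
                head_dim: int = 128, bytes_per_val: int = 2) -> float:
    """Keys + values (factor of 2) cached per layer, head, and position."""
    return 2 * layers * kv_heads * head_dim * bytes_per_val * seq_len / 1e9

for seq in (4096, 16384, 65536):
    print(f"{seq:6d} tokens -> ~{kv_cache_gb(seq):.1f} GB of KV cache")
#   4096 tokens -> ~2.1 GB of KV cache
#  16384 tokens -> ~8.6 GB of KV cache
#  65536 tokens -> ~34.4 GB of KV cache
```

This is also why later architectures in this list adopt grouped-query attention (fewer KV heads), which divides the cache size without touching the parameter count.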
- llama2-webui - liltom-eth
Run Llama 2 locally with a Gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Supports Llama-2-7B/13B/70B with 8-bit and 4-bit quantization, GPU inference (from 6 GB VRAM), and CPU inference.
- Flan-Open-Llama-13b - conceptofmind 🤗
- Llama-2 - amitsangani
All the projects related to Llama
- falcon-11B - tiiuae 🤗
- Falcon-LLM - Sentdex
Helper scripts and examples for exploring the Falcon LLM models · (huggingface) · (huggingface)
- Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling, arXiv, 2304.01373 · (pythia - EleutherAI)
- Timeline of recent major LLM releases (past 2 months) : r/LocalLLaMA
- The History of Open-Source LLMs: Imitation and Alignment (Part Three) · (mp.weixin.qq)
- os-llms - blog 🤗
- Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning, arXiv, 2307.02053, arxiv, pdf, citation: 3, Deepanway Ghosal, Yew Ken Chia, Navonil Majumder, Soujanya Poria
- FastChat - lm-sys
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.
- stanford_alpaca - tatsu-lab
Code and documentation to train Stanford's Alpaca models and generate the data · (crfm.stanford)
- dolly - databrickslabs
Databricks' Dolly, a large language model trained on the Databricks Machine Learning Platform · (huggingface) · (databricks)
- YamshadowExperiment28-7B - automerger 🤗 · (twitter)
- Starling-7B: Increasing LLM Helpfulness & Harmlessness with RLAIF · (huggingface) · (huggingface)
- Beagle14-7B - mlabonne 🤗
- Improving Open-Source LLMs - Datasets, Merging and Stacking - The Abacus.AI Blog
- CrystalChat - LLM360 🤗
- btlm-3b-8k-chat - cerebras 🤗
- stablelm-zephyr-3b - stabilityai 🤗 · (huggingface)
- smol-7b - rishiraj 🤗
- LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion, arXiv, 2306.02561, arxiv, pdf, citation: -1, Dongfu Jiang, Xiang Ren, Bill Yuchen Lin · (huggingface) · (LLM-Blender - yuchenlin)
- Intel Neural-Chat 7b: Fine-Tuning on Gaudi2 for Top LLM Performance
- sparse-llama-gsm8k - neuralmagic 🤗
- DeciLM-6b - Deci 🤗
- GOAT-7B-Community - GOAT-AI 🤗
- openchat - imoneoi
OpenChat: Less is More for Open-source Models · (mp.weixin.qq)
- GPT-4-LLM - Instruction-Tuning-with-GPT-4
Instruction Tuning with GPT-4
- Instruction Tuning with GPT-4, arXiv, 2304.03277, arxiv, pdf, citation: 182, Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, Jianfeng Gao · (instruction-tuning-with-gpt-4.github)
- deepseek-coder-7b-instruct - deepseek-ai 🤗
- UltraChat - thunlp
Large-scale, Informative, and Diverse Multi-round Chat Data (and Models) · (mp.weixin.qq) · (qbitai)
- How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources, arXiv, 2306.04751, arxiv, pdf, citation: 40, Yizhong Wang, Hamish Ivison, Pradeep Dasigi, Jack Hessel, Tushar Khot, Khyathi Raghavi Chandu, David Wadden, Kelsey MacMillan, Noah A. Smith, Iz Beltagy · (jiqizhixin) · (open-instruct - allenai)
- DCLM-7B - apple 🤗
- Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model, arXiv, 2311.17487, arxiv, pdf, citation: 11, Yen-Ting Lin, Yun-Nung Chen · (Taiwan-LLM - MiuLab)
- Xmodel-LM Technical Report, arXiv, 2406.02856, arxiv, pdf, citation: -1, Yichuan Wang, Yang Liu, Yu Yan, Xucheng Huang, Ling Jiang · (XmodelLM - XiaoduoAILab)
- MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series, arXiv, 2405.19327, arxiv, pdf, citation: -1, Ge Zhang, Scott Qu, Jiaheng Liu, Chenchen Zhang, Chenghua Lin, Chou Leuang Yu, Danny Pan, Esther Cheng, Jie Liu, Qunshu Lin · (map-neo.github)
- Yuan 2.0-M32: Mixture of Experts with Attention Router, arXiv, 2405.17976, arxiv, pdf, citation: -1, Shaohua Wu, Jiangang Luo, Xi Chen, Lingjun Li, Xudong Zhao, Tong Yu, Chao Wang, Yue Wang, Fei Wang, Weixu Qiao · (huggingface)
- ChuXin: 1.6B Technical Report, arXiv, 2405.04828, arxiv, pdf, citation: -1, Xiaomin Zhuang, Yufan Jiang, Qiaozhi He, Zhihua Wu
- Tele-FLM Technical Report, arXiv, 2404.16645, arxiv, pdf, citation: -1, Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang
- Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model, arXiv, 2404.04167, arxiv, pdf, citation: -1, Xinrun Du, Zhouliang Yu, Songyang Gao, Ding Pan, Yuyang Cheng, Ziyang Ma, Ruibin Yuan, Xingwei Qu, Jiaheng Liu, Tianyu Zheng · (huggingface)
- Mengzi3 - Langboat · (qbitai)
- MiniCPM - OpenBMB
MiniCPM-2.4B: an edge-side LLM that outperforms Llama2-13B · (huggingface)
- Orion-14B: Open-source Multilingual Large Language Models, arXiv, 2401.12246, arxiv, pdf, citation: -1, Du Chen, Yi Huang, Xiaopu Li, Yongqiang Li, Yongqiang Liu, Haihui Pan, Leichao Xu, Dacheng Zhang, Zhipeng Zhang, Kun Han
- Orion - OrionStarAI
Orion-14B is a family of models that includes a 14B foundation LLM and a series of derivative models: a chat model, a long-context model, a quantized model, a RAG fine-tuned model, and an agent fine-tuned model. · (Orion - OrionStarAI)
- TeleChat Technical Report, arXiv, 2401.03804, arxiv, pdf, citation: -1, Zihan Wang, Xinzhang Liu, Shixuan Liu, Yitong Yao, Yuyao Huang, Zhongjiang He, Xuelong Li, Yongxiang Li, Zhonghao Che, Zhaoxi Zhang
- YAYI 2: Multilingual Open-Source Large Language Models, arXiv, 2312.14862, arxiv, pdf, citation: -1, Yin Luo, Qingchao Kong, Nan Xu, Jia Cao, Bao Hao, Baoyu Qu, Bo Chen, Chao Zhu, Chenyang Zhao, Donglei Zhang
- SeaLLMs -- Large Language Models for Southeast Asia, arXiv, 2312.00738, arxiv, pdf, citation: -1, Xuan-Phi Nguyen, Wenxuan Zhang, Xin Li, Mahani Aljunied, Qingyu Tan, Liying Cheng, Guanzheng Chen, Yue Deng, Sen Yang, Chaoqun Liu · (SeaLLMs - DAMO-NLP-SG)
- YUAN 2.0: A Large Language Model with Localized Filtering-based Attention, arXiv, 2311.15786, arxiv, pdf, citation: -1, Shaohua Wu, Xudong Zhao, Shenling Wang, Jiangang Luo, Lingjun Li, Xi Chen, Bing Zhao, Wei Wang, Tong Yu, Rongguo Zhang · (Yuan-2.0 - IEIT-Yuan)
- Ziya2: Data-centric Learning is All LLMs Need, arXiv, 2311.03301, arxiv, pdf, citation: -1, Ruyi Gan, Ziwei Wu, Renliang Sun, Junyu Lu, Xiaojun Wu, Dixiang Zhang, Kunhao Pan, Ping Yang, Qi Yang, Jiaxing Zhang · (huggingface)
- Skywork: A More Open Bilingual Foundation Model, arXiv, 2310.19341, arxiv, pdf, citation: 1, Tianwen Wei, Liang Zhao, Lichang Zhang, Bo Zhu, Lijie Wang, Haihua Yang, Biye Li, Cheng Cheng, Weiwei Lü, Rui Hu · (jiqizhixin) · (qbitai) · (skywork - skyworkai)
- Aquila2 - FlagAI-Open
The official repo of the Aquila2 series proposed by BAAI, including pretrained & chat large language models · (mp.weixin.qq)
- ColossalAI - hpcaitech
Making large AI models cheaper, faster, and more accessible · (qbitai)
- VisCPM - OpenBMB
A series of Chinese-English bilingual multimodal large models based on the CPM foundation model · (jiqizhixin)
- Yi: Open Foundation Models by 01.AI, arXiv, 2403.04652, arxiv, pdf, citation: -1, 01.AI: Alex Young, Bei Chen, Chao Li, Chengen Huang, Ge Zhang, Guanwei Zhang, Heng Li, Jiangcheng Zhu
- Yi-9B - 01-ai 🤗
- Yi - 01-ai
A series of large language models trained from scratch by developers @01-ai · (jiqizhixin)
- InternLM-Math - InternLM · (huggingface)
- internlm2-math-plus-mixtral8x22b - internlm 🤗
- InternLM2 Technical Report, arXiv, 2403.17297, arxiv, pdf, citation: -1, Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu
- InternLM - InternLM
InternLM has open-sourced a 7 billion parameter base model, a chat model tailored for practical scenarios, and the training system. · (qbitai) · (qbitai) · (mp.weixin.qq) · (huggingface)
- internlm2-chat-7b - internlm 🤗
- DeepSeek-V2-Chat-0628 - deepseek-ai 🤗
- DeepSeek-V2 - deepseek-ai · (DeepSeek-V2 - deepseek-ai)
- DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence, arXiv, 2401.14196, arxiv, pdf, citation: -1, Daya Guo, Qihao Zhu, Dejian Yang, Zhenda Xie, Kai Dong, Wentao Zhang, Guanting Chen, Xiao Bi, Y. Wu, Y. K. Li
- DeepSeek-MoE - deepseek-ai · (huggingface)
- DeepSeek-LLM - deepseek-ai
DeepSeek LLM: Let there be answers · (huggingface) · (mp.weixin.qq) · (huggingface) · (jiqizhixin)
- XVERSE-13B - xverse-ai
XVERSE-13B: A multilingual large language model developed by XVERSE Technology Inc. · (qbitai) · (huggingface)
- Qwen2 Technical Report, arXiv, 2407.10671, arxiv, pdf, citation: 13, An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang
- Qwen2 - QwenLM
Qwen2 is the large language model series developed by the Qwen team, Alibaba Cloud.
- Qwen2-72B-Instruct - Qwen 🤗
- Magnum 72B – Provider Status and Load Balancing | OpenRouter
- Qwen1.5-110B-Chat-demo - Qwen 🤗
- Qwen1.5-32B - Qwen 🤗
- Qwen1.5-32B: Fitting the Capstone of the Qwen1.5 Language Model Series | Qwen
- Qwen-Agent - QwenLM
An agent framework and applications built upon Qwen1.5, featuring function calling, a code interpreter, RAG, and a Chrome extension.
- Qwen1.5 - QwenLM
Qwen1.5 is the improved version of Qwen, the large language model series developed by the Qwen team, Alibaba Cloud. · (qwenlm.github)
- Qwen - QwenLM
The official repo of the Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
- Qwen-7B - QwenLM
The official repo of the Qwen-7B (通义千问-7B) chat & pretrained large language model proposed by Alibaba Cloud. · (mp.weixin.qq) · (qbitai)
- Qwen1.5-MoE-A2.7B - Qwen 🤗
- qwen1.5-MoE-A2.7B-Chat-demo - Qwen 🤗
- Qwen-72B-Chat-Demo - Qwen 🤗
- d-Qwen1.5-0.5B - aloobun 🤗
- Arcee-Spark - arcee-ai 🤗
- Baichuan 2: Open Large-scale Language Models, arXiv, 2309.10305, arxiv, pdf, citation: 16, Aiyuan Yang, Bin Xiao, Bingning Wang, Borong Zhang, Ce Bian, Chao Yin, Chenxu Lv, Da Pan, Dian Wang, Dong Yan · (Baichuan2 - baichuan-inc) · (cdn.baichuan-ai) · (mp.weixin.qq) · (jiqizhixin)
- Baichuan-13B - baichuan-inc
A 13B large language model developed by Baichuan Intelligent Technology · (mp.weixin.qq)
- baichuan-7B - baichuan-inc
A large-scale 7B pretraining language model developed by Baichuan-Inc.
- ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools, arXiv, 2406.12793, arxiv, pdf, citation: -1, Team GLM: Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Diego Rojas, Guanyu Feng, Hanlin Zhao
- GLM-4 - THUDM
GLM-4 series: Open Multilingual Multimodal Chat LMs · (huggingface)
- ChatGLM3 - THUDM
ChatGLM3 series: Open Bilingual Chat LLMs · (qbitai)
- ChatGLM2-6B - THUDM
ChatGLM2-6B: An Open Bilingual Chat LLM · (qbitai)
- chatglm.cpp - li-plus
C++ implementation of ChatGLM-6B & ChatGLM2-6B
- TigerBot - TigerResearch
TigerBot: A multi-language multi-task LLM · (qbitai)
- Llama-Chinese - LlamaFamily
Llama Chinese community: Llama3 online demos and fine-tuned models are available, the latest Llama3 learning materials are continuously aggregated, and all code has been updated for Llama3; building the best Chinese Llama models, fully open source and commercially usable
- llama3-Chinese-chat - CrazyBoyM
Llama3 Chinese repository (aggregated resources: various community and vendor fine-tuned or modified weights, plus training, inference, and deployment tutorial videos and docs)
- Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral, arXiv, 2403.01851, arxiv, pdf, citation: -1, Yiming Cui, Xin Yao · (Chinese-Mixtral - ymcui)
- Aurora: Activating Chinese chat capability for Mistral-8x7B sparse Mixture-of-Experts through Instruction-Tuning, arXiv, 2312.14557, arxiv, pdf, citation: -1, Rongsheng Wang, Haoming Chen, Ruizhe Zhou, Yaofei Duan, Kunyan Cai, Han Ma, Jiaxi Cui, Jian Li, Patrick Cheong-Iao Pang, Yapeng Wang · (Aurora - WangRongsheng)
- Taiwan-LLaMa - MiuLab
Traditional Mandarin LLMs for Taiwan
- Chinese-LLaMA-Alpaca-2 - ymcui
Chinese LLaMA-2 & Alpaca-2 large language models
- TransGPT - DUOMO · (jiqizhixin)
- Llama2-Chinese - FlagAlpha
Llama Chinese community: the best Chinese Llama models, fully open source and commercially usable
- Chinese-Llama-2-7b - LinkSoul-AI
The first Chinese LLaMA2 model from the open-source community that can be downloaded and run!
- ChatGLM-Efficient-Tuning - hiyouga
Fine-tuning ChatGLM-6B with PEFT
- BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models, arXiv, 2306.10968, arxiv, pdf, citation: -1, Shaolei Zhang, Qingkai Fang, Zhuocheng Zhang, Zhengrui Ma, Yan Zhou, Langlin Huang, Mengyu Bu, Shangtong Gui, Yunji Chen, Xilin Chen · (jiqizhixin) · (BayLing - ictnlp) · (huggingface)
- FuxiTranyu: A Multilingual Large Language Model Trained with Balanced Data, arXiv, 2408.06273, arxiv, pdf, citation: -1, Haoran Sun, Renren Jin, Shaoyang Xu, Leiyu Pan, Supryadi, Menglong Cui, Jiangcun Du, Yikun Lei, Lei Yang, Ling Shi
- Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
- LLMs-In-China - wgwang
Large language models in China · (mp.weixin.qq)
- SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages, arXiv, 2407.19672, arxiv, pdf, citation: -1, Wenxuan Zhang, Hou Pong Chan, Yiran Zhao, Mahani Aljunied, Jianyu Wang, Chaoqun Liu, Yue Deng, Zhiqiang Hu, Weiwen Xu, Yew Ken Chia · (damo-nlp-sg.github)
- EXAONE 3.0 7.8B Instruction Tuned Language Model, arXiv, 2408.03541, arxiv, pdf, citation: -1, LG AI Research: Soyoung An, Kyunghoon Bae, Eunbi Choi, Stanley Jungkyu Choi, Yemuk Choi, Seokhee Hong, Yeonjung Hong, Junwon Hwang · (huggingface)
- T-lite-instruct-0.1 - AnatoliiPotapov 🤗
- LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages, arXiv, 2407.05975, arxiv, pdf, citation: -1, Yinquan Lu, Wenhao Zhu, Lei Li, Yu Qiao, Fei Yuan · (LLaMAX - CONE-MT)
- Aya 23: Open Weight Releases to Further Multilingual Progress, arXiv, 2405.15032, arxiv, pdf, citation: -1, Viraat Aryabumi, John Dang, Dwarak Talupuru, Saurabh Dash, David Cairuz, Hangyu Lin, Bharat Venkitesh, Madeline Smith, Kelly Marchisio, Sebastian Ruder
- aya-101 - CohereForAI 🤗 · (huggingface) · (cohere)
- SUTRA: Scalable Multilingual Language Model Architecture, arXiv, 2405.06694, arxiv, pdf, citation: -1, Abhijit Bendale, Michael Sapienza, Steven Ripplinger, Simon Gibbs, Jaewon Lee, Pranav Mistry
- SambaLingo: Teaching Large Language Models New Languages, arXiv, 2404.05829, arxiv, pdf, citation: -1, Zoltan Csaki, Bo Li, Jonathan Li, Qiantong Xu, Pian Pawakapan, Leon Zhang, Yun Du, Hengyu Zhao, Changran Hu, Urmish Thakker
- Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order, arXiv, 2404.00399, arxiv, pdf, citation: -1, Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo · (huggingface)
- Sailor: Open Language Models for South-East Asia, arXiv, 2404.03608, arxiv, pdf, citation: -1, Longxu Dou, Qian Liu, Guangtao Zeng, Jia Guo, Jiahui Zhou, Wei Lu, Min Lin · (twitter)
The learning rate can have an even greater impact on the dreaded phenomenon known as catastrophic forgetting.
- HyperCLOVA X Technical Report, arXiv, 2404.01954, arxiv, pdf, citation: -1, Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim
- Nemotron-4 15B Technical Report, arXiv, 2402.16819, arxiv, pdf, citation: -1, Jupinder Parmar, Shrimai Prabhumoye, Joseph Jennings, Mostofa Patwary, Sandeep Subramanian, Dan Su, Chen Zhu, Deepak Narayanan, Aastha Jhunjhunwala, Ayush Dattagupta
- Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model, arXiv, 2402.07827, arxiv, pdf, citation: -1, Ahmet Üstün, Viraat Aryabumi, Zheng-Xin Yong, Wei-Yin Ko, Daniel D'souza, Gbemileke Onilude, Neel Bhandari, Shivalika Singh, Hui-Lee Ooi, Amr Kayid · (hf)
- CroissantLLM: A Truly Bilingual French-English Language Model, arXiv, 2402.00786, arxiv, pdf, citation: -1, Manuel Faysse, Patrick Fernandes, Nuno Guerreiro, António Loison, Duarte Alves, Caio Corro, Nicolas Boizard, João Alves, Ricardo Rei, Pedro Martins
- MaLA-500: Massive Language Adaptation of Large Language Models, arXiv, 2401.13303, arxiv, pdf, citation: -1, Peiqin Lin, Shaoxiong Ji, Jörg Tiedemann, André F. T. Martins, Hinrich Schütze · (huggingface)
- Multilingual Instruction Tuning With Just a Pinch of Multilinguality, arXiv, 2401.01854, arxiv, pdf, citation: -1, Uri Shaham, Jonathan Herzig, Roee Aharoni, Idan Szpektor, Reut Tsarfaty, Matan Eyal
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer, arXiv, 2401.01055, arxiv, pdf, citation: -1, Jun Zhao, Zhihao Zhang, Qi Zhang, Tao Gui, Xuanjing Huang
- FinGPT: Large Generative Models for a Small Language, arXiv, 2311.05640, arxiv, pdf, citation: -1, Risto Luukkonen, Ville Komulainen, Jouni Luoma, Anni Eskelinen, Jenna Kanerva, Hanna-Mari Kupari, Filip Ginter, Veronika Laippala, Niklas Muennighoff, Aleksandra Piktus · (turkunlp)
- NNsight and NDIF: Democratizing Access to Foundation Model Internals, arXiv, 2407.14561, arxiv, pdf, citation: -1, Jaden Fiotto-Kaufman, Alexander R Loftus, Eric Todd, Jannik Brinkmann, Caden Juang, Koyena Pal, Can Rager, Aaron Mueller, Samuel Marks, Arnab Sen Sharma · (nnsight)
- LLMZoo - FreedomIntelligence
⚡LLM Zoo is a project that provides data, models, and an evaluation benchmark for large language models.⚡
- open-llms - eugeneyan
📋 A list of open LLMs available for commercial use.
- List of Open Sourced Fine-Tuned Large Language Models (LLM) | by Sung Kim | Medium
- Awesome-Chinese-LLM - HqWu-HITCS
A curated collection of open-source Chinese LLMs, focusing on smaller models that can be privately deployed at low training cost, covering foundation models, domain-specific fine-tunes and applications, datasets, and tutorials
- self-llm - datawhalechina
"A Guide to Using Open-Source Large Models": quickly deploy open-source LLMs on AutoDL, with beginner-friendly deployment tutorials for users in China