Awesome opengpt

English

Foundation

  • Hermes 3 - NOUS RESEARCH

    · (nousresearch)

  • falcon-mamba-7b - tiiuae 🤗

  • DCLM-7B - apple 🤗

  • DCLM-1B - TRI-ML 🤗

  • Minitron - a nvidia Collection

    · (huggingface)

  • SmolLM - blazingly fast and remarkably powerful

    · (huggingface) · (huggingface)

    · (huggingface)

  • H2O-Danube3 Technical Report, arXiv, 2407.09276, arxiv, pdf, citation: -1

    Pascal Pfeiffer, Philipp Singer, Yauhen Babakhin, Gabor Fodor, Nischay Dhankhar, Sri Satish Ambati · (huggingface)

  • gemini-nano - wave-on-discord 🤗

  • GEB-1.3B: Open Lightweight Large Language Model, arXiv, 2406.09900, arxiv, pdf, citation: -1

    Jie Wu, Yufeng Zhu, Lei Shen, Xuqing Lu

  • NVIDIA Releases Open Synthetic Data Generation Pipeline for Training Large Language Models | NVIDIA Blog

    · (research.nvidia)

  • Nemotron-4-340B-Instruct - nvidia 🤗

  • WizardLM-2-8x22B - alpindale 🤗

    · (wizardlm.github) · (WizardLM - victorsungo) Star

  • MAP-NEO - multimodal-art-projection Star

  • Snowflake Arctic: The Best LLM for Enterprise AI — Efficiently Intelligent, Truly Open

    · (snowflake-arctic - Snowflake-Labs) Star · (huggingface) · (twitter)

  • OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework, arXiv, 2404.14619, arxiv, pdf, citation: -1

    Sachin Mehta, Mohammad Hossein Sekhavat, Qingqing Cao, Maxwell Horton, Yanzi Jin, Chenfan Sun, Iman Mirzadeh, Mahyar Najibi, Dmitry Belenko, Peter Zatloukal · (huggingface) · (corenet - apple) Star

  • Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models, arXiv, 2404.12387, arxiv, pdf, citation: -1

    Aitor Ormazabal, Che Zheng, Cyprien de Masson d'Autume, Dani Yogatama, Deyu Fu, Donovan Ong, Eric Chen, Eugenie Lamprecht, Hai Pham, Isaac Ong

  • Rho-1: Not All Tokens Are What You Need, arXiv, 2404.07965, arxiv, pdf, citation: -1

    Zhenghao Lin, Zhibin Gou, Yeyun Gong, Xiao Liu, Yelong Shen, Ruochen Xu, Chen Lin, Yujiu Yang, Jian Jiao, Nan Duan · (rho - microsoft) Star

  • MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies, arXiv, 2404.06395, arxiv, pdf, citation: -1

    Shengding Hu, Yuge Tu, Xu Han, Chaoqun He, Ganqu Cui, Xiang Long, Zhi Zheng, Yewei Fang, Yuxiang Huang, Weilin Zhao · (MiniCPM - OpenBMB) Star

  • Stable LM 2 1.6B Technical Report, arXiv, 2402.17834, arxiv, pdf, citation: -1

    Marco Bellagente, Jonathan Tow, Dakota Mahan, Duy Phung, Maksym Zhuravinskyi, Reshinth Adithyan, James Baicoianu, Ben Brooks, Nathan Cooper, Ashish Datta

  • stablelm-2-12b - stabilityai 🤗

  • c4ai-command-r-plus - CohereForAI 🤗

  • JetMoE: Reaching Llama2 Performance with 0.1M Dollars, arXiv, 2404.07413, arxiv, pdf, citation: -1

    Yikang Shen, Zhen Guo, Tianle Cai, Zengyi Qin

  • JetMoE - myshell-ai Star

    Reaching LLaMA2 Performance with 0.1M Dollars · (research.myshell) · (huggingface) · (qbitai)

  • Poro 34B and the Blessing of Multilinguality, arXiv, 2404.01856, arxiv, pdf, citation: -1

    Risto Luukkonen, Jonathan Burdge, Elaine Zosa, Aarne Talman, Ville Komulainen, Väinö Hatanpää, Peter Sarlin, Sampo Pyysalo · (huggingface)

  • MicroLlama - keeeeenw Star

    Micro Llama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget

  • grok-1 - xai-org Star

    Grok open release · (x) · (huggingface) · (qbitai)

    · (huggingface) · (mp.weixin.qq)

    · (x) · (qbitai)

  • c4ai-command-r-v01 - CohereForAI 🤗

    · (txt.cohere)

    · (huggingface)

  • miqu-1-70b - miqudev 🤗

  • H2O-Danube-1.8B Technical Report, arXiv, 2401.16818, arxiv, pdf, citation: -1

    Philipp Singer, Pascal Pfeiffer, Yauhen Babakhin, Maximilian Jeblick, Nischay Dhankhar, Gabor Fodor, Sri Satish Ambati

  • Smaug-72B-v0.1 - abacusai 🤗

  • Smaug-34B-v0.1 - abacusai 🤗

  • bagel-34b-v0.2 - jondurbin 🤗

  • TinyLlama: An Open-Source Small Language Model, arXiv, 2401.02385, arxiv, pdf, citation: -1

    Peiyuan Zhang, Guangtao Zeng, Tianduo Wang, Wei Lu · (TinyLlama - jzhang38) Star

  • TigerBot: An Open Multilingual Multitask LLM, arXiv, 2312.08688, arxiv, pdf, citation: -1

    Ye Chen, Wei Cai, Liangmin Wu, Xiaowei Li, Zhanxuan Xin, Cong Fu

  • DeciLM-7B - Deci 🤗

  • DeciLM-7B-instruct - Deci 🤗

    · (huggingface)

  • LLM360: Towards Fully Transparent Open-Source LLMs, arXiv, 2312.06550, arxiv, pdf, citation: -1

    Zhengzhong Liu, Aurick Qiao, Willie Neiswanger, Hongyi Wang, Bowen Tan, Tianhua Tao, Junbo Li, Yuqi Wang, Suqi Sun, Omkar Pangarkar

  • GPT4All: An Ecosystem of Open Source Compressed Language Models, arXiv, 2311.04931, arxiv, pdf, citation: -1

    Yuvanesh Anand, Zach Nussbaum, Adam Treat, Aaron Miller, Richard Guo, Ben Schmidt, GPT4All Community, Brandon Duderstadt, Andriy Mulyar · (gpt4all - nomic-ai) Star

  • OpenChat: Advancing Open-source Language Models with Mixed-Quality Data, arXiv, 2309.11235, arxiv, pdf, citation: -1

    Guan Wang, Sijie Cheng, Xianyuan Zhan, Xiangang Li, Sen Song, Yang Liu · (openchat - imoneoi) Star · (huggingface) · (openchat)

  • Zephyr: Direct Distillation of LM Alignment, arXiv, 2310.16944, arxiv, pdf, citation: 1

    Lewis Tunstall, Edward Beeching, Nathan Lambert, Nazneen Rajani, Kashif Rasul, Younes Belkada, Shengyi Huang, Leandro von Werra, Clémentine Fourrier, Nathan Habib · (alignment-handbook - huggingface) Star

  • H2O Open Ecosystem for State-of-the-art Large Language Models, arXiv, 2310.13012, arxiv, pdf, citation: -1

    Arno Candel, Jon McKinney, Philipp Singer, Pascal Pfeiffer, Maximilian Jeblick, Chun Ming Lee, Marcos V. Conde

  • BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model, arXiv, 2309.11568, arxiv, pdf, citation: -1

    Nolan Dey, Daria Soboleva, Faisal Al-Khateeb, Bowen Yang, Ribhu Pathria, Hemant Khachane, Shaheer Muhammad, Zhiming Chen, Robert Myers

  • OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch, arXiv, 2309.10706, arxiv, pdf, citation: -1

    Juntao Li, Zecheng Tang, Yuyang Ding, Pinzheng Wang, Pei Guo, Wangjie You, Dan Qiao, Wenliang Chen, Guohong Fu, Qiaoming Zhu · (openba - opennlg) Star

  • XGen-7B Technical Report, arXiv, 2309.03450, arxiv, pdf, citation: 3

    Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause

  • FLM-101B: An Open LLM and How to Train It with $100K Budget, arXiv, 2309.03852, arxiv, pdf, citation: 3

    Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Xuying Meng, Siqi Fan, Peng Han, Jing Li, Li Du, Bowen Qin · (huggingface)

  • adept-inference - persimmon-ai-labs Star

    Inference code for Persimmon-8B

  • WizardLM - nlpxucan Star

    Family of instruction-following LLMs powered by Evol-Instruct: WizardLM, WizardCoder

  • FreeWilly2 - stabilityai 🤗

  • xgen - salesforce Star

    Salesforce open-source LLMs with 8k sequence length.

  • PolyLM: An Open Source Polyglot Large Language Model, arXiv, 2307.06018, arxiv, pdf, citation: 5

    Xiangpeng Wei, Haoran Wei, Huan Lin, Tianhao Li, Pei Zhang, Xingzhang Ren, Mei Li, Yu Wan, Zhiwei Cao, Binbin Xie

  • A Technical Report for Polyglot-Ko: Open-Source Large-Scale Korean Language Models, arXiv, 2306.02254, arxiv, pdf, citation: -1

    Hyunwoong Ko, Kichang Yang, Minho Ryu, Taekyoon Choi, Seungmu Yang, Jiwung Hyun, Sungho Park, Kyubyong Park

  • Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models, arXiv, 2308.14149, arxiv, pdf, citation: -1

    Kaiyuan Gao, Sunan He, Zhenyu He, Jiacheng Lin, QiZhi Pei, Jie Shao, Wei Zhang · (gpt_alternatives - GPT-Alternatives) Star · (jiqizhixin)

Llama 3

Jamba

  • Jamba: A Hybrid Transformer-Mamba Language Model, arXiv, 2403.19887, arxiv, pdf, citation: -1

    Opher Lieber, Barak Lenz, Hofit Bata, Gal Cohen, Jhonathan Osin, Itay Dalmedigos, Erez Safahi, Shaked Meirom, Yonatan Belinkov, Shai Shalev-Shwartz · (huggingface)

  • Jamba-v0.1-chat-multilingual - lightblue 🤗

  • Jambatypus-v0.1 - mlabonne 🤗

Databricks

Gemma

OLMo

Phi

Mistral

StripedHyena-7B

BLOOM

Mosaic pretrained transformers (MPT)

  • llm-foundry - mosaicml

    LLM training code for MosaicML foundation models

h2oGPT

  • h2oGPT: Democratizing Large Language Models, arXiv, 2306.08161, arxiv, pdf, citation: -1

    Arno Candel, Jon McKinney, Philipp Singer, Pascal Pfeiffer, Maximilian Jeblick, Prithvi Prabhu, Jeff Gambera, Mark Landry, Shivam Bansal, Ryan Chesler

LLaMA

Falcon

Pythia

  • Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling, arXiv, 2304.01373

    · (pythia - EleutherAI, https://github.com/EleutherAI/pythia) Star

Other

Finetuning

Vicuna

  • Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning, arXiv, 2307.02053, arxiv, pdf, citation: 3

    Deepanway Ghosal, Yew Ken Chia, Navonil Majumder, Soujanya Poria

  • FastChat - lm-sys Star

    An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.

Alpaca

Dolly

  • dolly - databrickslabs Star

    Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform · (huggingface) · (databricks)

Misc

Multilingual (Chinese)

Foundation

  • DCLM-7B - apple 🤗

  • Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model, arXiv, 2311.17487, arxiv, pdf, citation: 11

    Yen-Ting Lin, Yun-Nung Chen · (Taiwan-LLM - MiuLab) Star

  • Xmodel-LM Technical Report, arXiv, 2406.02856, arxiv, pdf, citation: -1

    Yichuan Wang, Yang Liu, Yu Yan, Xucheng Huang, Ling Jiang

    · (XmodelLM - XiaoduoAILab) Star

  • MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series, arXiv, 2405.19327, arxiv, pdf, citation: -1

    Ge Zhang, Scott Qu, Jiaheng Liu, Chenchen Zhang, Chenghua Lin, Chou Leuang Yu, Danny Pan, Esther Cheng, Jie Liu, Qunshu Lin · (map-neo.github)

  • Yuan 2.0-M32: Mixture of Experts with Attention Router, arXiv, 2405.17976, arxiv, pdf, citation: -1

    Shaohua Wu, Jiangang Luo, Xi Chen, Lingjun Li, Xudong Zhao, Tong Yu, Chao Wang, Yue Wang, Fei Wang, Weixu Qiao

    · (huggingface)

  • ChuXin: 1.6B Technical Report, arXiv, 2405.04828, arxiv, pdf, citation: -1

    Xiaomin Zhuang, Yufan Jiang, Qiaozhi He, Zhihua Wu

  • Tele-FLM Technical Report, arXiv, 2404.16645, arxiv, pdf, citation: -1

    Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang

  • Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model, arXiv, 2404.04167, arxiv, pdf, citation: -1

    Xinrun Du, Zhouliang Yu, Songyang Gao, Ding Pan, Yuyang Cheng, Ziyang Ma, Ruibin Yuan, Xingwei Qu, Jiaheng Liu, Tianyu Zheng · (huggingface)

  • Mengzi3 - Langboat Star

    · (qbitai)

  • MiniCPM - OpenBMB Star

    MiniCPM-2.4B: An end-side LLM that outperforms Llama2-13B.

    · (huggingface)

    · (shengdinghu.notion)

  • iFlytekSpark-13B: iFlytek Spark open-source 13B model (iFlytekSpark-13B)

  • Orion-14B: Open-source Multilingual Large Language Models, arXiv, 2401.12246, arxiv, pdf, citation: -1

    Du Chen, Yi Huang, Xiaopu Li, Yongqiang Li, Yongqiang Liu, Haihui Pan, Leichao Xu, Dacheng Zhang, Zhipeng Zhang, Kun Han

  • Orion - OrionStarAI Star

    Orion-14B is a family of models that includes a 14B-parameter multilingual foundation LLM and a series of derived models: a chat model, a long-context model, a quantized model, a RAG fine-tuned model, and an Agent fine-tuned model. · (Orion - OrionStarAI) Star

  • TeleChat Technical Report, arXiv, 2401.03804, arxiv, pdf, citation: -1

    Zihan Wang, Xinzhang Liu, Shixuan Liu, Yitong Yao, Yuyao Huang, Zhongjiang He, Xuelong Li, Yongxiang Li, Zhonghao Che, Zhaoxi Zhang

  • YAYI 2: Multilingual Open-Source Large Language Models, arXiv, 2312.14862, arxiv, pdf, citation: -1

    Yin Luo, Qingchao Kong, Nan Xu, Jia Cao, Bao Hao, Baoyu Qu, Bo Chen, Chao Zhu, Chenyang Zhao, Donglei Zhang

  • SeaLLMs -- Large Language Models for Southeast Asia, arXiv, 2312.00738, arxiv, pdf, citation: -1

    Xuan-Phi Nguyen, Wenxuan Zhang, Xin Li, Mahani Aljunied, Qingyu Tan, Liying Cheng, Guanzheng Chen, Yue Deng, Sen Yang, Chaoqun Liu

    · (SeaLLMs - DAMO-NLP-SG) Star

  • YUAN 2.0: A Large Language Model with Localized Filtering-based Attention, arXiv, 2311.15786, arxiv, pdf, citation: -1

    Shaohua Wu, Xudong Zhao, Shenling Wang, Jiangang Luo, Lingjun Li, Xi Chen, Bing Zhao, Wei Wang, Tong Yu, Rongguo Zhang · (Yuan-2.0 - IEIT-Yuan) Star

  • Ziya2: Data-centric Learning is All LLMs Need, arXiv, 2311.03301, arxiv, pdf, citation: -1

    Ruyi Gan, Ziwei Wu, Renliang Sun, Junyu Lu, Xiaojun Wu, Dixiang Zhang, Kunhao Pan, Ping Yang, Qi Yang, Jiaxing Zhang

    · (huggingface)

  • Skywork: A More Open Bilingual Foundation Model, arXiv, 2310.19341, arxiv, pdf, citation: 1

    Tianwen Wei, Liang Zhao, Lichang Zhang, Bo Zhu, Lijie Wang, Haihua Yang, Biye Li, Cheng Cheng, Weiwei Lü, Rui Hu · (jiqizhixin) · (qbitai) · (skywork - skyworkai) Star

  • Aquila2 - FlagAI-Open Star

    The official repo of Aquila2 series proposed by BAAI, including pretrained & chat large language models. · (mp.weixin.qq)

  • ColossalAI - hpcaitech Star

    Making large AI models cheaper, faster and more accessible · (qbitai)

  • VisCPM - OpenBMB Star

    A series of Chinese-English bilingual multimodal large models based on the CPM foundation models · (jiqizhixin)

Yi-01

  • Yi-1.5 (2024/05) - a 01-ai Collection

  • Yi: Open Foundation Models by 01.AI, arXiv, 2403.04652, arxiv, pdf, citation: -1

    01.AI, Alex Young, Bei Chen, Chao Li, Chengen Huang, Ge Zhang, Guanwei Zhang, Heng Li, Jiangcheng Zhu

  • Yi-9B - 01-ai 🤗

  • Yi - 01-ai Star

    A series of large language models trained from scratch by developers @01-ai

    · (jiqizhixin)

InternLM

DeepSeek

Xverse

Qwen

Baichuan

  • Baichuan 2: Open Large-scale Language Models, arXiv, 2309.10305, arxiv, pdf, citation: 16

    Aiyuan Yang, Bin Xiao, Bingning Wang, Borong Zhang, Ce Bian, Chao Yin, Chenxu Lv, Da Pan, Dian Wang, Dong Yan · (Baichuan2 - baichuan-inc) Star · (cdn.baichuan-ai) · (mp.weixin.qq) · (jiqizhixin)

  • Baichuan-13B - baichuan-inc Star

    A 13B large language model developed by Baichuan Intelligent Technology · (mp.weixin.qq)

  • baichuan-7B - baichuan-inc Star

    A large-scale 7B pretraining language model developed by BaiChuan-Inc.

ChatGLM

  • ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools, arXiv, 2406.12793, arxiv, pdf, citation: -1

    Team GLM, Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Diego Rojas, Guanyu Feng, Hanlin Zhao

  • GLM-4 - THUDM Star

    GLM-4 series: Open Multilingual Multimodal Chat LMs · (huggingface)

  • ChatGLM3 - THUDM Star

    ChatGLM3 series: Open Bilingual Chat LLMs · (qbitai)

  • ChatGLM2-6B - THUDM Star

    ChatGLM2-6B: An Open Bilingual Chat LLM · (qbitai)

  • chatglm.cpp - li-plus Star

    C++ implementation of ChatGLM-6B & ChatGLM2-6B

  • TigerBot - TigerResearch Star

    TigerBot: A multi-language multi-task LLM · (qbitai)

Finetuning

  • Llama-Chinese - LlamaFamily Star

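    Llama Chinese community: the Llama 3 online demo and fine-tuned models are now available, the latest Llama 3 learning materials are collected in real time, and all code has been updated for Llama 3. Aims to build the best Chinese Llama LLM; fully open source and commercially usable.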

  • llama3-Chinese-chat - CrazyBoyM Star

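    Llama 3 Chinese repository (aggregated resources: interesting weights from versions fine-tuned or modified by community members and vendors, plus tutorial videos and docs on training, inference, and deployment)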

  • Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral, arXiv, 2403.01851, arxiv, pdf, citation: -1

    Yiming Cui, Xin Yao · (Chinese-Mixtral - ymcui) Star

  • Aurora: Activating Chinese chat capability for Mistral-8x7B sparse Mixture-of-Experts through Instruction-Tuning, arXiv, 2312.14557, arxiv, pdf, citation: -1

    Rongsheng Wang, Haoming Chen, Ruizhe Zhou, Yaofei Duan, Kunyan Cai, Han Ma, Jiaxi Cui, Jian Li, Patrick Cheong-Iao Pang, Yapeng Wang

    · (Aurora - WangRongsheng) Star

  • Taiwan-LLaMa - MiuLab Star

    Traditional Mandarin LLMs for Taiwan

  • Chinese-LLaMA-Alpaca-2 - ymcui Star

    Chinese LLaMA-2 & Alpaca-2 LLMs

  • TransGPT - DUOMO Star

    · (jiqizhixin)

  • Llama2-Chinese - FlagAlpha Star

    Llama Chinese community: the best Chinese Llama LLM, fully open source and commercially usable

  • Chinese-Llama-2-7b - LinkSoul-AI Star

    The first downloadable and runnable Chinese LLaMA2 model in the open-source community!

  • ChatGLM-Efficient-Tuning - hiyouga Star

    Efficient fine-tuning of ChatGLM-6B with PEFT

  • BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models, arXiv, 2306.10968, arxiv, pdf, citation: -1

    Shaolei Zhang, Qingkai Fang, Zhuocheng Zhang, Zhengrui Ma, Yan Zhou, Langlin Huang, Mengyu Bu, Shangtong Gui, Yunji Chen, Xilin Chen · (jiqizhixin) · (BayLing - ictnlp) Star · (huggingface)

Other

Extra

  • SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages, arXiv, 2407.19672, arxiv, pdf, citation: -1

    Wenxuan Zhang, Hou Pong Chan, Yiran Zhao, Mahani Aljunied, Jianyu Wang, Chaoqun Liu, Yue Deng, Zhiqiang Hu, Weiwen Xu, Yew Ken Chia · (damo-nlp-sg.github)

  • EXAONE 3.0 7.8B Instruction Tuned Language Model, arXiv, 2408.03541, arxiv, pdf, citation: -1

    LG AI Research, Soyoung An, Kyunghoon Bae, Eunbi Choi, Stanley Jungkyu Choi, Yemuk Choi, Seokhee Hong, Yeonjung Hong, Junwon Hwang · (huggingface)

  • T-lite-instruct-0.1 - AnatoliiPotapov 🤗

  • LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages, arXiv, 2407.05975, arxiv, pdf, citation: -1

    Yinquan Lu, Wenhao Zhu, Lei Li, Yu Qiao, Fei Yuan

    · (LLaMAX - CONE-MT) Star

  • Aya 23: Open Weight Releases to Further Multilingual Progress, arXiv, 2405.15032, arxiv, pdf, citation: -1

    Viraat Aryabumi, John Dang, Dwarak Talupuru, Saurabh Dash, David Cairuz, Hangyu Lin, Bharat Venkitesh, Madeline Smith, Kelly Marchisio, Sebastian Ruder

  • aya-101 - CohereForAI 🤗

    · (huggingface) · (cohere)

  • SUTRA: Scalable Multilingual Language Model Architecture, arXiv, 2405.06694, arxiv, pdf, citation: -1

    Abhijit Bendale, Michael Sapienza, Steven Ripplinger, Simon Gibbs, Jaewon Lee, Pranav Mistry

  • SambaLingo: Teaching Large Language Models New Languages, arXiv, 2404.05829, arxiv, pdf, citation: -1

    Zoltan Csaki, Bo Li, Jonathan Li, Qiantong Xu, Pian Pawakapan, Leon Zhang, Yun Du, Hengyu Zhao, Changran Hu, Urmish Thakker

  • Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order, arXiv, 2404.00399, arxiv, pdf, citation: -1

    Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo · (huggingface)

  • Sailor: Open Language Models for South-East Asia, arXiv, 2404.03608, arxiv, pdf, citation: -1

    Longxu Dou, Qian Liu, Guangtao Zeng, Jia Guo, Jiahui Zhou, Wei Lu, Min Lin · (twitter)

    • learning rate can have an even greater impact on the dreaded phenomenon known as catastrophic forgetting?

  • HyperCLOVA X Technical Report, arXiv, 2404.01954, arxiv, pdf, citation: -1

    Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim

  • Nemotron-4 15B Technical Report, arXiv, 2402.16819, arxiv, pdf, citation: -1

    Jupinder Parmar, Shrimai Prabhumoye, Joseph Jennings, Mostofa Patwary, Sandeep Subramanian, Dan Su, Chen Zhu, Deepak Narayanan, Aastha Jhunjhunwala, Ayush Dattagupta

  • Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model, arXiv, 2402.07827, arxiv, pdf, citation: -1

    Ahmet Üstün, Viraat Aryabumi, Zheng-Xin Yong, Wei-Yin Ko, Daniel D'souza, Gbemileke Onilude, Neel Bhandari, Shivalika Singh, Hui-Lee Ooi, Amr Kayid · (hf)

  • CroissantLLM: A Truly Bilingual French-English Language Model, arXiv, 2402.00786, arxiv, pdf, citation: -1

    Manuel Faysse, Patrick Fernandes, Nuno Guerreiro, António Loison, Duarte Alves, Caio Corro, Nicolas Boizard, João Alves, Ricardo Rei, Pedro Martins

  • MaLA-500: Massive Language Adaptation of Large Language Models, arXiv, 2401.13303, arxiv, pdf, citation: -1

    Peiqin Lin, Shaoxiong Ji, Jörg Tiedemann, André F. T. Martins, Hinrich Schütze · (huggingface)

  • Multilingual Instruction Tuning With Just a Pinch of Multilinguality, arXiv, 2401.01854, arxiv, pdf, citation: -1

    Uri Shaham, Jonathan Herzig, Roee Aharoni, Idan Szpektor, Reut Tsarfaty, Matan Eyal

  • LLaMA Beyond English: An Empirical Study on Language Capability Transfer, arXiv, 2401.01055, arxiv, pdf, citation: -1

    Jun Zhao, Zhihao Zhang, Qi Zhang, Tao Gui, Xuanjing Huang

  • 2023, year of open LLMs

  • FinGPT: Large Generative Models for a Small Language, arXiv, 2311.05640, arxiv, pdf, citation: -1

    Risto Luukkonen, Ville Komulainen, Jouni Luoma, Anni Eskelinen, Jenna Kanerva, Hanna-Mari Kupari, Filip Ginter, Veronika Laippala, Niklas Muennighoff, Aleksandra Piktus · (turkunlp)

Toolkits

  • NNsight and NDIF: Democratizing Access to Foundation Model Internals, arXiv, 2407.14561, arxiv, pdf, citation: -1

    Jaden Fiotto-Kaufman, Alexander R Loftus, Eric Todd, Jannik Brinkmann, Caden Juang, Koyena Pal, Can Rager, Aaron Mueller, Samuel Marks, Arnab Sen Sharma · (nnsight)

  • LLMZoo - FreedomIntelligence Star

    ⚡LLM Zoo is a project that provides data, models, and evaluation benchmarks for large language models.⚡

Products

Extra reference