Description
RWKV is a 100% RNN language model, and (as of now) the only RNN that can match transformers in quality and scaling while being faster and using less memory.
Info: https://github.com/BlinkDL/ChatRWKV
RWKV is a novel large language model architecture, with the largest model in the family having 14B parameters. In contrast to Transformers with O(n^2) attention, RWKV needs only the state from the previous step to compute the next logits. This makes RWKV very CPU-friendly at large context lengths.
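A minimal sketch of what that means in practice: the model is driven one token at a time and only a fixed-size state is carried between steps, so per-token cost does not grow with context length. `rwkv_step` below is a hypothetical stand-in, not the actual ChatRWKV or rwkv.cpp API.

```python
import numpy as np

N_LAYER, D_MODEL, VOCAB = 24, 1024, 50277    # illustrative sizes only

def rwkv_step(token_id, state):
    """Stand-in for one RWKV forward pass: (token, state) -> (logits, new state)."""
    new_state = state * 0.9                  # the real model updates per-layer state vectors
    logits = np.random.randn(VOCAB)          # the real model computes logits from the state
    return logits, new_state

# The state is a handful of vectors per layer; it never grows with the prompt.
state = np.zeros((N_LAYER * 5, D_MODEL), dtype=np.float32)
for tok in [0, 1, 2, 3]:                     # feed the prompt token by token
    logits, state = rwkv_step(tok, state)
next_tok = int(np.argmax(logits))            # then sample the next token from the logits
```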
Experimental GGML port: https://github.com/saharNooby/rwkv.cpp
The latest "Raven"-series Alpaca-style-tuned RWKV 14B & 7B models are very good.
Online demo: https://huggingface.co/spaces/BlinkDL/Raven-RWKV-7B
Download: https://huggingface.co/BlinkDL/rwkv-4-raven
Edit by @ggerganov:
Adding @BlinkDL's comment below to OP for visibility:
v4 inference: https://github.com/BlinkDL/ChatRWKV/blob/main/RWKV_in_150_lines.py
v5 inference: https://github.com/BlinkDL/ChatRWKV/blob/main/RWKV_v5_demo.py
fast v4 & v5.2 inference: https://github.com/BlinkDL/ChatRWKV/blob/main/rwkv_pip_package/src/rwkv/model.py
v5.2 1.5B demo (great for its size): https://huggingface.co/spaces/BlinkDL/ChatRWKV-gradio
v5.2 1.5B benchmarks: https://twitter.com/BlinkDL_AI/status/1717543614434402661
A few remarks:
- RWKV models have an RNN-style "one" mode and a GPT-style "seq" mode (see the sketch after this list)
- I am actually using exp(-exp(w)) for the time decay
- It seems good to precompute the embedding + emb_layernorm in bf16
- When using fp16, I divide the activations by 2 every 6 layers to avoid overflow
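To make the first two remarks concrete, here is a simplified sketch of the v4-style WKV recurrence in "one" mode, loosely following the RWKV_in_150_lines.py linked above. It is a single simplified channel group in plain NumPy: names (`w_raw`, `u`, `wkv_step`) are illustrative, and the log-space / max-subtraction tricks the real code uses for numerical stability are omitted for clarity.

```python
import numpy as np

d = 8                                       # toy channel dimension
w_raw = np.random.randn(d).astype(np.float32)
decay = np.exp(-np.exp(w_raw))              # the exp(-exp(w)) parameterization:
                                            # guarantees a per-channel decay in (0, 1)
u = np.random.randn(d).astype(np.float32)   # "time_first" bonus for the current token

def wkv_step(k, v, num, den):
    """One RNN-style ("one" mode) step: emit wkv and update the running state."""
    ek = np.exp(k)
    wkv = (num + np.exp(u + k) * v) / (den + np.exp(u + k))   # output for this token
    num = decay * num + ek * v              # carry numerator state to the next step
    den = decay * den + ek                  # carry denominator state to the next step
    return wkv, num, den

# "one" mode: feed tokens sequentially, carrying (num, den) as the recurrent state.
num = np.zeros(d, dtype=np.float32)
den = np.zeros(d, dtype=np.float32)
for _ in range(4):
    k = np.random.randn(d).astype(np.float32)
    v = np.random.randn(d).astype(np.float32)
    out, num, den = wkv_step(k, v, num, den)

# "seq" mode computes the same outputs for a whole prompt at once, GPT-style,
# which is faster for prompt processing but mathematically equivalent.
```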