Apriel-H1


Apriel to Apriel-H Transformation

/หˆษ‘ห.pri.ษ™l/

Apriel-H1 inference: a vLLM plugin for the Apriel-H1 family of hybrid reasoning models.
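
The plugin itself has not been released yet (see the note further down). As a rough sketch of how vLLM out-of-tree model plugins are generally wired up, a plugin package typically exposes a registration hook through vLLM's plugin entry point; the package and class names below are hypothetical placeholders, not the actual Apriel-H1 plugin.

# A minimal sketch of how a vLLM out-of-tree model plugin is typically registered.
# The package apriel_h1_vllm and class AprielH1ForCausalLM are hypothetical
# placeholders; the real Apriel-H1 plugin has not been released yet.
from vllm import ModelRegistry

def register():
    # Invoked by vLLM via a "vllm.general_plugins" entry point declared in the
    # plugin package's packaging metadata; it maps an architecture name to a model class.
    from apriel_h1_vllm.modeling import AprielH1ForCausalLM  # hypothetical import
    ModelRegistry.register_model("AprielH1ForCausalLM", AprielH1ForCausalLM)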


📊 Model Overview

Apriel-H1-15b-Thinker-SFT is a 15B-parameter hybrid reasoning model combining Transformer attention and Mamba state space layers for high efficiency and scalability. Derived from Apriel-Nemotron-15B-Thinker through progressive distillation, Apriel-H1 replaces less critical attention layers with linear Mamba blocks, achieving over 2× higher inference throughput in vLLM with minimal loss in reasoning, math, and coding performance.
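
Until the vLLM plugin ships, one plausible way to try the checkpoint is through Hugging Face Transformers. This is a minimal sketch: the model id comes from the citation below, while trust_remote_code and the chat-template usage are assumptions, since hybrid Mamba/attention models often ship custom modeling code with the checkpoint.

# A minimal sketch of loading the released checkpoint with Hugging Face Transformers.
# trust_remote_code=True is an assumption (custom modeling code may be required).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ServiceNow-AI/Apriel-H1-15b-Thinker-SFT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # 15B parameters; bf16 keeps memory manageable
    device_map="auto",
    trust_remote_code=True,       # assumption, see note above
)

# Reasoning-style models are usually driven through their chat template.
messages = [{"role": "user", "content": "Solve: what is 17 * 24?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))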

Key Features

  • Model Size: 15B parameters
  • Context Length: 65K (target; runtime dependent)
  • Languages: English (best)
  • Hybrid Transformerโ€“SSM architecture
  • ~2× throughput improvement over the base Thinker model
  • Retains strong reasoning, math, and coding capabilities
  • Built via efficient distillation; no training from scratch required

Technical report: Apriel-H1 Report

Training stack: Fast-LLM

Efficient and strong among hybrids

Figure: throughput, 1→16K.

All models were evaluated with vLLM server endpoints using FlashInfer (except AI21-Jamba-Reasoning-3B, which used FlashAttention2). The mamba_cache was set to fp32 for NVIDIA-Nemotron-Nano-9B-v2 and AI21-Jamba-Reasoning-3B.

Compared with Thinker: ~2× speedup!
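
For context on how such endpoint measurements are typically taken: a vLLM server exposes an OpenAI-compatible API, which can be queried as sketched below. The host, port, and sampling settings are placeholders, and the served model name assumes the Hugging Face id from the citation.

# A minimal sketch of querying a vLLM OpenAI-compatible server endpoint, as used for
# the throughput measurements above. Host, port, and sampling settings are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # default address for `vllm serve` (assumption)
    api_key="EMPTY",                       # vLLM does not check the key by default
)

response = client.chat.completions.create(
    model="ServiceNow-AI/Apriel-H1-15b-Thinker-SFT",
    messages=[{"role": "user", "content": "Briefly explain hybrid attention/Mamba models."}],
    max_tokens=256,
)
print(response.choices[0].message.content)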

🚧 Stay tuned for the vLLM plugin!

📖 Citation

@misc{apriel_h1_2025,
  title        = {Apriel-H1: Towards Efficient Enterprise Reasoning Models},
  author       = {ServiceNow Language Models Lab},
  archivePrefix = {arXiv},
  eprint        = {2511.02651},
  primaryClass  = {cs.LG},
  url           = {https://arxiv.org/abs/2511.02651},
  note          = {Model available at \url{https://huggingface.co/ServiceNow-AI/Apriel-H1-15b-Thinker-SFT}},
  year          = {2025}
}

🔗 Links
