Kithara is a lightweight library offering building blocks and recipes for tuning popular open-source LLMs, including Gemma2 and Llama3, on Google TPUs.
It provides:
- Frictionless scaling: Distributed training abstractions intentionally built with simplicity in mind.
- Multihost training support: Integration with Ray, GCE, and GKE.
- Async, distributed checkpointing: Multi-host and multi-device checkpointing via Orbax.
- Distributed, streamed dataloading: Per-process, streamed data loading via Ray Data (see the sketch after this list).
- GPU/TPU fungibility: The same code works on both GPUs and TPUs out of the box.
- Native integration with HuggingFace: Tune and save models in HuggingFace format.
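To make the streamed-dataloading idea concrete, here is a minimal sketch using plain Ray Data, independent of Kithara's own APIs. The bucket path and column names are illustrative assumptions, not part of Kithara.

```python
# A minimal sketch of streamed data loading with Ray Data; not Kithara's API.
import ray

ray.init()  # connect to an existing Ray cluster, or start a local one

# Read a JSONL dataset lazily; blocks are streamed rather than fully
# materialized in memory.
ds = ray.data.read_json("gs://my-bucket/sft-data/*.jsonl")  # hypothetical path

# Iterate in fixed-size batches. In a multihost job, each training process
# would consume its own shard of the stream instead of the whole dataset.
for batch in ds.iter_batches(batch_size=32, batch_format="pandas"):
    prompts = batch["prompt"]      # assumed column name
    responses = batch["response"]  # assumed column name
    # ... tokenize and feed into the training step ...
```

Kithara wires this kind of streaming pipeline into its training recipes for you; the snippet only shows the underlying pattern.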
New to TPUs?
TPUs offer strong performance, cost-efficiency, and scalability, enabling faster training and larger models and datasets. Check out our onboarding guide to get started with TPUs.
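Before launching a job, it can help to confirm that your runtime actually sees the TPU. A quick sanity check with plain JAX (not a Kithara API) looks like this:

```python
# TPU sanity check with plain JAX; not part of Kithara's API.
import jax

# On a TPU VM this lists TpuDevice entries; on GPU or CPU hosts it lists
# those device types instead.
print(jax.devices())
print("local devices:", jax.local_device_count())
print("global devices:", jax.device_count())
```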
| Resource | Link |
| --- | --- |
| 📚 Documentation | Read Our Docs |
| 💾 Installation | Quick Pip Install |
| ✏️ Get Started | Intro to Kithara |
| 🌟 Supported Models | List of Models |
| 🌐 Supported Datasets | List of Data Formats |
| 🌵 SFT + LoRA Example | SFT + LoRA Example |
| 🌵 Continued Pretraining Example | Continued Pretraining Example |
| ⌛️ Performance Optimizations | Our Memory and Throughput Optimizations |
| 📈 Scaling up | Guide for Tuning Large Models |