Pinned
- RLCHF (Public): RLCHF provides tools to generate diverse persona-based preference data and train/evaluate alignment models using RLHF/DPO workflows. It includes persona generation scripts, example persona datasets…
  Jupyter Notebook
- LLada-Reasoning (Public): LLaDA-Reasoning: A long-context reasoning and diffusion-style model (8K context, dual KV cache) built with an end-to-end pipeline that includes pretraining, supervised fine-tuning (SFT), DPO + GRPO…
  Python
- CadQuery-Code-Generator (Public): This project creates a CadQuery code generator model by fine-tuning vision-language models (InternVL 3).
  Python
- adrsimon/gomagnon (Public): A Go multi-agent ecosystem of prehistoric humans trying to evolve as a society.
  Go 1


