What's New
- 📚 The DJ doc is refactored and improved, e.g., RecipeGallery, DeveloperGuide, DistributedProcess, and DJ-related Competitions, along with fixes for typos and bad links.
- 🔎 More unit tests are added.
- 🎛 The data pre-split and export are improved.
- 🔮 A new data selection method, DaaR, is proposed. See Diversity as a Reward: Fine-Tuning LLMs on a Mixture of Domain-Undetermined Data.
Detailed PRs
- fix export error when export_stats columns is null in #557
- Resplit input dataset in ray mode in #549
- Refactor and improve doc for RecipeGallery, DeveloperGuide, DistributedProcess and DJ-related Competitions in #561
- Resolve most skipped unit-tests in #559
- fix translation error in #562
- Add unittest for ray text dedup in #540
- [Typo]correct a small typo in #563
- update the 2.0 paper link & the DaaR news in #566
- Fix typos in #571
- Optimization for sdxl_prompt2prompt_mapper dependency importing in #570
- Fix typos in #572
Acknowledgment
- @liuyuhanalex and @co63oc made their first PRs
Full Changelog: v1.1.0...v1.2.0