Probably a more reasonable method of `packing` #2466

AIR-hl · 2024-12-12T12:05:59Z

Feature request

Hi! Here I am again :). Following #1850, I noticed that the packing method of SFTTrainer simply truncates overly long samples, which obviously destroys a lot of text information.

While browsing the LLaMA-Factory code, I found that they pack in a different, more elegant way. In short, they use a greedy knapsack algorithm to fill each packed sequence as fully as possible, maximizing its utilization.
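
To make the idea concrete, here is a minimal sketch of greedy knapsack-style packing. It is not the LLaMA-Factory or trl implementation; the function name `greedy_knapsack`, its parameters, and the choice to skip samples longer than `capacity` (leaving them for the caller to truncate or split) are assumptions made for illustration.

```python
import bisect


def greedy_knapsack(lengths, capacity):
    """Group sample indices into packs whose total length is <= capacity.

    Hypothetical helper: samples longer than `capacity` are skipped here
    and would need separate handling by the caller.
    """
    # Sort (length, index) pairs ascending so bisect can find the largest fit.
    items = sorted((n, i) for i, n in enumerate(lengths) if n <= capacity)
    packs = []
    while items:
        remaining = capacity
        current = []
        while items:
            # Index of the longest remaining sample that still fits.
            pos = bisect.bisect_right(items, (remaining, float("inf"))) - 1
            if pos < 0:
                break  # nothing left fits in this pack
            n, i = items.pop(pos)
            current.append(i)
            remaining -= n
        packs.append(current)
    return packs


# Example: six tokenized samples packed into sequences of at most 10 tokens.
packs = greedy_knapsack([5, 3, 8, 2, 7, 1], capacity=10)
```

Each pack is a list of sample indices; concatenating those samples gives one training sequence with no sample cut in the middle, at the cost of some padding when a pack cannot be filled exactly.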

Maybe trl could be modified accordingly in the future. What do you think?

Motivation

https://github.com/hiyouga/LLaMA-Factory/blob/main/src/llamafactory/data/processors/supervised.py#L130

qgallouedec · 2024-12-13T17:52:47Z

That's very interesting! It would be a nice improvement.

If you want to tackle this problem, you should be aware that packing will be implemented differently (in a simpler way) in the near future, see #2405. You should branch from there.

qgallouedec added labels on Dec 13, 2024: ✨ enhancement (New feature or request), 🙋 help from community wanted (Open invitation for community members to contribute), 🏋 SFT (Related to SFT), 🧒 good second issue (Good for contributors with basic project familiarity)
