[Feature Request] Massive raw data → SFT data parallel generation #1414
Labels
Data
Related to camel data processing
enhancement
New feature or request
P0
Task with high level priority
Milestone
Required prerequisites
Motivation
To make initial QA data generation more automatic when the user's source material (PDFs, web page content, etc.) is very long, all data generation methods should accept an input context of more than 1 million tokens.
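One possible way to approach this, offered only as a rough sketch and not as CAMEL's existing API: split the long source into bounded chunks and run generation over the chunks in parallel. Everything named here (`split_into_chunks`, `generate_qa_stub`, `generate_parallel`, and the whitespace-based token approximation) is hypothetical and made up for illustration; a real version would use the model's tokenizer and an actual LLM call in place of the stub.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def split_into_chunks(text: str, max_tokens: int, overlap: int = 0) -> List[str]:
    """Split text into chunks of at most `max_tokens` words.

    Assumption: whitespace-separated words stand in for tokens. A real
    implementation would count tokens with the target model's tokenizer.
    """
    if max_tokens <= 0 or not 0 <= overlap < max_tokens:
        raise ValueError("need max_tokens > 0 and 0 <= overlap < max_tokens")
    words = text.split()
    step = max_tokens - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def generate_qa_stub(chunk: str) -> Dict[str, str]:
    """Hypothetical placeholder for a per-chunk LLM call that produces QA data."""
    return {"question": f"What is discussed in: {chunk[:40]}?", "context": chunk}


def generate_parallel(
    text: str,
    generate: Callable[[str], Dict[str, str]],
    max_tokens: int = 2000,
    overlap: int = 0,
    workers: int = 8,
) -> List[Dict[str, str]]:
    """Run `generate` over every chunk concurrently, preserving chunk order."""
    chunks = split_into_chunks(text, max_tokens, overlap)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(generate, chunks))
```

Threads are used here on the assumption that the per-chunk work is dominated by network-bound model calls; a process pool or an async client might fit better depending on the backend.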
Solution
No response
Alternatives
No response
Additional context
No response