In the EE-LLM and EE-Tuning papers, we use the JSON Lines (jsonl) data provided by Data-Juicer. You can use tools/preprocess_data.py to convert the data into binary format, as shown in the Megatron-LM README.
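As a rough illustration of the input side, here is a minimal sketch that writes and sanity-checks a JSON Lines file before it is handed to tools/preprocess_data.py. It assumes each record stores its document under a `text` field, which is what the Megatron-LM README describes. The helper names `write_jsonl` and `check_jsonl` are hypothetical and not part of EE-LLM, Data-Juicer, or Megatron-LM.

```python
import json


def write_jsonl(docs, path, key="text"):
    """Write one JSON object per line, storing each document under `key`.

    Assumption: the preprocessing script reads the document from a "text"
    field; change `key` if your data uses a different field name.
    """
    with open(path, "w", encoding="utf-8") as f:
        for doc in docs:
            # json.dumps escapes embedded newlines, so each record stays on one line.
            f.write(json.dumps({key: doc}, ensure_ascii=False) + "\n")


def check_jsonl(path, key="text"):
    """Return the number of valid records; raise ValueError on a bad line."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)  # JSONDecodeError is a ValueError subclass
            if not isinstance(record, dict) or not isinstance(record.get(key), str):
                raise ValueError(f"line {lineno}: missing string field {key!r}")
            count += 1
    return count
```

For example, `write_jsonl(["first document", "second document"], "sample.jsonl")` followed by `check_jsonl("sample.jsonl")` confirms the file has two usable records. The file can then be passed as the input to tools/preprocess_data.py; the tokenizer and output options depend on your Megatron-LM version, so take them from its README rather than from this sketch.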
Describe the solution you'd like
Could you provide a script to preprocess data? Maybe a demo is enough.