DailyDialogue dataset #13
I don't have the SpaceFusion pre-training code. On the DailyDialog dataset, we keep a history of fixed sequence length. We tried to follow the original paper's setting:
Thanks. Where can I get the DailyDialog dataset you used in run_dialog_spacefusion.sh (../data/datasets/dailydialog_data/train.txt)? Or should I preprocess it myself?
I'm afraid you have to preprocess it on your own.
Sure. Since SpaceFusion doesn't provide any preprocessing code for DailyDialog, what criteria did you use for src and trgt? In other words, what procedure did you use to split the original DailyDialog into src and trgt? Thanks in advance!
Where can I get the preprocessed DailyDialog dataset used with the SpaceFusion pre-training code? Any suggestions on how to preprocess the original DailyDialog would be appreciated. Thanks!
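For anyone preprocessing the data themselves, here is a minimal sketch of one way to turn the original DailyDialog file into src/trgt pairs with a fixed-length history. It is not the maintainers' procedure, which is not given in this thread. It relies on the original `dialogues_text.txt` layout (one dialogue per line, utterances delimited by `__eou__`). Everything else is an assumption made for illustration: the function name `dialogue_to_pairs`, the `" EOS "` turn separator, the whitespace tokenization, and the default of 64 history tokens.

```python
from typing import List, Tuple

EOU = "__eou__"      # utterance delimiter in the original dialogues_text.txt
TURN_SEP = " EOS "   # hypothetical separator between history turns (assumption)


def dialogue_to_pairs(line: str, max_history_tokens: int = 64) -> List[Tuple[str, str]]:
    """Split one dialogue line into (src, trgt) pairs.

    Each utterance after the first becomes a target; the source is the
    preceding history, truncated to its last `max_history_tokens`
    whitespace-separated tokens (an assumed notion of "fixed length").
    """
    utterances = [u.strip() for u in line.split(EOU) if u.strip()]
    pairs = []
    for i in range(1, len(utterances)):
        history_tokens = TURN_SEP.join(utterances[:i]).split()
        src = " ".join(history_tokens[-max_history_tokens:])
        pairs.append((src, utterances[i]))
    return pairs


if __name__ == "__main__":
    example = "Hi . __eou__ Hello ! __eou__ How are you ? __eou__"
    for src, trgt in dialogue_to_pairs(example):
        # Tab-separated output is an assumed format for train.txt.
        print(f"{src}\t{trgt}")
```

Truncating from the left keeps the most recent turns, which are usually the most relevant to the response. The separator, the length limit, and the output format should be adjusted to whatever the training script actually expects.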