Issues: allenai/dolma
Issues list
Is this a bug? ‘if not LANGDETECT_AVAILABLE:’ in LinguaTagger
#220
opened Nov 29, 2024 by
Nevermetyou65
[Dolma Tutorial (https://allenai.github.io/docs): 707NotFound]
#192
opened Aug 27, 2024 by
yushengsu-thu
Is there explicitly instruction-following data in the version of Dolma used to train Olmo v1?
#177
opened Jul 15, 2024 by
john-hewitt
Data out of bounds when using ‘dolma tokens --dtype uint32’
#142
opened Mar 25, 2024 by
Jackwaterveg
Support providing streams into mixer via CLI
enhancement
#130
opened Feb 29, 2024 by
soldni
not_alphanum_paragraph_v1 tagger takes forever to run on certain inputs.
#123
opened Feb 15, 2024 by
peterbjorgensen