Adding segmenter model option to datagen #3669
Praise: Clean solution
"thaidict".into(),
"Thai_codepoints_exclusive_model4_heavy".into(),
Issue: It's definitely a breaking change to remove cjdict from icu_testdata. I agree it would be nice to get rid of the JSON file but it's more important that it stays in the postcard file we ship. Let's at least split that to its own PR that we can discuss and not hold up this PR on it.
This is the non-shipping testdata. Which reminds me, I should update the other testdata script.
This broke main CI
Fixes #3408
Explicitly removing the cjdict from testdata, as it's a 10MB JSON file.