Cleaners and to-replace should also be dataset-specific #359
Comments
It seems that we don't have
I think that's fine. It's a bit advanced, and there isn't an obvious way (to me) to create the interaction in the wizard. I think it's alright if we just document it in the docs and tell people to adjust the configuration file if necessary.
It may confuse the user to set
Good idea!
this should be fixed at a later date by fixing #359
I still think we should have cleaners defined on the `everyvoice.config.text_config.TextConfig`, but we should rename them to `global_cleaners` and `global_to_replace`. There are some cleaners/to_replace rules that only apply to certain datasets, and those should be defined on `everyvoice.config.preprocessing_config.Dataset`.

In addition to adding the cleaners here, we also need to:
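For illustration, here is a rough sketch of how the global/dataset-specific split described above could behave. The classes below are simplified, hypothetical stand-ins written with plain dataclasses, not the actual EveryVoice config models. The `cleaners` and `to_replace` field names on `Dataset`, the `normalize` helper, the choice to treat `to_replace` keys as regular expressions, and the order of application (global rules first, then dataset rules) are all assumptions made for this example.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Cleaner = Callable[[str], str]


@dataclass
class TextConfig:
    """Hypothetical stand-in: rules that apply to every dataset."""
    global_cleaners: List[Cleaner] = field(default_factory=list)
    global_to_replace: Dict[str, str] = field(default_factory=dict)


@dataclass
class Dataset:
    """Hypothetical stand-in: rules that apply to one dataset only."""
    label: str
    cleaners: List[Cleaner] = field(default_factory=list)
    to_replace: Dict[str, str] = field(default_factory=dict)


def apply_rules(text: str, to_replace: Dict[str, str], cleaners: List[Cleaner]) -> str:
    # Assumption: replacements run before cleaners, and keys are regex patterns.
    for pattern, replacement in to_replace.items():
        text = re.sub(pattern, replacement, text)
    for cleaner in cleaners:
        text = cleaner(text)
    return text


def normalize(text: str, text_config: TextConfig, dataset: Dataset) -> str:
    # Assumption: global rules run first, then the dataset-specific ones.
    text = apply_rules(text, text_config.global_to_replace, text_config.global_cleaners)
    return apply_rules(text, dataset.to_replace, dataset.cleaners)


def collapse_whitespace(text: str) -> str:
    return re.sub(r"\s+", " ", text)


text_config = TextConfig(
    global_cleaners=[str.lower, collapse_whitespace],
    global_to_replace={"’": "'"},
)
# Only this dataset contains [noise] markers, so the rule lives on the dataset.
noisy = Dataset(label="noisy_corpus", to_replace={r"\[noise\]\s*": ""}, cleaners=[str.strip])
clean = Dataset(label="clean_corpus")

print(normalize("Hello  [NOISE] World’s ", text_config, noisy))
print(normalize("Hello  [NOISE] World’s ", text_config, clean))
```

With this layout, a dataset that defines no rules of its own still gets the global ones, and a rule such as the `[noise]` removal does not leak into other datasets.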