Expanding Malayalam normalization rules. Add Unit test #10
PR Type: Enhancement
PR Summary: This pull request introduces enhancements to the Malayalam normalization rules within the Whisper_normalizer project. It updates the README.md to reflect these changes and the expansion of support for Indic languages, specifically focusing on Malayalam. Additionally, it adds a new unit test file, ml_test.py, to ensure the correctness of the Malayalam normalization process. The changes in the whisper_normalizer/indic_normalizer.py file include a specific rule to improve the normalization of Malayalam text.
Decision: Comment
📝 Type: 'Enhancement' - not supported yet.
- Sourcery currently only approves 'Typo fix' PRs.
✅ Issue addressed: this change correctly addresses the issue or implements the desired feature.
No details provided.
📝 Complexity: the changes are too large or complex for Sourcery to approve.
- Unsupported files: the diff contains files that Sourcery does not currently support during reviews.
General suggestions:
- Consider providing more detailed examples in the README.md to showcase the breadth of normalization changes and their impact on various Malayalam text scenarios.
- It might be beneficial to include a brief explanation or comments within the code, especially for the new normalization rule added, to help future contributors understand its purpose and functionality quickly.
- Given the unit tests have identified failures, it would be prudent to address these before merging. If these failures are expected or known limitations, documenting them would help set the right expectations.
- For the unit tests, adding more comprehensive test cases covering a wider range of Malayalam text variations could further solidify the robustness of the normalization process.
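As an illustration of the last two suggestions, here is a minimal sketch of what a commented normalization rule with broader test cases could look like. It is not the rule added in this PR and does not use the project's API: `normalize_malayalam` and `LEGACY_CHILLU` are hypothetical names, and the rule shown (rewriting legacy chillu sequences of consonant + virama + ZWJ as atomic chillu code points) is only an assumed example of a Malayalam normalization step.

```python
# Hypothetical rule, for illustration only: map legacy chillu sequences
# (consonant + virama U+0D4D + zero-width joiner U+200D) to the atomic
# chillu code points, so both encodings of a word compare as equal.
LEGACY_CHILLU = {
    "\u0d23\u0d4d\u200d": "\u0d7a",  # NNA + virama + ZWJ -> CHILLU NN
    "\u0d28\u0d4d\u200d": "\u0d7b",  # NA  + virama + ZWJ -> CHILLU N
    "\u0d30\u0d4d\u200d": "\u0d7c",  # RA  + virama + ZWJ -> CHILLU RR
    "\u0d32\u0d4d\u200d": "\u0d7d",  # LA  + virama + ZWJ -> CHILLU L
    "\u0d33\u0d4d\u200d": "\u0d7e",  # LLA + virama + ZWJ -> CHILLU LL
}


def normalize_malayalam(text: str) -> str:
    """Replace every legacy chillu sequence with its atomic form."""
    for legacy, atomic in LEGACY_CHILLU.items():
        text = text.replace(legacy, atomic)
    return text


def test_legacy_chillu_becomes_atomic():
    # "avan" written with a legacy chillu vs. the atomic chillu.
    legacy = "\u0d05\u0d35\u0d28\u0d4d\u200d"
    atomic = "\u0d05\u0d35\u0d7b"
    assert normalize_malayalam(legacy) == atomic


def test_atomic_chillu_is_unchanged():
    atomic = "\u0d05\u0d35\u0d7b"
    assert normalize_malayalam(atomic) == atomic


def test_plain_conjunct_is_unchanged():
    # Consonant + virama without a ZWJ is an ordinary conjunct marker
    # and must not be rewritten by this rule.
    text = "\u0d28\u0d4d\u0d28"
    assert normalize_malayalam(text) == text
```

Written as plain functions with `assert`, the cases can be collected by a runner such as pytest. Pairing each positive case with cases that must stay unchanged helps document the known limits of a rule.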
Thank you, @kavyamanohar, for your PR. I will merge it soon.