Expanding Malayalam normalization rules. Add Unit test #10

Merged
merged 6 commits into from
Feb 10, 2024


kavyamanohar
Collaborator

No description provided.

@sourcery-ai sourcery-ai bot left a comment

PR Type: Enhancement

PR Summary: This pull request introduces enhancements to the Malayalam normalization rules within the Whisper_normalizer project. It updates the README.md to reflect these changes and the expansion of support for Indic languages, specifically focusing on Malayalam. Additionally, it adds a new unit test file, ml_test.py, to ensure the correctness of the Malayalam normalization process. The changes in the whisper_normalizer/indic_normalizer.py file include a specific rule to improve the normalization of Malayalam text.

Decision: Comment

📝 Type: 'Enhancement' - not supported yet.
  • Sourcery currently only approves 'Typo fix' PRs.
✅ Issue addressed: this change correctly addresses the issue or implements the desired feature.
No details provided.
📝 Complexity: the changes are too large or complex for Sourcery to approve.
  • Unsupported files: the diff contains files that Sourcery does not currently support during reviews.

General suggestions:

  • Consider providing more detailed examples in the README.md to showcase the breadth of normalization changes and their impact on various Malayalam text scenarios.
  • It might be beneficial to include a brief explanation or comments within the code, especially for the new normalization rule added, to help future contributors understand its purpose and functionality quickly.
  • Given the unit tests have identified failures, it would be prudent to address these before merging. If these failures are expected or known limitations, documenting them would help set the right expectations.
  • For the unit tests, adding more comprehensive test cases covering a wider range of Malayalam text variations could further solidify the robustness of the normalization process.
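
The last suggestion, broader test coverage, can be illustrated with a small self-contained sketch. It does not reproduce the rule added in this PR (the diff is not shown here) and does not call the whisper_normalizer API. Instead it assumes a hypothetical helper, `normalize_chillus`, that rewrites legacy Malayalam chillu sequences (consonant + virama + zero-width joiner) to the atomic chillu code points, and shows the kind of cases a test file such as ml_test.py might cover.

```python
import unittest

# Hypothetical example rule, not the rule from this PR.
VIRAMA = "\u0d4d"  # MALAYALAM SIGN VIRAMA
ZWJ = "\u200d"     # ZERO WIDTH JOINER

# Legacy chillu sequence (consonant + virama + ZWJ) -> atomic chillu letter.
CHILLU_MAP = {
    "\u0d23" + VIRAMA + ZWJ: "\u0d7a",  # NNA -> CHILLU NN
    "\u0d28" + VIRAMA + ZWJ: "\u0d7b",  # NA  -> CHILLU N
    "\u0d30" + VIRAMA + ZWJ: "\u0d7c",  # RA  -> CHILLU RR
    "\u0d32" + VIRAMA + ZWJ: "\u0d7d",  # LA  -> CHILLU L
    "\u0d33" + VIRAMA + ZWJ: "\u0d7e",  # LLA -> CHILLU LL
}


def normalize_chillus(text: str) -> str:
    """Replace every legacy chillu sequence with its atomic code point."""
    for legacy, atomic in CHILLU_MAP.items():
        text = text.replace(legacy, atomic)
    return text


class ChilluNormalizationTest(unittest.TestCase):
    def test_legacy_sequence_becomes_atomic(self):
        # A + VA + (NA, virama, ZWJ) should end in CHILLU N.
        legacy = "\u0d05\u0d35\u0d28" + VIRAMA + ZWJ
        self.assertEqual(normalize_chillus(legacy), "\u0d05\u0d35\u0d7b")

    def test_atomic_chillu_is_unchanged(self):
        atomic = "\u0d05\u0d35\u0d7b"
        self.assertEqual(normalize_chillus(atomic), atomic)

    def test_virama_without_zwj_is_unchanged(self):
        # A plain consonant + virama is not a chillu and must be left alone.
        plain = "\u0d28" + VIRAMA
        self.assertEqual(normalize_chillus(plain), plain)

    def test_all_mapped_sequences(self):
        for legacy, atomic in CHILLU_MAP.items():
            with self.subTest(atomic=atomic):
                self.assertEqual(normalize_chillus(legacy), atomic)
```

Saved as a module, the cases can be run with `python -m unittest <module_name>`. Covering the already-normalized and near-miss inputs alongside the positive case is what guards against a rule that rewrites too much.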

@kurianbenoy
Owner

Thank you @kavyamanohar for your PR. I will merge it soon.

kurianbenoy added a commit that referenced this pull request Feb 10, 2024
Expanding Malayalam normalization rules. Add Unit test #10
@kurianbenoy kurianbenoy merged commit ebdfedb into kurianbenoy:main Feb 10, 2024
kurianbenoy added a commit that referenced this pull request Feb 10, 2024