Add Thai W2P #511

Merged
merged 22 commits into from
Jan 7, 2021

Conversation

wannaphong
Member

@wannaphong wannaphong commented Dec 29, 2020

What does this change

I added a Thai word-to-phoneme (W2P) converter to pythainlp.transliterate.pronunciate. It converts a Thai word to Thai phonemes.

GitHub: https://github.com/wannaphong/Thai_W2P

Your checklist for this pull request

🚨Please review the guidelines for contributing to this repository.

  • Passed code styles and structures
  • Passed code linting checks and unit test

@pep8speaks

pep8speaks commented Dec 29, 2020

Hello @wannaphong! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-01-07 13:59:28 UTC

@coveralls

coveralls commented Dec 29, 2020

Coverage Status

Coverage increased (+0.03%) to 95.855% when pulling d25fe08 on Add-thai-word2phoneme into 907f2d6 on dev.

@wannaphong wannaphong requested a review from bact December 30, 2020 07:50
@wannaphong
Member Author

wannaphong commented Dec 30, 2020

Model Card

Model Details

Intended Use

  • Converts a Thai word to Thai phonemes.
  • Not suitable for other languages.

Factors

  • Based on the Thai word-to-phoneme conversion problem.

Metrics

  • Evaluation metrics include phoneme error rate (PER): number of phoneme errors / number of phonemes.
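As an illustration of the metric above, here is a minimal sketch of a phoneme error rate computation. It assumes that "errors" are counted as the Levenshtein (edit) distance between the reference and predicted phoneme sequences, which is a common convention but is not confirmed by this pull request. The function names and the sample phoneme symbols are hypothetical.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: List[Sequence[str]],
                       hyps: List[Sequence[str]]) -> float:
    """Total edit errors divided by total reference phonemes.

    Assumption: errors are edit-distance errors; the PR only states
    "number error / number phonemes".
    """
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    total_phonemes = sum(len(r) for r in refs)
    if total_phonemes == 0:
        raise ValueError("reference set contains no phonemes")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_phonemes


# Made-up phoneme symbols, for illustration only.
refs = [["k", "a", "n"], ["m", "aa"]]
hyps = [["k", "a", "m"], ["m", "aa"]]
per = phoneme_error_rate(refs, hyps)  # 1 error over 5 reference phonemes
```

Under this definition, one substituted phoneme out of five reference phonemes gives a PER of 0.2.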

Training Data

Thai W2P

Evaluation Data

Thai W2P

Quantitative Analyses

epoch: 100
step: 100, loss: 0.03179970383644104
step: 200, loss: 0.04126007482409477
step: 300, loss: 0.01877519115805626
step: 400, loss: 0.03311225399374962
per: 0.0432
per: 0.0419

Ethical Considerations

The Thai phonemes are based on web sources (Wiktionary, the Royal Institute, et cetera). They may not reflect the dialect that you use in everyday life.

Caveats and Recommendations

  • Accepts one Thai word at a time only.

@bact bact added the enhancement enhance functionalities label Dec 31, 2020
Member

@bact bact left a comment

Is it possible now for the transliterate() function to convert multiple Thai words into their sound representation?

If not, I suggest that it should, because that's how the other engines work. We should keep the types of input and output consistent.

This can be done by tokenizing the input first, then running W2P on each token.
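A minimal sketch of the tokenize-then-convert idea follows. The tokenizer and the single-word converter are passed in as callables so the sketch stays self-contained; in PyThaiNLP they might be `pythainlp.tokenize.word_tokenize` and the W2P converter from this pull request, but that wiring is an assumption and not what the merged code does. The function name `pronunciate_text` and the toy stand-ins are hypothetical.

```python
from typing import Callable, List


def pronunciate_text(text: str,
                     tokenize: Callable[[str], List[str]],
                     word_to_phoneme: Callable[[str], str],
                     sep: str = " ") -> str:
    """Convert multi-word text by running a single-word converter per token."""
    # Drop whitespace-only tokens so the converter only sees real words.
    tokens = [t for t in tokenize(text) if t.strip()]
    return sep.join(word_to_phoneme(t) for t in tokens)


# Toy stand-ins for demonstration: whitespace tokenizer and an
# upper-casing "converter". A real setup would pass a Thai word
# tokenizer and the W2P model's single-word conversion function.
demo = pronunciate_text("ab cd", str.split, str.upper)
```

Passing the two steps as parameters keeps the input and output types the same as for a single word (string in, string out), which is the consistency concern raised in the review.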

pythainlp/transliterate/core.py (outdated, resolved)
@wannaphong
Member Author

I moved w2p from pythainlp.transliterate.transliterate to pythainlp.transliterate.pronunciate.

@bact bact added this to the 2.3 milestone Jan 7, 2021
@bact
Member

bact commented Jan 7, 2021

I think we're good to go. Thanks @wannaphong

@bact
Member

bact commented Jan 7, 2021

All Thai G2P, Thai W2P, etc. can be converted to the singleton pattern but we can do that later.
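For reference, one way the singleton idea could look is a lazily created, module-level model instance, so the weights are loaded once and shared by every call. This is only a sketch under assumptions: `W2PModel`, `get_model`, and the `load_count` counter are hypothetical names for illustration and do not describe the existing PyThaiNLP code.

```python
import threading
from typing import Optional


class W2PModel:
    """Hypothetical stand-in for a model that is expensive to load."""

    load_count = 0  # counts constructions, for demonstration only

    def __init__(self) -> None:
        # A real model would load weights from disk here.
        W2PModel.load_count += 1

    def convert(self, word: str) -> str:
        # Placeholder: a real model would return the phoneme string.
        return word


_instance: Optional[W2PModel] = None
_lock = threading.Lock()


def get_model() -> W2PModel:
    """Return the shared model, creating it on first use."""
    global _instance
    if _instance is None:
        # Double-checked locking so concurrent first calls load only once.
        with _lock:
            if _instance is None:
                _instance = W2PModel()
    return _instance


first = get_model()
second = get_model()  # same object as `first`; no second load
```

Callers would then go through `get_model()` instead of constructing the model directly, which avoids reloading it on every conversion.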

@wannaphong
Member Author

@bact Thank you.

@wannaphong wannaphong merged commit 17891a3 into dev Jan 7, 2021
@bact bact deleted the Add-thai-word2phoneme branch January 8, 2021 12:40
@wannaphong wannaphong mentioned this pull request Apr 4, 2021
Labels
enhancement enhance functionalities
4 participants