there is 1 redundant space before and after the punctuation #2

Open
bk111 opened this issue Jun 2, 2023 · 5 comments

Comments

@bk111

bk111 commented Jun 2, 2023

  1. After conversion, there is one redundant space before and after each punctuation mark.
  2. pinyin-jyutping takes twice as long as pinyin-jyutping-sentence.
  3. pinyin-jyutping-sentence drops Chinese punctuation.
  4. Please try this input:
    items1 = """25、这件事急不得,表面要装镇定,以免打草惊蛇。
    21、这次行动千万要保密,不能打草惊蛇。
    22、消息指她们都比平日"格外小心",以免打草惊蛇,故媒体也未能得知她们的身份。
    """
@bk111
Author

bk111 commented Jun 4, 2023

Please check the picture here:
https://forum.chinese-learning.me/viewtopic.php?f=5&t=417

@luc-vocab
Contributor

luc-vocab commented Jun 4, 2023

I understand what you mean now: each punctuation character is preceded by a space that wasn't present in the Chinese text. Let me think of a way to fix that. In your opinion, is it important to preserve the whitespace of the original text?

FYI, python-pinyin-jyutping won't be developed anymore, but I can try to improve the speed of pinyin-jyutping. Can you tell me what your expectation is in terms of speed?

@bk111
Author

bk111 commented Jun 4, 2023

> I understand what you mean now: each punctuation character is preceded by a space that wasn't present in the Chinese text. Let me think of a way to fix that. In your opinion, is it important to preserve the whitespace of the original text?
>
> FYI, python-pinyin-jyutping won't be developed anymore, but I can try to improve the speed of pinyin-jyutping. Can you tell me what your expectation is in terms of speed?

If possible, please keep the whitespace the same as in the original text. The speed is OK, but why is the new version slower than the previous one?
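
To make the speed comparison concrete, a timing harness along these lines could be used. This is only a sketch: `time_conversion` and `placeholder_convert` are hypothetical names, and the placeholder stands in for whichever converter is being measured. It does not call either library.

```python
import timeit


def time_conversion(convert, text, repeat=5, number=10):
    """Return the best observed time in seconds for one convert(text) call."""
    # timeit.repeat returns one total per round; each total covers `number` calls.
    totals = timeit.repeat(lambda: convert(text), repeat=repeat, number=number)
    return min(totals) / number


def placeholder_convert(text):
    # Hypothetical stand-in: replace with the real conversion call under test.
    return text.upper()


if __name__ == "__main__":
    sample = "这次行动千万要保密,不能打草惊蛇。"
    seconds = time_conversion(placeholder_convert, sample)
    print(f"best time per call: {seconds:.6f} s")
```

Running the same harness against both versions with the same input would show how large the slowdown actually is.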

@luc-vocab
Contributor

What output do you expect for this input text?
25、这件
I can easily produce the following output (a space after the punctuation but not before it), which matches the convention for Latin-script languages:
25、 zhèjiàn

Another example:
input: 請問,你叫什麼名字?
output: qǐngwèn, nǐ jiào shénme míngzi

Let me know whether this would work for you.
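
The proposed rule (a space after punctuation, none before) can be sketched as below. This is an illustration only, not the library's implementation: it assumes the romanized output is already available as a list of tokens, and `is_punctuation` and `join_latin_style` are hypothetical helper names.

```python
import unicodedata


def is_punctuation(token):
    """True if every character of the token is in a Unicode punctuation category (P*)."""
    return bool(token) and all(
        unicodedata.category(ch).startswith("P") for ch in token
    )


def join_latin_style(tokens):
    """Join tokens with single spaces, but never put a space before punctuation."""
    parts = []
    for token in tokens:
        # Separate from the previous token unless this token is punctuation.
        if parts and not is_punctuation(token):
            parts.append(" ")
        parts.append(token)
    return "".join(parts)


print(join_latin_style(["qǐngwèn", ",", "nǐ", "jiào", "shénme", "míngzi"]))
print(join_latin_style(["25", "、", "zhèjiàn"]))
```

Opening brackets and opening quotes would need separate handling, since this rule also puts a space after them.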

@bk111
Author

bk111 commented Jul 2, 2023

25、这件
25、 zhèjiàn # wrong: there is a redundant space before 'zhè'
25、zhèjiàn # right

input: 請問,你叫什麼名字?
output: qǐngwèn, nǐ jiào shénme míngzi # wrong: there is a redundant space before 'nǐ'
output: qǐngwèn,nǐ jiào shénme míngzi # right
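
The behaviour requested here (spaces only between adjacent romanized words, nothing added around punctuation) could be sketched as follows. It again assumes a hypothetical token list; `is_word` and `join_without_punctuation_spaces` are illustrative names and not part of pinyin-jyutping.

```python
def is_word(token):
    """Treat a token as a word if it contains at least one letter or digit."""
    return any(ch.isalnum() for ch in token)


def join_without_punctuation_spaces(tokens):
    """Insert a space only between two adjacent word tokens."""
    parts = []
    previous_was_word = False
    for token in tokens:
        current_is_word = is_word(token)
        # Punctuation and whitespace tokens are copied through unchanged,
        # so no space is added before or after them.
        if previous_was_word and current_is_word:
            parts.append(" ")
        parts.append(token)
        previous_was_word = current_is_word
    return "".join(parts)


print(join_without_punctuation_spaces(["25", "、", "zhèjiàn"]))
print(join_without_punctuation_spaces(["qǐngwèn", ",", "nǐ", "jiào", "shénme", "míngzi"]))
```

With this rule, any whitespace that appears in the input as its own token is carried into the output unchanged.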
