This repository has been archived by the owner on Mar 9, 2023. It is now read-only.

Fix a bug causing "…" to be converted to "", "", "…" #121

Merged (4 commits) on Jun 2, 2020

sorami
Collaborator

@sorami sorami commented Jun 2, 2020

Apply the same fix as WorksApplications/Sudachi#118, the corresponding PR for the Java implementation.

Related: #120

When normalization produces more tokens than the original text has, assign the original surface to the first output token, not the last.

For example, the current output is:

$ echo … | sudachipy
	補助記号,句点,*,*,*,*	.
	補助記号,句点,*,*,*,*	.
…	補助記号,句点,*,*,*,*	.
EOS

With this fix, the output becomes:

$ echo … | sudachipy
…	補助記号,句点,*,*,*,*	.
	補助記号,句点,*,*,*,*	.
	補助記号,句点,*,*,*,*	.
EOS
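
The behavior change can be sketched roughly as follows. This is a minimal illustration, not the actual SudachiPy code: the function names `boundary_offsets` and `token_surfaces` and the `attach_to_first` flag are hypothetical, and the sketch assumes one output token per normalized character (as in the "…" to "..." case above).

```python
def boundary_offsets(orig_len, norm_len, attach_to_first):
    """Map each of the norm_len + 1 boundaries between normalized
    characters to an offset in the original text (hypothetical helper).

    The first boundary always maps to 0 and the last to orig_len.
    The inner boundaries decide which token receives the original text.
    """
    if norm_len < 1:
        raise ValueError("norm_len must be at least 1")
    # attach_to_first=True: inner boundaries point at the END of the original,
    # so the first token spans it and the remaining tokens are empty.
    # attach_to_first=False: inner boundaries point at the START,
    # so only the last token spans the original (the old behavior).
    inner = orig_len if attach_to_first else 0
    return [0] + [inner] * (norm_len - 1) + [orig_len]


def token_surfaces(original, norm_len, attach_to_first=True):
    """Return the original-text surface for each normalized character."""
    offsets = boundary_offsets(len(original), norm_len, attach_to_first)
    return [original[offsets[i]:offsets[i + 1]] for i in range(norm_len)]


if __name__ == "__main__":
    print(token_surfaces("…", 3, attach_to_first=False))  # old behavior
    print(token_surfaces("…", 3, attach_to_first=True))   # fixed behavior
```

With `attach_to_first=False` the original "…" ends up on the last of the three tokens; with `attach_to_first=True` it ends up on the first, which matches the two outputs shown above.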

@sorami sorami self-assigned this Jun 2, 2020
@sorami sorami requested a review from kazuma-t June 2, 2020 01:30
@sorami sorami marked this pull request as ready for review June 2, 2020 01:30
Member

@kazuma-t kazuma-t left a comment

No problem!

@sorami sorami merged commit 1a6649b into develop Jun 2, 2020
@sorami sorami deleted the fix-cdots branch June 2, 2020 03:24