Detokenizer throws exception with certain inputs #90

Closed
laeubli opened this issue Dec 18, 2015 · 1 comment
laeubli commented Dec 18, 2015

I've come across a minor bug in the newly added detokenization routine, where some inputs result in a java.lang.UnsupportedOperationException. Example:

com.twitter.penguin.korean.TwitterKoreanProcessor.detokenize(List("", "제품을", "사용하겠습니다"))
// throws java.lang.UnsupportedOperationException: empty.init

It seems like this could be easily fixed by initialising the output list differently. For now, I'm working around the problem by always prepending an empty string to the input:

com.twitter.penguin.korean.TwitterKoreanProcessor.detokenize(List("", "", "제품을", "사용하겠습니다"))
// works

This is neither critical nor urgent, but it would be nice if this could be fixed in a future release of this great library.
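
For readers unfamiliar with the failure mode, here is a minimal, library-independent sketch of why `init` can throw. It does not reproduce the actual detokenizer code, which is not shown in this issue; `EmptyInitSketch` and `safeInit` are hypothetical names used only for illustration. It relies only on standard Scala collection behaviour: `init` is undefined on an empty list, while `dropRight(1)` is not.

```scala
import scala.util.Try

object EmptyInitSketch {
  // Hypothetical helper, not part of twitter-korean-text:
  // drops the last element, returning an empty list instead of throwing
  // when the input is already empty.
  def safeInit[A](xs: List[A]): List[A] = xs.dropRight(1)

  def main(args: Array[String]): Unit = {
    val empty = List.empty[String]

    // `init` on an empty list throws java.lang.UnsupportedOperationException.
    println(Try(empty.init).isFailure) // true

    // `dropRight(1)` handles the empty case gracefully.
    println(safeInit(empty)) // List()
    println(safeInit(List("제품을", "사용하겠습니다"))) // List(제품을)
  }
}
```

If the detokenizer accumulates its output in a list and calls `init` on it before anything has been appended, a guard of this kind (or an `isEmpty` check) would presumably avoid the exception, but that is an assumption about the internals.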

hohyon-ryu mentioned this issue Dec 19, 2015

hohyon-ryu (Contributor) commented

Thanks for filing the issue. I've created a patch in #92.
