chore(nlp): Czech tokenizer, stemmer and stopwords added #1113

Merged 3 commits on Nov 19, 2020

elozano98
Contributor

Depends on #1110. Please review it first. ⚠️

Description

Czech tokenizer, stemmer, and stopwords have been added to contentful nlp.

Context

Adding them will make it possible to process Czech text.

Approach taken / Explain the design

The tokenizer and the stemmer come from the nlpjs library; the stopwords were collected from a GitHub repository.
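
For reviewers unfamiliar with how the three pieces fit together, here is a rough sketch of a tokenize, filter stopwords, stem pipeline. It is illustrative only: `tokenize`, `stem`, `preprocess`, `STOPWORDS_CS` and `SUFFIXES` are hypothetical names, the stopword list is a tiny hand-picked subset, and the suffix-stripping stemmer is a toy stand-in, not the nlpjs stemmer or the code in this PR.

```typescript
// Hypothetical sketch: none of these names come from this PR or from nlpjs.

// A tiny illustrative subset of Czech stopwords (a real list is much longer).
const STOPWORDS_CS: ReadonlySet<string> = new Set([
  "a", "v", "na", "je", "se", "že", "to",
]);

// Lowercase the text and split on whitespace and basic punctuation.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[\s.,;:!?()"']+/)
    .filter((token) => token.length > 0);
}

// Toy stemmer: strip the first matching ending, provided that at least
// three characters remain. A real Czech stemmer applies far more rules.
const SUFFIXES: readonly string[] = ["ami", "ech", "ům", "em", "y", "u", "a", "e", "i"];

function stem(token: string): string {
  for (const suffix of SUFFIXES) {
    if (token.endsWith(suffix) && token.length - suffix.length >= 3) {
      return token.slice(0, token.length - suffix.length);
    }
  }
  return token;
}

// Full pipeline: tokenize, drop stopwords, then stem what is left.
function preprocess(text: string): string[] {
  return tokenize(text)
    .filter((token) => !STOPWORDS_CS.has(token))
    .map(stem);
}

console.log(preprocess("Kočka je na stole."));
```

Stopwords are dropped before stemming so that the lookup runs against the surface forms in the list; whether the merged code uses the same order is not stated in this PR.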

Testing

The pull request...

  • ✔️ has unit tests

Base automatically changed from contentful/el to master November 19, 2020 14:35
@elozano98 elozano98 merged commit e057e23 into master Nov 19, 2020
@elozano98 elozano98 deleted the contentful/cs branch November 19, 2020 14:39
3 participants