Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Contentful/lazy language engines #1160

Merged
merged 6 commits into from
Dec 13, 2020
Merged

Conversation

dpinol
Copy link
Contributor

@dpinol dpinol commented Dec 3, 2020

Depends on #1118

Description

Only import nlp-js stemmers and tokenizers when using them.
removed rootLocale, as it did the same as languageFromLocale

Context

  • For performance reasons, we only want to load the required stemmers and tokenizers
  • Also, it's also possible that a stemmers and tokenizers cannot be loaded due to a bug in its code or in the browser. We shouldn't crash if we're not even using it.

Approach taken / Explain the design

  • Now we use a map of singletons so that we only load the tokenizer/stemmer right before the 1st use

Testing

The pull request has unit tests

@dpinol dpinol force-pushed the contentful/lazy-language-engines branch from 8a98d6c to 46fa59b Compare December 3, 2020 18:07
@dpinol dpinol marked this pull request as ready for review December 3, 2020 18:12
Copy link
Contributor

@asastre asastre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing refactor @dpinol 💯 💯 👏!!

Comment on lines 151 to 152
[locales.CROATIAN]: () => new TokenizerHr(),
[locales.SLOVAK]: () => new TokenizerSk(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why TokenizerHr and TokenizerSk are always imported? Shouldn't they be lazy loaded too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't do it because require() failed initially, but I found out that it works it you specify the extension

Comment on lines 84 to 85
hr: () => new StemmerHr(),
sk: () => new StemmerSk(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why StemmerHr and StemmerSk are always imported? Shouldn't they be lazy loaded too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same, done

sk: new StemmerSk(),
}
const stemmers = new SingletonMap<Stemmer>({
ca: () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be better to use the constants for the keys (locales.CATALAN instead of ca)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

@dpinol dpinol force-pushed the contentful/lazy-language-engines branch from 46fa59b to 977f052 Compare December 12, 2020 16:38
for performance reasons, and also to avoid failures in case an unused language tokenizer/stemmer cannot be loaded (eg. due to bug in browser)
languageFromLocale does the same
for performance reasons, and also to avoid failures in case an unused language tokenizer/stemmer cannot be loaded (eg. due to bug in browser)
@dpinol dpinol force-pushed the contentful/lazy-language-engines branch from 977f052 to 71951aa Compare December 13, 2020 14:07
@dpinol dpinol merged commit f8ae344 into master Dec 13, 2020
@dpinol dpinol deleted the contentful/lazy-language-engines branch December 13, 2020 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants