Update stopwords_ukr.js #329

Merged
merged 1 commit into from
Dec 3, 2024

Conversation

imposeren
Contributor

Manually selected the highest-frequency pronouns, conjunctions, adpositions, grammatical particles and adverbs from a Ukrainian corpus and added them to stopwords_ukr.js. Also moved some stopwords from the end of the list to their alphabetical position: the whole list is now more or less in alphabetical order (by first letter for sure; the second letter is, I think, also fine, but I did not check all existing words and did not use automatic sorting, to minimize the changes).

@imposeren force-pushed the patch-1 branch 2 times, most recently from 79736a3 to 2862555 (December 3, 2024, 07:11)
@eklem
Collaborator

eklem commented Dec 3, 2024

Nice, thanks @imposeren!
I need the tests to run again, and then I'll publish to NPM.

@imposeren
Contributor Author

Sorry, pushed some more changes. I think that's all for now. But yeah, I saw that the tests had failed before I pushed again... Will look at what fails and try to fix it (maybe I just missed a comma somewhere).

@imposeren
Contributor Author

Sorry. Used the wrong email in my commit (I was using my work one). Amended so that my current company is not associated with the commit. Sorry for the sloppy commits...

@eklem
Collaborator

eklem commented Dec 3, 2024

No problem 😊

@imposeren
Contributor Author

Any suggestions on how to quickly run the tests locally? I'm not a JS developer, but if there is an easy way to run them under Linux or in Docker, I can manage to check the tests locally. I have already pushed a potential fix (the test will now expect the "на" adposition to be removed), but it would be great if I could actually run the test locally.
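
To illustrate the kind of expectation change described above, here is a minimal sketch. It assumes a hypothetical `removeStopwords` helper and a tiny sample list named `ukrSample`; neither is copied from the repository or its test suite, and the real contents of stopwords_ukr.js are not reproduced here.

```javascript
// Hypothetical excerpt of a Ukrainian stopword list
// (an assumption for illustration, not the actual stopwords_ukr.js).
const ukrSample = ['і', 'в', 'на', 'що', 'не'];

// Hypothetical helper: drop every token found in the stopword list.
// Tokens are lowercased before the lookup so that "На" is treated like "на".
function removeStopwords(tokens, stopwords) {
  const lookup = new Set(stopwords);
  return tokens.filter((token) => !lookup.has(token.toLowerCase()));
}

// "Книга лежить на столі" ("The book lies on the table").
const input = 'Книга лежить на столі'.split(' ');
const result = removeStopwords(input, ukrSample);

// With "на" in the list, the adposition is filtered out.
console.log(result); // [ 'Книга', 'лежить', 'столі' ]
```

Once "на" is in the list, a test that previously expected it to survive filtering would need its expected output updated in the same way.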

@imposeren
Contributor Author

All tests are ok after the small fix I introduced earlier

@eklem
Collaborator

eklem commented Dec 3, 2024

Yes, I saw that. Just one test with a little hiccup on cleaning up. So I'll merge and publish!

@eklem eklem merged commit a6d8fac into fergiemcdowall:main Dec 3, 2024
3 checks passed
