Chore: update simple_spell_checker package #2139

Merged

CatHood0
Collaborator

@CatHood0 CatHood0 commented Aug 23, 2024

Description

The package was updated because its latest version adds the following:

New separator regexp

Version 1.1.7 fixed some character-handling issues caused by the separator not working as expected. This is the old regexp, which only accepted Spanish and English characters, plus some German and Italian ones.

'''(\s+|[,;:'`´^¨`\'\.°\|\*\•µ\[\]\(\)\!\¡\¿\?\¶\$\%\&\/\\=\}
\{\+\-©℗ⓒ·~½¬ſþˀ\_\«\»\<\>\¢\@\€\←\↓\→\ð\ø\¢\”\“\„\"]|
[\wẃĺĸẗŕýảẻủỷỉỏẢẺỶỦỈỎƙïßśŔŸËẄÄŸÏÖÜüÍÁẂÉÚÝÓÁäëÿïößðẅẍæëïüãñõáéíóúýâêîôûöáéíñǵñÑü
ÜçÇàèìòùÀÈÌÒÙâêîôûÂÊÎÔÛãõÃÕćóúüñÁÉÍÓÚÜÑ]+)'''

As you can see, it won't work with Chinese, Arabic, or Hebrew. To fix this, we now use a regexp built on Unicode property escapes, which makes the regexp simpler and lets all users use SimpleSpellChecker without losing any characters.

// Matches any Unicode separator (visible or invisible whitespace)
const String _whitespaces = r'''\p{Z}''';
// Accepts letters and combining marks from any language
const String _allWords =
    r'''\p{L}\p{M}\p{Lm}\p{Lo}\p{Script=Arabic}\p{Script=Armenian}\p{Script=Bengali}\p{Script=Bopomofo}\p{Script=Braille}\p{Script=Buhid}\p{Script=Canadian_Aboriginal}\p{Script=Cherokee}\p{Script=Cyrillic}\p{Script=Devanagari}\p{Script=Ethiopic}\p{Script=Georgian}\p{Script=Greek}\p{Script=Gujarati}\p{Script=Gurmukhi}\p{Script=Han}\p{Script=Hangul}\p{Script=Hanunoo}\p{Script=Hebrew}\p{Script=Hiragana}\p{Script=Inherited}\p{Script=Kannada}\p{Script=Katakana}\p{Script=Khmer}\p{Script=Lao}\p{Script=Latin}\p{Script=Limbu}\p{Script=Malayalam}\p{Script=Mongolian}\p{Script=Myanmar}\p{Script=Ogham}\p{Script=Oriya}\p{Script=Runic}\p{Script=Sinhala}\p{Script=Syriac}\p{Script=Tagalog}\p{Script=Tagbanwa}\p{Script=Tamil}\p{Script=Telugu}\p{Script=Thaana}\p{Script=Thai}\p{Script=Tibetan}\p{Script=Yi}''';
// Accepts non-letter characters such as punctuation, digits, or emojis
const String _nonWordsCharacters =
    r'''\p{P}\p{N}\p{Pd}\p{Nd}\p{Nl}\p{Pi}\p{No}\p{Pf}\p{Pc}\p{Ps}\p{Cf}\p{Co}\p{Cn}\p{Cs}\p{Pe}\p{S}\p{Sm}\p{Sc}\p{Sk}\p{So}\p{Cc}\p{Po}\p{Mc}''';

final RegExp separatorRegExp = RegExp('''([$_whitespaces]+|[$_allWords]+|[$_nonWordsCharacters])''', unicode: true);
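
To illustrate how a separator built on Unicode property escapes splits mixed-script text, here is a minimal sketch. It uses a shortened pattern based only on general categories (`\p{Z}`, `\p{L}`, `\p{M}`, `\p{P}`, `\p{N}`, `\p{S}`, `\p{C}`), which is an assumption made for readability and is not the exact regexp from this PR; `tokenize` is a hypothetical helper, not part of simple_spell_checker.

```dart
// Shortened stand-in for the separator (assumption): runs of separators,
// runs of letters/marks, or a single non-letter character.
final RegExp separator = RegExp(
  r'([\p{Z}]+|[\p{L}\p{M}]+|[\p{P}\p{N}\p{S}\p{C}])',
  unicode: true, // required for \p{...} property escapes
);

// Hypothetical helper: returns every token matched by the separator, in order.
List<String> tokenize(String text) =>
    separator.allMatches(text).map((m) => m.group(0)!).toList();

void main() {
  // Latin, Cyrillic, and Han words are each kept as a single token.
  final tokens = tokenize('Hola, мир! 你好');
  print(tokens.map((t) => '"$t"').join(' '));
}
```

With this pattern, words in any script come out as whole tokens, while whitespace runs and individual punctuation marks are emitted separately so the original text can be reassembled by joining the tokens.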

New translations:

  • German (Switzerland) - de-ch
  • English (United Kingdom) - en-gb
  • Catalan (Standard) - ca
  • Arabic (Standard) - ar
  • Danish - da
  • Bulgarian - bg
  • Dutch - nl
  • Korean (Standard) - ko
  • Estonian (Standard) - et
  • Hebrew (Standard) - he
  • Slovak - sk

Checking even when the dictionary is empty

This feature lets us check text even when the language is not found and the dictionary is practically empty, using the worksWithoutDictionary param.

Note: This feature needs safeDictionaryLoad to be enabled to avoid throwing an exception.
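
The interaction between the two flags can be sketched with a toy class. Everything here is hypothetical: `ToyChecker`, its constructor, `unknownWords`, and the bundled word list are illustrations only and do not reflect the real simple_spell_checker API. The PR also does not say what the package reports per word when the dictionary is empty, so this sketch simply flags every word as unknown.

```dart
/// Hypothetical stand-in for a spell checker; NOT the simple_spell_checker API.
class ToyChecker {
  ToyChecker({
    required this.language,
    this.safeDictionaryLoad = false,
    this.worksWithoutDictionary = false,
  });

  final String language;
  final bool safeDictionaryLoad;
  final bool worksWithoutDictionary;

  // Toy registry of available dictionaries (assumption for this sketch).
  static const Map<String, Set<String>> _bundled = {
    'en': {'hello', 'world'},
  };

  /// Loads the dictionary for [language]. When it is missing, throws unless
  /// [safeDictionaryLoad] is set, in which case an empty set is returned.
  Set<String> _loadDictionary() {
    final words = _bundled[language];
    if (words != null) return words;
    if (!safeDictionaryLoad) {
      throw StateError('No dictionary registered for "$language"');
    }
    return const <String>{};
  }

  /// Returns the words missing from the dictionary, or null when the
  /// dictionary is empty and [worksWithoutDictionary] is off (check skipped).
  List<String>? unknownWords(String text) {
    final dictionary = _loadDictionary();
    if (dictionary.isEmpty && !worksWithoutDictionary) return null;
    return text
        .toLowerCase()
        .split(RegExp(r'\s+'))
        .where((w) => w.isNotEmpty && !dictionary.contains(w))
        .toList();
  }
}

void main() {
  // Unknown language: the safe load prevents the exception, and
  // worksWithoutDictionary lets the check run on an empty dictionary.
  final checker = ToyChecker(
    language: 'xx',
    safeDictionaryLoad: true,
    worksWithoutDictionary: true,
  );
  print(checker.unknownWords('hello wrld'));
}
```

In this sketch, turning on worksWithoutDictionary without safeDictionaryLoad still throws during loading, which mirrors the note above.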

MultiSpellChecker

This feature lets us create an instance like SimpleSpellChecker, but one that accepts more than one language. More info in the PR.

  • New feature: Adds new functionality without breaking existing features.
  • 🛠️ Bug fix: Resolves an issue without altering current behavior.
  • 🧹 Code refactor: Code restructuring that does not affect behavior.
  • Breaking change: Alters existing functionality and requires updates.
  • 🧪 Tests: Adds new tests or modifies existing tests.
  • 📝 Documentation: Updates or additions to documentation.
  • 🗑️ Chore: Routine tasks, or maintenance.
  • Build configuration change: Changes to build or deploy processes.

@CatHood0 CatHood0 changed the title XZChore: update simple_spell_checker package to support most of the new… Chore: update simple_spell_checker package to support most of the new… Aug 23, 2024
@CatHood0 CatHood0 marked this pull request as draft August 23, 2024 06:33
@CatHood0 CatHood0 marked this pull request as ready for review August 23, 2024 06:40
@CatHood0 CatHood0 changed the title Chore: update simple_spell_checker package to support most of the new… Chore: update simple_spell_checker package Aug 23, 2024
@singerdmx singerdmx merged commit 1828eca into singerdmx:master Aug 23, 2024
2 checks passed
@CatHood0 CatHood0 deleted the update_simple_spell_checker_dependency branch August 23, 2024 16:30