Add good_words to project dict more efficiently #1261

Merged
merged 1 commit into from
Oct 14, 2023

windymilla
Collaborator

Previously it saved the dictionary after reading each word from the good_words.txt. Now it reads them all, then does a single save at the end.

Fixes #1255

@windymilla windymilla requested review from cpeel and srjfoo October 2, 2023 19:44
@windymilla
Collaborator Author

Testing notes: You can use almost any text file as a good_words.txt file (a 5000-line file is about a 10 second wait on my computer in master). Once you have created your good words file, open any other file (e.g. mytextfile.txt) in GG, run Tools->Spell Query, and in the dialog, use Add good/bad words. You can also create a bad_words.txt and load it via the same method. Each line in a good/bad words file is, and always has been, treated as a single good/bad "word". After running Add good/bad words, you can open mytextfile.dic in an editor and you should see all the good words listed (as Perl code), with all the bad words below them.
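For anyone reproducing the test without a suitable text file to hand, a throwaway script along these lines could generate one. This is only a sketch: `write_test_words` and the `testword` prefix are made up for illustration and are not part of Guiguts. The 5000-line default matches the file size mentioned above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of Guiguts: writes $count unique dummy
# "words", one per line, to $path and returns the number written.
sub write_test_words {
    my ( $path, $count ) = @_;
    open( my $fh, '>', $path ) or die "Cannot write $path: $!";
    for my $n ( 1 .. $count ) {
        print {$fh} "testword$n\n";
    }
    close($fh) or die "Cannot close $path: $!";
    return $count;
}

# Output path defaults to good_words.txt in the current directory;
# pass a different path as the first command-line argument if preferred.
my $path = @ARGV ? $ARGV[0] : 'good_words.txt';
write_test_words( $path, 5000 );
```

Run it in the same folder as mytextfile.txt, then follow the Spell Query steps above.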

@@ -226,18 +226,16 @@ sub spelladdword {
}

#
# Add a word to the project dictionary
# Optional second argument if it's a bad word
# Add a single word to the project dictionary and save it - slow if used for bulk additions
Member

This branch is undoubtedly faster than master, but this comment seems to imply that it's still saving after each addition? Or am I misinterpreting something?

Collaborator Author

@windymilla windymilla Oct 14, 2023

That function spelladdword is still used for adding a single word to the project dictionary - I added the word "single" to the comment, and that it saves the dictionary, to emphasise that. However, it used to be called multiple times when a good word list was added - I now add all the words to the internal structures, then save the dictionary once for that case, so don't use this function at all for that. The comment that it is "slow if used for bulk additions" is intended to alert a future developer not to use it for adding lots of words.
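The difference between the two paths can be sketched roughly as follows. This is an illustration only, not the code from this PR: `%good_words`, `save_dictionary`, `add_single_word` and `add_word_list` are hypothetical stand-ins, and the "save" just increments a counter instead of writing a file.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the real Guiguts structures and routines.
my %good_words;        # in-memory project dictionary
my $save_count = 0;    # how many times the dictionary "file" is rewritten

# Stand-in for writing the whole dictionary out to disk.
sub save_dictionary {
    $save_count++;
    return;
}

# Single-word path: add one word, then save immediately.
sub add_single_word {
    my ($word) = @_;
    $good_words{$word} = 1;
    save_dictionary();
    return;
}

# Bulk path: add every word to the in-memory structure, then save once.
sub add_word_list {
    my (@words) = @_;
    $good_words{$_} = 1 for @words;
    save_dictionary();
    return;
}

my @words = map { "word$_" } 1 .. 5000;

add_single_word($_) for @words;
print "per-word saves: $save_count\n";    # prints "per-word saves: 5000"

%good_words = ();
$save_count = 0;
add_word_list(@words);
print "bulk saves: $save_count\n";        # prints "bulk saves: 1"
```

If each save rewrites the whole dictionary, the per-word path does work that grows roughly with the square of the list length, which would be consistent with the slowdown reported for large good_words files.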

@windymilla windymilla merged commit d036742 into DistributedProofreaders:master Oct 14, 2023
1 check passed
@windymilla windymilla deleted the good-words branch October 14, 2023 14:43
Successfully merging this pull request may close these issues.

Large good_words files are very slow to load