Undoing casefolding? #469
Comments
I'm not opposed. Would you send a PR? You'll just remove |
Just to confirm, you mean delete |
On Fri, Nov 4, 2022 at 6:11 PM Hossep Dolatian ***@***.***> wrote:
Just to confirm, you mean delete casefold: true and not simply change it
to casefold: false?
Yea, it appears to default to false if I’m reading correctly. If I’m wrong
we’ll know.
Sadly, I don't think I have a good enough computer/internet to rescrape
everything :(
It isn’t computationally intensive at all, it only takes a while because of
Wiktionary’s rate limiting. But if you create the rest of the PR, test it
out on a language or two, we could probably take it from there.
… |
Did a PR
We have a hint of this in our notion of "filtered" vs. "unfiltered", this could just be an additional layer.
I was working on this and trying to run step 1 of "the big scrape", but I ran into a weird error with some languages not being recognized, details here.
The command line lets the user choose whether to apply casefolding, so that entries like English can be changed to either English or english. But for the scraped data on the repo, it seems you apply casefolding by default. Would it be more useful if the online data didn't do casefolding? Right now, if the user wants to get the original cases, they have to run the terminal option (which takes a while).
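To illustrate why unfolded data would be the more flexible thing to publish, here is a minimal Python sketch. It is not the project's code: `casefold_entries` and the sample entries are hypothetical, and the transcriptions are illustrative rather than scraped. It shows that casefolding is a one-way operation: a user can always fold cased data locally, but cannot recover the original case from folded data.

```python
# Hypothetical sample of (word, pronunciation) pairs; illustrative, not scraped data.
original = [
    ("English", "ˈɪŋɡlɪʃ"),
    ("polish", "ˈpɒlɪʃ"),
    ("Polish", "ˈpoʊlɪʃ"),
]


def casefold_entries(entries):
    """Return the entries with each word casefolded (hypothetical helper)."""
    return [(word.casefold(), pron) for word, pron in entries]


folded = casefold_entries(original)

# Folding cased data is cheap and can be done by the user at any time.
distinct_before = {word for word, _ in original}
distinct_after = {word for word, _ in folded}

# Folding merges spellings that differed only by case, so the original
# distinction cannot be reconstructed from the folded data alone.
print(len(distinct_before), "distinct spellings before folding")
print(len(distinct_after), "distinct spellings after folding")
```

With cased data online, anyone who wants folded entries can apply a step like this in a second, whereas going the other way currently requires rescraping.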