Feature Request - allow non-ASCII characters in geography #5417
Comments
Some of the specimens I handle in Arctos have special characters in the locality (I often use Cyrillic and Norse characters, for instance). Allowing non-ASCII characters would be very helpful, especially where the English transliteration of a location isn't as precise or as clearly understood as the name in the local alphabet would be. |
@szaborac is that geography or locality? Locality has accepted Unicode characters for some time, but also see ArctosDB/documentation-wiki#291 - one (primary) purpose of spec locality is to provide data that machines can understand, and machines mostly have English biases. (But I just fed GeoLocate - the machine we use most for this - 'Магадан' and it did what it should have, so maybe this isn't a concern? Still needs understood and documented.)
This is why I suspect you're not talking about geography: All Arctos geography is spatial, there can be no ambiguity. (There are also strings and those are great at confusing people, but the shape is perfectly precise and what really defines geography.) |
Ah - I see what you mean. You are correct. I mean locality. My apologies!
(Sent by email in reply to dustymc's comment of Thu, Jan 5, 2023, 8:06 AM.)
|
From #5486
Not that I can see in my data. I do try to do something with eg and I can maybe-probably dig that out of remarks and match it up to gadm's data, but I don't expect anyone else to. Those aren't even always the same data that I can access, but that's a really nice identifier and I'd be happy to use it if you can convince them to! |
I asked this in the contact form - let's see what we get back. |
AWG issues discussed: stick with ASCII. TODO: rebuild deasciiifier to use this; maybe do something special with the unaltered search term? |
Is your feature request related to a problem? Please describe.
Geography has traditionally disallowed non-ASCII characters for various reasons, but now
Describe what you're trying to accomplish
More closely "do what GADM does," which is increasingly important for creating new geography.
Describe the solution you'd like
Describe alternatives you've considered
Additional context
I think unaccent would allow eg record bulkloading to continue to work as it does now, but there's some (small, I think) chance that this would somehow complicate SOMETHING for a few areas of the world.
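As a rough illustration of the idea (not Arctos code, and not PostgreSQL's unaccent itself), the sketch below approximates accent folding in Python by decomposing characters and dropping combining marks, so that an ASCII term from a bulkload or search could still match a stored name with diacritics. The function names `fold_accents` and `names_match` are hypothetical. The real unaccent works from a rules file and also maps letters that do not decompose (for example some Norse characters), which this sketch leaves unchanged.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Approximate accent folding: decompose, then drop combining marks.

    Assumption: this only imitates what an unaccent-style function does.
    Characters with no Unicode decomposition (e.g. 'ø', 'ð') pass through.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def names_match(search_term: str, stored_name: str) -> bool:
    """Compare two place names after folding accents and case."""
    return fold_accents(search_term).casefold() == fold_accents(stored_name).casefold()


if __name__ == "__main__":
    print(fold_accents("Ash-Sharqīyah"))  # -> Ash-Sharqiyah
    print(names_match("Ash-Sharqiyah", "Ash-Sharqīyah"))  # -> True
```

Folding both sides at comparison time leaves the stored value untouched, which is one way an unaltered search term could still be handled separately if needed.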
Example
https://arctos.database.museum/place.cfm?action=detail&geog_auth_rec_id=10007445
I'd like to change "Eastern Province" to "Ash-Sharqīyah", which contains 'LATIN SMALL LETTER I WITH MACRON' (U+012B).
Priority
Relatively low, but some geography creation requests could drastically change my outlook on this.
Does anyone have any reason not to do this?