Street names are automatically title-cased in locales where they shouldn't be #4784
Comments
So this first needs some research in which languages (or countries?) a title case is not used. Would you care to do this research? |
By the way, you can tap on the street it belongs to instead of writing it each time manually. |
Yep, I'm aware of that. However, this also happens for place names (which are quite common here, in the form of microdistricts), and I can't use that feature with place names. |
I think a good first approximation would be to look at endonyms of countries, to see which ones are and aren't using title-casing: https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_and_their_capitals_in_native_languages |
I found https://bugs.ruby-lang.org/issues/14839 which is quite related
hunting down that "Unicode data" part of Unicode may be helpful |
Also, this might be somewhat useful: https://en.wikipedia.org/wiki/Capitalization |
Given situation with Georgian alphabet this is a bug in the Curiously
on Kotlin playground is without triggering this bug. |
https://github.com/streetcomplete/StreetComplete/blob/master/app/src/main/java/de/westnordost/streetcomplete/screens/user/statistics/CircularFlagView.kt#L154 looks like internal use But there is also https://github.com/streetcomplete/StreetComplete/blob/master/app/src/main/java/de/westnordost/streetcomplete/data/meta/Abbreviations.kt#L77 used when abbreviations are expanded, I guess that it also should be fixed (maybe it is not triggered right now if no abbreviations are defined for Georgia) |
It's an issue for Bulgarian too (Стара планина becomes Стара Планина but it shouldn't). My wife, who's fluent in Russian, says it would be wrong in Russian too. So maybe for all languages using Cyrillic script? |
@rhhsm Can you open a new issue? Here it can get lost. |
Uh, we use title case function now. If that function returns false results, it is not in our hands to fix it. The fix for Georgian should have been the fix for any language. |
Um, it behaves very strange to me too. Is that titlecase() supposed to behave differently depending on the language used (because it works the same for me in English and in Croatian)? I mean, generally (outside of android), titlecase simply means that "all words are capitalized, except for minor words (typically articles, short prepositions, and some conjunctions) that are not the first or last word of the title". And what android There are several issues with such approach:
So I would suggest using sentence case in all countries except USA and maybe a few others (where title case might be more proper). small_SVID_20230419_223637_1.mp4 |
Do you know a source where one can find a list of languages where title case should be used as opposed to sentence case? For street names, place names? |
Uh, unfortunately not, and a quick search does not reveal that to me. Wikipedia seems to imply that title case is used mostly in English-speaking countries:
Quick checking of most common suspects would indicate those do use titlecase:
But of course there are likely more. |
Now, if one were feeling adventurous, chatGPT says those countries might be using title case for street or place names:
But I wouldn't trust (for anything important) that language model predictor farther than I could throw it 😄 |
What about street names named after people? In Warsaw we have for example street |
This source http://www.bibnet.be/files/download/525b23b4-d008-433b-bbc2-e24b32872d2b/Regelgeving/vlacc_bronnen/vlacc_bronnenTitelbeschrijven/Hoofdlettergebruik%20per%20taal.html states that compound geographic names (Samengestelde geografische namen) should be capitalised in Dutch, English, Spanish and German, while French is more complicated. |
I believe the "Sentence case" mentioned above covers that in its definition. Note that it does not say that only the first letter of the sentence should be capitalized, but also proper nouns (which include personal names as such, I believe). Of course, the actual implementation of sentence case would likely range from the most basic (only the first letter of the sentence is capitalized) to more advanced (databases of words / names to capitalize or keep lowercase, or local common usage, or common cloud usage etc).
Yes, in Croatia we also have names like "Trg bana Josipa Jelačića" (where "Josip Jelačić" is a name of the person, and "ban" was his title) but in majority of the cases only the first letter would be capitalized "Zagrebačka cesta", "Glogov put", "Taborska ulica", "Vukovarska avenija".
That sounds like a good idea! It can be argued that it should be the keyboard's job to do proper capitalization / word correction (if it is advanced enough to offer such functionality). SC might as well just make the first letter uppercase (as that sounds pretty common mostly everywhere?) and call it a day, and if the user is unhappy with how it works in their current keyboard, they can choose from a plethora of other keyboards. |
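To make the "just make the first letter uppercase" idea concrete, here is a minimal sketch. The function name `capitalizeFirstLetter` is made up for illustration and is not taken from the StreetComplete code base; it only relies on the standard library's `replaceFirstChar` and `Char.titlecase()`.

```kotlin
// Hypothetical helper for illustration only, not StreetComplete code.
// Title-cases the first character and leaves every other character alone.
fun capitalizeFirstLetter(name: String): String =
    // titlecase() returns the character unchanged if it has no title-case form
    name.replaceFirstChar { it.titlecase() }
```

With this sketch, `capitalizeFirstLetter("zagrebačka cesta")` gives `Zagrebačka cesta`, while `capitalizeFirstLetter("trg bana josipa jelačića")` gives `Trg bana josipa jelačića`, so personal names inside the street name would still have to be capitalized by the user or the keyboard.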
Well, this issue is specifically about the case where it's not true: in Georgian, upper-case letters are only used for ALL CAPS and not for title-case or sentence-case. |
Interesting. But I suspect that if titlecase() does not use upper-case letters in Georgian (as it seems not to, since the previous solution seems to have worked?), it probably would not do so in sentence case either (e.g. capitalize()? or whatever else is used these days), as it says it will follow locale settings. It would need testing to confirm that, of course. |
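For anyone who wants to do that testing, a small probe along these lines might help. It is only a sketch: `CaseForms`, `caseForms` and `titleCaseChangesWord` are hypothetical names, and what it reports for Georgian letters depends on the Unicode data shipped with the runtime, so no result is claimed for them here.

```kotlin
// Hypothetical probe for illustration only.
// Holds the upper-case and title-case forms of a single character.
data class CaseForms(val upper: String, val title: String)

// Returns both forms so that they can be compared side by side.
fun caseForms(c: Char): CaseForms = CaseForms(c.uppercase(), c.titlecase())

// True when title-casing the first letter would change the word at all.
fun titleCaseChangesWord(word: String): Boolean =
    word.isNotEmpty() && word.first().titlecase() != word.first().toString()
```

Calling `titleCaseChangesWord("ქუჩა")` on the target device would show whether the first letter gets altered there. That upper case and title case can be different things is easy to see with the Latin digraph `ǆ` (U+01C6): its upper-case form is `Ǆ` (U+01C4), but its title-case form is `ǅ` (U+01C5).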
What is the source of info that made you conclude not to include Bulgaria? |
I scrolled through the map. E.g. here, ~all road names are title cased: https://www.openstreetmap.org/#map=17/45.66213/25.60465 And here too: https://www.openstreetmap.org/#map=17/44.44219/26.10019 |
That's Romania. But still looks similar in actual Bulgaria, e.g. https://www.openstreetmap.org/#map=17/42.12727/24.74976 |
Whoops, sorry. But I definitely looked in real Bulgaria too yesterday |
Why look on the map when you could have asked me :)
All street names in this example are "title case" because they're all named after people (very, very common in Bulgaria). The only 2-word name that's an exception is Шар Планина, which is wrongly capitalised (it should be Шар планина). See here https://wiki.openstreetmap.org/wiki/Multilingual_names#Bulgaria for transliteration guidelines for Bulgarian: the examples are copy-pasted from the quoted law. Maybe the map of the Sofia Metro would be more convincing? The cases where second or later words are capitalised are because they themselves are names. |
Apart from Bulgaria, are there more countries, besides those I already found, where title case should not be used? Apparently I did not look closely enough in my swipe. |
In Spain, it depends on the language used. Note that the value of "name" can be in any of the following languages.
|
Note that words are currently only automatically titlecased if the word has more than 3 characters. That should take care of all the el, la, las, de, del, van, von, etc... (but misses out on "vía", but well 🤷 ) |
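The rule described above can be sketched roughly as follows. This is not the actual StreetComplete implementation; the function name `titleCaseLongWords` and the `minLength` parameter are assumptions made for the example.

```kotlin
// Hypothetical sketch of "title-case only words longer than 3 characters".
// Not the app's real code; name and parameter are illustrative.
fun titleCaseLongWords(text: String, minLength: Int = 4): String =
    text.split(" ").joinToString(" ") { word ->
        // short words such as "de", "las" or "von" are left untouched
        if (word.length >= minLength) word.replaceFirstChar { it.titlecase() } else word
    }
```

For example, `titleCaseLongWords("calle de las flores")` gives `Calle de las Flores`. The same rule turns `стара планина` into `Стара Планина`, which is exactly the result reported as wrong for Bulgarian earlier in this thread.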
I started implementing this with a huge list of languages in which no title case should be applied automatically but stopped this after several hours because I realized this automatism is just not worth the extra complexity. After all, it will probably be never 100% correct, there are many languages after all and it was just a small convenience for regions where everything is in title case. The user might as well just press shift then 🤷 |
If it's a matter of already having done the basic work but not yet completed the list of languages, then it's worth implementing what has been done so far. Applying title case to languages that shouldn't do that generates wrong data, and that should be prevented as much as possible. If the list of languages that don't apply title case is longer than the one for languages that do, then maybe sentence case should be the default? |
No, I am telling you, it is not worth implementing. That's what I wrote in my previous comment. Did you read it? #5105 100% solves this issue by not using titlecase automatically at all, something that cannot be achieved by either an inclusive or an exclusive list of languages (because languages are plenty). Which is fine, because it is not exactly a major inconvenience for the user to press shift, plus for names ("Avenue Lenin"), the keyboard app usually suggests to titlecase it anyway. |
Sorry for the misunderstanding, I don't even understand what a pull request is :) I agree that leaving the capitalisation to the keyboard app is the best option. |
It is a technical term for a suggested change that can be viewed, discussed and commented on on github. |
* do not automatically titlecase words for names (fixes #4784) * remove unnecessary import
Because of this line, whenever one is typing in a name for a street or a place, it always capitalizes the first letter of each word in the name. While this is great for many languages where uppercase letters are used at the start of each word in toponyms, it is a problem in other languages where that's not the case. For example, in the Georgian language, while capital letters are included in the Unicode standard, they are not used for title casing. This makes it really annoying to input street names, as one has to edit the words afterwards to remove the capitalization.
I'm not sure what the best solution for this would be. Perhaps add a list of "exception" languages which are not to be titlecased, or even a simple toggle to disable the feature.
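As a rough illustration of the "exception languages" idea, something like the following could work. Everything in it is hypothetical: the set `languagesWithoutTitleCase` only contains the two languages explicitly reported in this thread (Georgian and Bulgarian) and is certainly incomplete, and `autoCapitalizeName` is not an existing function in the app.

```kotlin
import java.util.Locale

// Assumption: illustrative subset only (Georgian, Bulgarian); a real list
// would need research per language and would be much longer.
val languagesWithoutTitleCase = setOf("ka", "bg")

// Hypothetical helper: title-cases every word unless the locale's language
// is on the exception list, in which case the input is returned unchanged.
fun autoCapitalizeName(name: String, locale: Locale): String {
    if (locale.language in languagesWithoutTitleCase) return name
    return name.split(" ").joinToString(" ") { word ->
        word.replaceFirstChar { it.titlecase(locale) }
    }
}
```

Here `autoCapitalizeName("стара планина", Locale.forLanguageTag("bg"))` leaves the text as typed, while `autoCapitalizeName("main street", Locale.ENGLISH)` gives `Main Street`. The obvious drawback is that the list has to be maintained by hand and will never be complete.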
How to Reproduce
Start solving a "Street Name" quest somewhere in Georgia and type in a street name in Georgian. The first letters are automatically capitalized, even though they shouldn't be.
Here's what it looks like:
23-01-31-22-32-25.mp4
Expected Behavior
No title-casing occurs.
Versions affected
v50.2 (latest from F-Droid)