-
-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add an option to choose the preferred apostrophe #202
Conversation
@@ -6,6 +6,13 @@ | |||
from usdb_syncer import settings | |||
|
|||
|
|||
@dataclass(frozen=True) | |||
class MiscOptions: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately this cannot be added to TxtOptions
since they might not be present when "Download Song file" is unchecked. This option always has to be present since it's also used to sanitize the paths made up of artist & title.
I didn't have a good idea for a class name.
@@ -123,7 +123,7 @@ def new( | |||
) -> Context: | |||
txt_str = usdb_scraper.get_notes(details.song_id, logger) | |||
txt = SongTxt.parse(txt_str, logger) | |||
txt.sanitize() | |||
txt.sanitize(options.misc_options.apostrophe.value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This starts an ugly long chain of passing this value from sanitize
to fix
to fix_apostrophes_and_quotation_marks
to replace_false_apostrophes_and_quotation_marks
, but I didn't find a nicer place to put this without moving a lot of unrelated things around.
for apostrophe in settings.Apostrophe: | ||
self.comboBox_apostrophes.addItem(str(apostrophe), apostrophe) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has apparently been the final straw for pylint
to rate the _populate_comboboxes
function as "too complex".
I personally don't think this is a useful measure and should be deactivated. Alternatively, one could put the combobox names and setting enums into lists and have an outer for loop that iterates over these, but this isn't much prettier imo
Choose between typographers apostrophe (default) and upright apostrophe. Used when sanitizing file paths and song artist, title and lyrics
62954bf
to
13b4086
Compare
Sorry for not responding earlier. I truly appreciate your efforts, but I have to be honest here, as my heart not only beats for singing/karaoke, but also for typography. Referring to the article about apostrophes on the German Wikipedia, the upright apostrophe is a typographically wrong character and stems from the old 7-bit ASCII character sets where symbol count was very limited (or, for that matter, from even older analog typing devices, see below). The upright apostrophe is solely a substitute character when the correct apostrophe is not available. Since we are moving away from ASCII/CP1252 or any other codepage encoding towards UTF8, we should embrace the versatility and the typographic possibilities this enables: supporting songs of any language/script (without needing to resort to romanization), we can support artists like "A★Teens" or titles like Prince’s "Sign “☮” The Times". I suspect you dislike the typographer’s apostrophe because you have been used to the substitute character for so long. That reminds me of my father, who grew up using old typewriters that only had one key for O (vowel) and 0 (zero), so still today he sometimes types "hell0" instead of "hello", just because he has gotten so used to it. I agree that in the past, the typographer’s apostrophe sometimes lead to headaches, because it is not part of ASCII (but of CP1252, Pos. 92) and if the wrong encoding was used, it screwed up the text display ("’" instead of the simple typographer’s apostrophe - similarly German Umlauts were displayed as "ä" (ä), "ö" (ö), "ü" (ü) and so forth). But with the support of UTF8, this is or should be a thing of the past. The same goes for using the typographer’s apostrophe for file and folder names - all somewhat modern operating systems support UTF8 filenames, so unless you’re using Windows 95, the typographer’s apostrophe in file- and folder names should not lead to trouble, neither from the OS side nor from the karaoke software side. So to sum up, I truly believe we should aim for the correct way of doing it, and not continue using a crutch that dates back to 7-bit character encodings (1960s). Search queries, both in-game and on usdb, should handle the correct and the substitute apostrophe transparently, and at least for USDX, this was something you were already working on. Hopefully, I was able to make it a little relatable why these decisions were made and why I am not in favor of accepting this PR, even though I truly appreciate the efforts. Are there any pain points with the typographer’s apostrophe besides the way the apostrophe looks? Is anything not working that we should be aware of and that we haven’t taken into consideration? P.S. I’ve seen pretty much all the wrong apostrophe symbols mentioned in the above Wikipedia article being used as apostrophe in songs uploaded to usdb. This is what initially bothered me ("don`t" or "don´t" – using accent characters – are truly ugly) and what made me find the (typographically) correct solution. |
@bohning one important difference to your analogy (O vs 0) is that there is no a simple/universal key for the typographer's apostrophe. It might sound like a negligible pain point, but it's the reason why I still continue to use the upright apostrophe (as you can see here). The method for typing Would you also want to force the use of typographically correct symbols in programming languages? (genuine question) In my case you could probably just call it laziness or resistance to change, but in other cases it's a matter of accessibility. Also, I just tried typing the typographer's apostrophe in UltraStar Deluxe's song search on Linux and it doesn't work for me. How am I supposed to search for songs with an apostrophe then? (edit: it was pointed out to me that this was addressed by your PR here, but we still need normalization) But the point still stands: even today some software doesn't properly support it or doesn't do normalization which allows you to search using both versions of the apostrophe. Yes, you can blame that software, but that doesn't help the user. I definitely understand where you're coming from, but at the end of the day I think it should be up the the user to decide. So setting the typographically correct symbols as the default is reasonable, but I'd argue that it should still be a choice. |
Thank you for your detailed answer! @irgendwr has covered most of what I wanted to say already. While I understand your argument about correctness, and just being more used to the upright apostrophe etc., I believe this is not something that will ever change. The typographer's apostrophe is simply impractical to type, which is the main reason why comparatively few people are using it at the moment, and are still (and probably always will be) used to the upright apostrophe. To me, this is mostly about customizability. Even though the upright apostrophe might be wrong, it is the most natural option for, I believe, the majority of people. As such I think it is reasonable to support this as an option in the syncer. This is a choice anyone can make for themselves, or is there any way this might affect other users that I am overlooking right now? Regarding your question about pain points: I had issues with songs not being recognized by USDX in the past if they had typographer's apostrophes in the title/artist name, but the same was true for most non-ASCII characters... I solved this by fixing my broken locale settings (iirc) so might not have any broader relevance. Other than that and the WIP search normalization in USDX, no problems from my side. |
Thanks for your responses and the discussion, I really like it. I understand the raised issue of accessibility (typing the typographer’s apostrophe is a pain in the butt for most people, no question about it), BUT: I am not asking anyone to type it. Regarding file and folder names - unless I hear from someone otherwise, I expect all major operating systems to support UTF8 filenames. Thus this should not be an issue for anyone. If it is, then this only implies that no UTF8 characters should be used in file and folder names, but that they can and should be used inside the text files, just not for referencing the files in the header tags. The only time a user really has to type apostrophes is when typing search queries: The search in USDB should be adopted to behave the same way for those not using the Syncer. But tbh, there’s always a way to search for a song by omitting the word that contains an apostrophe, so at least, there is a workaround. The search in USDX (and all the other karaoke software) needs be adapted to behave the same way. This will automatically be solved when the transliteration of the search is in place, and this is something that is needed anyhow when UTF8 is supported (the vast majority of UTF8 characters cannot be typed on a regular keyboard!). Until this is in place, the same goes here for the workaround mentioned above. Ideally (IMHO), all song creation tools and lyrics websites would already correct upright to typographer’s apostrophes so that you never have to type it manually (spoiler: they don’t). But even if you upload a song to usdb with upright apostrophes, that’s fine, since the Syncer will fix the apostrophes (amongst other things) automatically when downloading. To summarize, the whole idea is to have the correct typographer’s apostrophes without having to worry a single bit about them. They just work, both on file/folder level and in all softwares. And when you need to type apostrophes, use whichever you can type easier (which is of course the upright apostrophe, at least for everyone else but me - I changed my keyboard layout so that SHIFT+# produces the ’ and not ', that’s why I needed PR791). |
Add an option to choose between typographers apostrophe (default) and upright apostrophe, to be used when sanitizing file paths and song artist, title and lyrics, as I requested on #20
Changing the value has no immediate affect, only when a file is re-downloaded, it changes the folder/file names and the content of the txt. This also works when the file hasn't been changed on USDB side.