-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal for formatting contents of <note>
field
#85
Comments
I am ok with this change. Will wait for others to chime in (@miguelsousa @typesupply), but if ok, would you be able to PR the changes (and tests)? |
I'm fine with this, but I have another idea that would retain the current behavior and enable the proposed behavior: The line length is governed by the I have no strong feeling about this so I have no objection to the proposal above to not pass any text to textwrap like we currently do. |
If we agree I am happy to write the patch. |
No objections from me. |
Alright, @cmyr, however you want to write the PR (do see what @typesupply proposes above) is cool. I think we have enough agreement on this to make the change. |
Okay, I'll get this ready shortly. I'm not totally sure about the change to allow specifying a custom wrap width; it feels like in practice that would only possibly be used for this case? it also means that there are two codepaths to maintain. So for now I'll start prepare a patch that does the dumbest thing possible (no line formatting, ever) and if there is a strong desire for this to be an option I can make that change during review. Thanks all! |
This changes the behaviour around normalizing the <note> field of .glif files. Previously the contents of this element would be broken into lines and reindented, but this was a potentially destructive operation. With this patch, any content of a <note> element is passed through unchanged. For more discussion, see: unified-font-object#85
This changes the behaviour around normalizing the <note> field of .glif files. Previously the contents of this element would be broken into lines and reindented, but this was a potentially destructive operation. With this patch, any content of a <note> element is passed through unchanged. For more discussion, see: unified-font-object#85
* Do not modify contents of <note> tag This changes the behaviour around normalizing the <note> field of .glif files. Previously the contents of this element would be broken into lines and reindented, but this was a potentially destructive operation. With this patch, any content of a <note> element is passed through unchanged. For more discussion, see: #85 * Address code review comments - make sure we're escaping XML correctly, and test that - remove dedent_tabs function - add a test for correctly encoding 'é'
I would like to acknowledge that this particular issue is likely of little significance, and I am not hugely invested in my position. Ultimately, to me, this is about something like "correctness".
Breaking this out from #84, now that I have had some time to dig into the code and have a more informed opinion.
current status
The current situation: currently,
ufonormalizer
does a bunch of opinionated formatting of the contents of<note>
. Essentially it breaks the text into 72 character lines, and then indents each of these lines to the level of the parent<note>
element, plus 1. This current text, so formatted, looks like:This has the advantage that it looks nice, but it has the disadvantage that it is destructive: the actual text of the note will change, and other tools don't have any way of telling, for a given UFO, whether or not the contents of the file have been normalized (and so should be "denormalized"?) or if they are the original intended data.
The problem with this approach is that it modifies the user's data. Although this may not matter often, it can certainly matter sometimes. A trivial example of where this would matter is if the note includes something with significant whitespace, like a bit of python code, or a boxart diagram, or a poem.
Proposal
Basically: if the note contains only whitespace, we discard it; this is the current behaviour. Otherwise we escape the string as usual, and just write a "simple element".
This does mean that the XML will be "uglier" in the case where there are long unbroken lines in the note, but there are a bunch of advantages:
In this world, the XML for our sample paragraph looks like (with the parent tags for context),
And if it contained some manual linebreaks, those lines would not respect the outer indent level, like
An alternative I had considered is stripping leading and trailing whitespace and then always putting the note body on the next line, with a new intent level, but ultimately I think this is unnecessary fanciness and extra logic for minimal concrete gain.
... okay, I'm done. If anyone has any thoughts, feel free to chime in :)
The text was updated successfully, but these errors were encountered: