Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added script for updating messages and basic usage help for other message scripts #1000

Open
wants to merge 3 commits into
base: clarin-dev
Choose a base branch
from

Conversation

cyplas
Copy link
Collaborator

@cyplas cyplas commented Nov 18, 2021

Main change: new script update-message-xml-translations.py, which may be handy on forks with other languages during upgrades.

@cyplas
Copy link
Collaborator Author

cyplas commented Nov 18, 2021

Aha, also made the existing scripts independent of dspace-l10n-check.py, which is still python2 (and perhaps not worth updating?).

@TomazErjavec
Copy link

TomazErjavec commented Nov 18, 2021

Well, if this goes to UFAL, then it would be nice to have them all in Python 3, methinks.

Just to clarify: the scripts are now in Python 3. That's part of the reason why I've made them independent of the pre-existing dspace-l10n-check.py, which is still in python2.

@kosarko kosarko self-assigned this Nov 26, 2021
@kosarko
Copy link
Member

kosarko commented Dec 2, 2021

Does this deprecate dspace-l10n-check.py, can it be removed?
Does https://github.com/ufal/clarin-dspace/wiki/Language-localisation need updating?

@cyplas
Copy link
Collaborator Author

cyplas commented Dec 3, 2021

Does this deprecate dspace-l10n-check.py, can it be removed?

I would say so. But that sort of assumes that users everywhere prefer my script. :)

Does https://github.com/ufal/clarin-dspace/wiki/Language-localisation need updating?

It doesn't make anything on that page obsolete, but it does offer an extra (and I hope preferable) script for the last section: https://github.com/ufal/clarin-dspace/wiki/Language-localisation#checking-for-message-key-changes-when-upgrading. If you like, I can edit that section myself, but I'd ask you to test out my new script first yourself.

@kosarko
Copy link
Member

kosarko commented Dec 6, 2021

I would say so. But that sort of assumes that users everywhere prefer my script. :)
@cyplas seems (https://github.com/ufal/clarin-dspace/blob/4e892cfe0b2d53525d671941a67afdcd7b6a48f8/utilities/project_helpers/scripts/dspace-l10n-check.py) that you are the only contributor to that file. Maybe it was grabbed from some other source; I'm not sure (maybe https://wiki.lyrasis.org/pages/viewpage.action?pageId=19006307). However, if it's not mentioned in the wiki and if non of the documented (in the wiki) scripts actually uses it, it should be save to remove.

Feel free to draft the wiki update here in the comments. I'll let you know if I got the desired output based on the draft. Does that sound ok?

@cyplas
Copy link
Collaborator Author

cyplas commented Dec 8, 2021

@cyplas seems (https://github.com/ufal/clarin-dspace/blob/4e892cfe0b2d53525d671941a67afdcd7b6a48f8/utilities/project_helpers/scripts/dspace-l10n-check.py) that you are the only contributor to that file. Maybe it was grabbed from some other source; I'm not sure (maybe https://wiki.lyrasis.org/pages/viewpage.action?pageId=19006307). However, if it's not mentioned in the wiki and if non of the documented (in the wiki) scripts actually uses it, it should be save to remove.

Right, I think you or Jozef pointed me to this script when I took this up. I added it and used it, but did not change it. So I agree and removed it.

Feel free to draft the wiki update here in the comments. I'll let you know if I got the desired output based on the draft. Does that sound ok?

Ok, I suggest this:

When upgrading your clarin-dspace repository to a new release, there may be new or changed message keys (usually XML, but potentially also JS) needing translation. You can check for missing translations with check-message-translations.py:

$ python utilities/project_helpers/scripts/check-message-translations.py de

For the XML messages, you can also automatically add new message keys based on the English ones with:

$ python utilities/project_helpers/scripts/update-message-xml-translations.py de

This will add any new keys from the English messages.xml files to the German messages_de.xml. The content will be copied from the English, and the <message> will be given a TODO attribute (TODO="translate"). Note that the script will also purge messages_de.xml of any XML comments and reorder the messages to match the order in the English files.

There is also a script which checks for missing and superfluous keys relative to what is referenced in the code (although this is not totally reliable since some keys are generated dynamically):

$ python utilities/project_helpers/scripts/check-message-usages.py de

Copy link
Member

@kosarko kosarko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cyplas can you have a look at my review comments?

Comment on lines +41 to +42
for element in message.xpath('descendant-or-self::*'):
element.tag = element.tag[element.tag.index('}')+1:]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do these lines do? Seems that you are looking for } in the tag name?

moreover

sources/clarin-dspace$ (gh-merge/1000)> python utilities/project_helpers/scripts/update-message-xml-translations.py sl

Constructing temporary joint xml /tmp/messages-en.xml from all English messages.xml:
  /mnt/c/Users/ko_ok/sources/clarin-dspace/dspace-xmlui/src/main/webapp/i18n/messages.xml
  /mnt/c/Users/ko_ok/sources/clarin-dspace/dspace-xmlui/src/main/resources/aspects/SwordClient/i18n/messages.xml
  /mnt/c/Users/ko_ok/sources/clarin-dspace/dspace-xmlui/src/main/resources/aspects/Discovery/i18n/messages.xml
  /mnt/c/Users/ko_ok/sources/clarin-dspace/dspace-xmlui/src/main/resources/aspects/XMLWorkflow/i18n/messages.xml
Traceback (most recent call last):
  File "utilities/project_helpers/scripts/update-message-xml-translations.py", line 42, in <module>
    element.tag = element.tag[element.tag.index('}')+1:]
ValueError: substring not found

But I must say python utilities/project_helpers/scripts/update-message-xml-translations.py sl finished on one of four runs, which is even more confusing

Copy link
Collaborator Author

@cyplas cyplas Dec 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is hacky, and, as your test runs indicate, not robust. Curly braces are used to wrap a namespace in tag names (e.g., "{http://apache.org/cocoon/i18n/2.1}message"). In constructing the temporary joint English messages.xml (which is in check_message_lib.py, already from before), I had to deal with the fact that for some reason, some of the English messages.xml files have a default namespace (xmlns attribute) and others don't. But it looks like the order in which I construct the joint xml isn't deterministic (oops, set instead of list!), and that the namespace in the joint xml is not guaranteed. If and when this joint xml doesn't have a default namespace, your ValueError must be triggered.

Ok, I can fix the joint xml construction, and maybe it's best to make sure it does not have a namespace (since messages_cs.xml and messages_sl.xml don't). And then these lines can and must be removed.


import argparse
import os
import lxml.etree as lxml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll be a bit nitpicky here, but do we need lxml?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, probably not. I'll try without.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants