Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VersionedProperty support in the JSPs #432

Closed
eggrobin opened this issue Apr 6, 2023 · 8 comments · Fixed by #659
Closed

VersionedProperty support in the JSPs #432

eggrobin opened this issue Apr 6, 2023 · 8 comments · Fixed by #659

Comments

@eggrobin
Copy link
Member

eggrobin commented Apr 6, 2023

The invariant test have a versioned property syntax, e.g., \p{U4.1:XID_Continue}, implemented in VersionedProperty.

static class VersionedProperty {
private String propertyName;
private String version;
private UnicodeProperty.Factory propSource;
private UnicodeProperty property;
private final transient PatternMatcher matcher = new UnicodeProperty.RegexMatcher();
public VersionedProperty set(String xPropertyName) {
xPropertyName = xPropertyName.trim();
if (xPropertyName.contains(":")) {
final String[] names = xPropertyName.split(":");
if (names.length != 2 || !names[0].startsWith("U")) {
throw new IllegalArgumentException("Too many ':' fields in " + xPropertyName);
}
if (names[0].equalsIgnoreCase("U-1")) {
version = LAST_VERSION;
} else {
version = names[0].substring(1);
}
xPropertyName = names[1];
} else {
version = LATEST_VERSION;
}
;
propertyName = xPropertyName;
propSource = getProperties(version);
property = propSource.getProperty(xPropertyName);
if (property == null) {
propSource = getIndexedProperties(version);
property = propSource.getProperty(xPropertyName);
if (property == null) {
throw new IllegalArgumentException(
"Can't create property from name: "
+ propertyName
+ " and version: "
+ version);
}
}
return this;
}

Something like that would be useful for list-unicodeset.jsp.
It sounds like there is limited support for multiple versions with the β syntax, but I cannot figure out how that one is implemented.

@macchiati
Copy link
Member

macchiati commented Apr 6, 2023 via email

@eggrobin
Copy link
Member Author

eggrobin commented Apr 6, 2023

For the beta, the idea is that you have multiple release versions plus 1 dev version.

So, matching the subfolders of ucd? That was what I had in mind.

Do you think we need multiple?

Multiple different βs? That sounds like it would confuse things rather than help.

@macchiati
Copy link
Member

macchiati commented Apr 6, 2023 via email

@eggrobin
Copy link
Member Author

eggrobin commented May 25, 2023

@macchiati It seems like unicodetools has (at least) two ways to look at versioned properties:

  1. ToolUnicodePropertySource (which is used to compute the derived properties to produce the new data files, as well as used by VersionedProperty in the invariants above, but has correctness issues for old versions, see The versioning of ToolUnicodePropertySource is neither tested nor correct #484);
  2. IndexUnicodeProperties (which is used by, e.g., TestProperties; I don’t know if it has similar issues to ToolUnicodePropertySource).

IndexUnicodeProperties seems more modern (compare the pre-2013 histories of IndexUnicodeProperties and ToolUnicodePropertySource), and since it does not have the derivations, it might have a better chance of being correct for older versions than ToolUnicodePropertySource. Should the invariants use it instead of ToolUnicodePropertySource?

@eggrobin
Copy link
Member Author

Should the invariants use it instead of ToolUnicodePropertySource?

Wait, they currently use both, with a preference for ToolUnicodePropertySource. So I guess the question is whether this should be inverted.

         propSource = getProperties(version);  // ToolUnicodePropertySource.
         property = propSource.getProperty(xPropertyName); 
         if (property == null) { 
             propSource = getIndexedProperties(version);   // IndexUnicodeProperties.
             property = propSource.getProperty(xPropertyName); 

@macchiati
Copy link
Member

macchiati commented May 25, 2023 via email

@eggrobin
Copy link
Member Author

eggrobin commented May 25, 2023

it should be inverted

Ah, that doesn’t work, because IndexUnicodeProperties, while closer to the truth for non-default values, it has issues with default values (I think it relies on @‌missing lines that are not always there).

At a glance, it is missing defaults at least for:

  1. CCC,
  2. the non-special values of the properties in SpecialCasing (those used to have @‌missing lines, which IndexUnicodeProperties.txt knows how to read, but they got removed in 7.0),
  3. the case foldings.

(The failings of IndexUnicodeProperties should probably turn into their own issue, but I am on a train with spotty internet, so I will be happy if this comment goes through.)

@macchiati
Copy link
Member

macchiati commented May 25, 2023 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants