Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] add localized unit names #974

Conversation

lenyturmel
Copy link

WORK IN PROGRESS (WIP) to manage the localization of unit names
You can find tags "WIP LTU" in the code to see the changes

  • Add singular and plural names for LenghtUnit.Meter in the JSON file
  • Add properties SingularName & PluralName in Localization to be set a JSON parsing and to be get later
  • In QuantityJsonFilesParser, build the SingularName and PluralName using the prefixes in the method GetLocalizationForPrefixUnit
  • add TU UnitLocalizedNameTests to test the localization of unit names

lenyturmel added 2 commits October 6, 2021 17:36
WORK IN PROGRESS  (WIP) to manage the localization of  unit names
You can find tags "WIP LTU" in the code to see the changes

- Add singular and plural names for LenghtUnit.Meter in the JSON file
- Add properties SingularName & PluralName  in Localization to be set a JSON parsing and to be get later
- In QuantityJsonFilesParser, build the SingularName and PluralName using the prefixes in the method GetLocalizationForPrefixUnit
- add TU UnitLocalizedNameTests to test the localization of unit names
FEAT:
- add agin the Singular/PluralName properties in QuantityJsonFilesParser (mandatory to have default names)
- Call to CodeGen for the generation of dynamic classes
-  Add classes UnitLocalizationCache and UnitValuetoStringLoopup to manage a cache of unit values and string representations
- UnitAbbreviationCache now use UnitLocalizationCache as an helper of cache (public API has not been changed)

FIX
- TU UnitParserTests.Parse_AmbiguousUnitsThrowsException : force parsing unit with "en-US" culture (the TU may fail if the current UI thread has another culture, like "fr-FR")

TU :
- all are ok
@lenyturmel
Copy link
Author

lenyturmel commented Oct 7, 2021

WORK IN PROGRESS
I finally undo the suppression of Singular/PluralName properties in QuantityJsonFilesParser
I think it is mandatory to have default values for Singular and plural name. The names of the localization should be optional.
And more, it prevents from breaking the backward compatibility of the public API.

I added classes UnitLocalizationCache and UnitValuetoStringLoopup to have a more generic approach about caching string representations mapped to unit values
UnitValuetoStringLoopup is used by UnitLocalizationCache

Any specific cache (UnitAbbreviationCache, UnitSingularNameCache, ...) can be composed of an instance of UnitLocalizationCache in order to manage its specific cache
So, UnitAbbreviationCache now use UnitLocalizationCache as an helper of cache, with no changes of the public API

FIX

  • TU UnitParserTests.Parse_AmbiguousUnitsThrowsException : force parsing unit with "en-US" culture (the TU may fail if the current UI thread has another culture, like "fr-FR")

[EDIT 10/08/2021]
We can now define and use localizes singular names
I write some simple tests and some JSON data for testing and demo purpose

@angularsen could you please take a look to this draft to see if I need to make some modifications before doing the same job for supporting the plural names of units?
thx

TODO Create caches UnitPluralNamesCache
TODO Add TUs for UnitPluralNamesCache
TODO I am not sure but it seems that a lof of methods MapXXX of the public API are not useful and should only bu used of internal purpose of the cache. As this maangement is now delegated to UnitLocalizationCache, maybbe it would be better to remove those methods MapXXX from UnitAbbreviationCache

- Add localization for long names of the prefix in PrefixInfo
- Add cache and cache generators for managing singular names
- Add generation of the cache of singular names
- Units with no localized singular name for a culture will use the default singular name of a unit (J
- Add russian singular names  for the length units for tests & samples

TU
- Add TU UnitSingularNamesCacheTests for testing the localization of singular names
- all TU passed with success

Generation of dynamic classes done with success
@lenyturmel lenyturmel closed this Oct 8, 2021
@lenyturmel lenyturmel reopened this Oct 8, 2021
@angularsen
Copy link
Owner

@lenyturmel I'll try to get to it soon.

Copy link
Owner

@angularsen angularsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments, I haven't gotten through it all yet.
Impressive amount of solid work here! 👏

UnitsNet/CustomCode/UnitSingularNamesCache.cs Outdated Show resolved Hide resolved
@@ -17,21 +17,12 @@ namespace UnitsNet
/// <summary>
/// Cache of the mapping between unit enum values and unit abbreviation strings for one or more cultures.
/// A static instance <see cref="Default"/> is used internally for ToString() and Parse() of quantities and units.
///
/// TODO It may be a good thing to remove all unesfull methods MapXXX from the public API. MapXXX should be only used for internal purpose of the cache <see cref="UnitAbbreviationsCache"/>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Public Map methods are exposed to support third party quantities and units, both for conversions and parsing. See HowMuchUnit as an example in tests.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

/// Create an instance of the cache
/// </summary>
/// <param name="stringName">Information about what the stored strings represents (abbreviation, singular name, ...). It could be useful for error messages or debug logs</param>
/// <param name="generatedLocalizedStrings">For each culture, for each unit type, for each vlaue, give a localized representation of the unit value</param>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo vlaue

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -0,0 +1,1730 @@
//------------------------------------------------------------------------------
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this is not obvious, but Windows Runtime Component is no longer maintained.
The idea is to let it receive new units from JSON, but we are not adding more features to it.

You already went through the effort here so we can keep the change if you like and as long as it compiles and runs, but I would generally advise against making changes to WRC because it has bitten us in the ass too many times with its restrictive type system that breaks our builds on innocent changes. I also don't have a runtime available locally to test it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure to understand how it works.
I made the modification to be sure there is no regression. You confirm I still have to run the CodeGen but I do not need to maintain this file?

Copy link
Owner

@angularsen angularsen Nov 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant is that you don't need to update files in CodeGen\Generators\UnitsNetWrcGen, UnitsNetWrcGenerator.cs etc. to receive the new localization feature.

This way, WRC still receives new updates when new units are added to JSON files and code is regenerated, but it does not receive new features like new methods and properties in generated code. WRC is also being completely removed in #982 anyway.

{
_output.WriteLine(UnitSingularNamesCache.Default.GetDefaultSingularName(LengthUnit.Meter, RussianCulture));
_output.WriteLine(UnitSingularNamesCache.Default.GetDefaultSingularName(LengthUnit.Decimeter, RussianCulture));
_output.WriteLine(UnitSingularNamesCache.Default.GetDefaultSingularName(LengthUnit.Kilometer, RussianCulture));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove these, they look like they exist for debugging purposes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, you are right. Sorry

}

[Fact]
public void GetDefaultASingularNameFallsBackToUsEnglishCulture()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo ASingularName

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you push your latest fixes? The typo is still here and also the debugging ones mentioned above. If you are not done yet that is totally fine, just checking.

UnitsNet.Tests/UnitSingularNamesCacheTests.cs Outdated Show resolved Hide resolved
/// </summary>
public UnitSingularNamesCache()
{
_cache = new UnitLocalizationCache("singular name", GeneratedLocalizations);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better if the empty ctor doesn't load all the generated localizations. This way it is possible to test and reuse without getting all the defaults along with it.

If I recall correctly, UnitAbbreviationsCache and UnitConverter both offer a static Default property with all the built-in mappings and converters from JSON. The ctor itself, however, creates an empty cache or empty unit converter that you can use for third-party units.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure to understand. For me, the behavior is similar to the previous one except that the call to the LoadGeneratedAbbreviations is done in UnitLocalizationCache now:

/// </summary>
public UnitSingularNamesCache()
{
	_lookupsForCulture = new Dictionary<IFormatProvider, UnitTypeToLookup>();
	LoadGeneratedAbbreviations();

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extract GeneratedLocalizations to an optional ctor parameter, and pass it when assigning Default in the static ctor:

Default = new UnitSingularNamesCache(GeneratedLocalizations);

This way, when calling new UnitSingularNamesCache() without a parameter, you get an empty cache that you can call MapUnitToSingularName and GetUnitSingularName without having all the default ones loaded. Useful for tests, useful for consumers of UnitsNet that wants to reuse the structure for third-party units or a subset of the built-in units.

Did that make sense?


namespace UnitsNet.Tests
{
[CollectionDefinition(nameof(UnitLocalizedNamesCacheFixture), DisableParallelization = true)]
Copy link
Owner

@angularsen angularsen Oct 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason we use CollectionDefininition from before is to disable parallelization of tests that manipulate static values like CurrentCulture, so maybe we don't need multiple of these.

In another project, we have been using

    public static class XunitTestCollections
    {
        /// <summary>
        ///     Disables concurrency for test classes annotated with <see cref="CollectionAttribute"/> and this name.
        ///     <br/><br/>
        ///     Use this collection on test classes with tests that are not thread-safe, such as manipulating static objects
        ///     like <see cref="CultureInfo.CurrentUICulture" /> and <see cref="Thread.CurrentThread" />.
        ///     Each test method is already run serially per test class.
        /// </summary>
        [CollectionDefinition(nameof(NotThreadSafe), DisableParallelization = true)]
        public class NotThreadSafe
        {
        }
    }

Maybe we can rename UnitAbbreviationsCacheFixture to NotThreadSafe instead and copy this xmldoc there?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CollectionDefininition itself is used to shared a context (data for example between a collection) of tests . The parallelization is just an additionnal option.
I think that:
1°) you should add a getter in UnitLocalizedNamesCacheFixture to get the Culture instance
2°) you should modify the constructors of all your TUs to inject the fixtue each time you need to get the culture

Links:
https://xunit.net/docs/shared-context
https://blog.somewhatabstract.com/tag/collectiondefinition/

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, could you give an example of how this could be used to test ToString() and Parse(), where no culture is given as parameter, but we want a consistent way to test specific cultures in the current thread? I'm very interested to learn if there is a better way to do this.

CodeGen/Generators/QuantityJsonFilesParser.cs Show resolved Hide resolved
leny.turmel added 2 commits November 16, 2021 16:32
- Clean TUs
- Remove not retained TODO
- Remove array of singular names (only use 1 singular name)
@lenyturmel
Copy link
Author

Hi : Sorry for the delay.

thank you for having reviewer my proposal.
According to your comments. I made some changes in the code and add some comments.

[PublicAPI]
public string[] GetUnitSingularNames<TUnitType>(TUnitType unit, IFormatProvider? formatProvider = null) where TUnitType : Enum =>
_cache.GetUnitStrings(unit, formatProvider);
public string GetUnitSingularNames<TUnitType>(TUnitType unit, IFormatProvider? formatProvider = null) where TUnitType : Enum =>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to GetUnitSingularName (singular).

Copy link
Owner

@angularsen angularsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this is ready for review yet or not, but I took another look at the recent changes.


namespace UnitsNet.Tests
{
[CollectionDefinition(nameof(UnitLocalizedNamesCacheFixture), DisableParallelization = true)]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, could you give an example of how this could be used to test ToString() and Parse(), where no culture is given as parameter, but we want a consistent way to test specific cultures in the current thread? I'm very interested to learn if there is a better way to do this.

}

[Fact]
public void GetDefaultASingularNameFallsBackToUsEnglishCulture()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you push your latest fixes? The typo is still here and also the debugging ones mentioned above. If you are not done yet that is totally fine, just checking.

/// </summary>
public UnitSingularNamesCache()
{
_cache = new UnitLocalizationCache("singular name", GeneratedLocalizations);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extract GeneratedLocalizations to an optional ctor parameter, and pass it when assigning Default in the static ctor:

Default = new UnitSingularNamesCache(GeneratedLocalizations);

This way, when calling new UnitSingularNamesCache() without a parameter, you get an empty cache that you can call MapUnitToSingularName and GetUnitSingularName without having all the default ones loaded. Useful for tests, useful for consumers of UnitsNet that wants to reuse the structure for third-party units or a subset of the built-in units.

Did that make sense?

@@ -52,6 +52,7 @@ public static void Generate(string rootDir, Quantity[] quantities)

Log.Information("");
GenerateUnitAbbreviationsCache(quantities, $"{outputDir}/UnitAbbreviationsCache.g.cs");
GenerateUnitSingularNamesCache(quantities, $"{outputDir}/UnitSingularNamesCache.g.cs");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned, I believe you can revert this entire file. Windows Runtime Component build target is no longer receiving new features.


// All units must have a unit abbreviation, so fallback to the default singular name if no localized name is defined in JSON
// var singularNames = string.IsNullOrEmpty(localization.SingularName) ? $"\"{unit.SingularName}\"" : $"\"{localization.SingularName}\"";
var singularNames = string.IsNullOrEmpty(localization.SingularName) ? "\"BOB\"" : $"\"{localization.SingularName}\"";
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BOB, is that a placeholder? Should we maybe use null or empty string instead?

@stale
Copy link

stale bot commented Mar 2, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants