[Proposal] Localizable String Interpolation #7529

kg · 2023-09-12T22:50:58Z

kg
Sep 12, 2023
Collaborator

Localizable String Interpolation

In C# 6 string interpolation using $"..." was introduced. It's the bee's knees (or for readers in en-US-CA, "hella sick"). Over time significant improvements have been made, like the introduction of DefaultInterpolatedStringHandler for high-performance interpolation, and .NET 8 introduced the CompositeFormat class, which approaches the problem from a different direction by allowing pre-parsing of old-style string.Format format strings. This means that users seeking high-performance non-localized logging, debug messages, etc. need look no further than $"...", and users of classic string.Format with its {0}s and {1:R}s can easily benefit from performance improvements without losing any existing localization functionality.

But what about localizing interpolated strings? And how can we localize them easily and efficiently, without lots of boxing, temporary allocations or redundant parsing?

Interpolated string handlers allow zero-allocation construction of complex format strings at runtime, while allowing the use of $"..." for clear and straightforward code. However, the literal components of the string are lowered directly into the compiled code, as is the order of those literals. This makes the resulting string near-impossible to localize without abandoning the use of $"...". If you use the FormattableString type instead, you can localize the string yourself by looking up its .Format in a string table and using .GetArguments or .GetArgument to fetch its arguments, but this involves boxing and multiple temporary allocations, so the performance is not great.
The new CompositeFormat API combined with existing Resource infrastructure would allow a disciplined development team to take each of their $"..." literals and hand-convert it into an old-style string.Format format string, then put those format strings into .resx files and look up the appropriate format at runtime, much like how exception messages are localized in the BCL today. The use of old-style format strings comes with many disadvantages that remain even with the aid of CompositeFormat, unfortunately - mandatory boxing for strings with more than a small number of insertions, more room for error due to positional instead of semantic expansions, and less implicit context for translators due to positional expansions.
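
To make the trade-off concrete, here is a rough sketch of both existing routes. The Templates dictionary, its key scheme, and the French text are made up for illustration; FormattableString, CompositeFormat.Parse, and the string.Format overloads that accept a CompositeFormat are existing .NET APIs (the latter two need .NET 8).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class ExistingApproaches {
  // Hypothetical string table: invariant positional format -> localized format.
  static readonly Dictionary<string, string> Templates = new() {
    ["Hello, {0}! You have {1} new messages."] = "Bonjour, {0} ! Vous avez {1} nouveaux messages.",
  };

  // Route 1: FormattableString. The compiler hands us a positional format
  // ("Hello, {0}! ...") plus an object[] of arguments, so value types get boxed.
  public static string Localize(FormattableString text, CultureInfo culture) {
    var format = Templates.TryGetValue(text.Format, out var localized) ? localized : text.Format;
    return string.Format(culture, format, text.GetArguments());
  }

  // Route 2: CompositeFormat. The localized format is parsed once up front,
  // but the call site has to be hand-converted to positional arguments.
  static readonly CompositeFormat Parsed =
    CompositeFormat.Parse(Templates["Hello, {0}! You have {1} new messages."]);

  public static string Greet(string name, int count, CultureInfo culture) =>
    string.Format(culture, Parsed, name, count);
}
```

In both cases the translator only ever sees {0} and {1}, which is the loss of context described above.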

An ideal solution would combine our existing tooling and features with some new smarts to enable localizing $"..." interpolated literals with minimal changes to existing code, which would enable every C# developer to easily add localization to their software in an incremental fashion without losing development velocity or introducing new bugs. If done right, this new solution could also enable new scenarios for high-performance string formatting, logging, and localization.

The examples I provide below are real-world examples from my main domain of expertise, game development, so they don't necessarily map 1:1 to localization scenarios in realms like web development or enterprise software, but hopefully they will be comprehensible.

What An Ideal Solution Would Offer

First, the mandatory:

- Developers can keep their existing $"..." string literals, more or less as-is, albeit with a little work to make them localizable.
  This might mean assigning the literal to a local of a specific type (using implicit conversion to tell Roslyn to get to work), or annotating it in some way. But an interpolated literal like $"hello {name}, you are our {customerCount++}th customer today!" should be localizable without too much work from the developer.
- Developers should be able to leverage existing APIs like DefaultInterpolatedStringHandler and CompositeFormat, not to mention existing technologies like .resx files, though ideally all of this would happen automatically. Having to introduce an entire new set of APIs and types to support this would be undesirable.

Then the nice-to-haves:

- A given $"..." literal would transform into a sort of closure at compile time, containing a localizable reference to the format along with each value needed to format it. This closure would be a struct, allowing its use and indefinite storage without a heap allocation. (Some use cases would involve unavoidable boxing of the closure, however.)
- The localizable formats would retain the text of the interpolation expressions in some form, i.e. {expr1} {expr2} instead of {0} {1}. This provides valuable context for a translator, makes it easier to identify a given string when skimming a string table or inspecting state in the debugger, and makes structurally-identical-but-semantically-different strings distinct from each other.
- A stored literal instance would be reusable and retargetable, allowing you to format it in multiple cultures dynamically using the values captured at creation time.
- Literals would be composable without the need for intermediate ToString calls, i.e.:

var damageText = Data.FinalDamage > 0 ? $"{Data.FinalDamage} {Data.Type}" : "no";
var logMessage = (Actors.Attacker == Actors.Target)
    ? $"{Actors.Target} lost {Data.FinalDamage} {Stat}"
    : $"{sourceText} dealt {damageText} damage to {targetText}";

In this case we would want to be able to capture the expression Data.FinalDamage > 0 ? $"{Data.FinalDamage} {Data.Type}" : "no" as a strongly typed literal instance, and then use it in the construction of logMessage without ever calling ToString on it.

- Basic branching scenarios like (cond ? $"a" : $"b") would work, by turning the ternary expression into an instance of a single closure type that selects a different format string based on the value of cond. This would be hard to do automatically, but is a very valuable tool to have. For the example above, this would allow damageText to always have the same type regardless of which arm is selected.
- Basic table lookup scenarios would work. For example, mapping enum values to localized strings with a switch would be possible with an instance of a single closure type that selects a different format string based on the enum value. This is basically a superset of localizing ternary expressions, i.e.:

var onTurnText = (onTurnIndex - currentTurnIndex) switch {
  0 => $"{something} on this turn",
  1 => $"{something} on next turn",
  _ => $"{something} on turn {onTurnIndex}"
};

The Vague Shape Of A Possible Solution

I've recently prototyped an implementation of some of these ideas, without the benefit of any changes to the C# language or the Roslyn compiler stack, to feel out the benefits and identify challenges. So far, in testing with a moderately complex video game, it has successfully reduced allocation counts and improved performance, while also making it easy to hot-swap between languages at run-time or load a modified string table after application start. Based on this experience, here's my rough proposal for what a good solution would look like:

We introduce a 'localized interpolation literal', with functionality and syntax equivalent to $"...". I don't know what the syntax for this would look like, so for the purposes of this proposal I will use the intentionally-not-feasible $🌎"...". Changing an existing interpolation literal to this new syntax would cause its type to change from string to an unnamed type conforming to certain protocols, much like the unnamed type generated by new { a = 1 }.
Each localized interpolation literal produces an unnamed struct type that contains the following:
- A unique string table key which can be used to find its format string (or CompositeFormat instance) in a localization table
- The value of each interpolation expression, stored in an appropriately-named and appropriately-typed field.
- An emitter method that is responsible for emitting a specified interpolation expression's value, i.e.

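// Sketch only: TOutput is assumed to be constrained to some handler-like type
// that exposes AppendFormatted; the exact constraint is left open here.
void EmitValue<TOutput> (ref TOutput output, string key) {
  if (key == "a")
    output.AppendFormatted(this.a);
  else
    throw new ArgumentOutOfRangeException(nameof(key));
}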

Each localized interpolation literal produces an entry in some sort of compiler-generated string table. In my prototype, this requires the developer to assign a unique name to identify each string literal by moving it to a separate 'string definitions' file. In an ideal implementation the compiler could generate a default name by hashing the source string, and we could allow assigning it a name via an attribute, a helper method, or some new syntax, i.e. for System.Text.LocalizedInterpolation.Localized("TheValueOfA", $🌎"a's value is {a}.") we would get a string table entry somewhere - perhaps somewhere resource-y - like:

<stringtable culture="unspecified">
  <!-- ... -->
  <string key="TheValueOfA">a's value is {a}.</string>
  <!-- ... -->
</stringtable>

At runtime a given string table can be loaded, with each string being parsed into a CompositeFormat-like representation I'll call an InterpolationFormat. This can probably actually be done via CompositeFormat with a little extra work. An InterpolationFormat can be used as the driver to convert a given localized interpolation literal into text, i.e.

using static System.Text.LocalizedInterpolation;
var output = new StringBuilder();
var literal = Localized("TheValueOfA", $🌎"a's value is {a}."); // unnamed type
InterpolationFormat fmt = Resources.TheValueOfA;
fmt.Format(output, in literal);

would under the hood use the data from the parsed InterpolationFormat to step through and either append string literals or call the literal's EmitValue method as appropriate.
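
As a rough, self-contained illustration of that stepping process, here is a minimal sketch. Everything in it is hypothetical: the InterpolationFormat class, the ILocalizableInterpolatedString interface, and the hand-written TheValueOfALiteral struct stand in for what a compiler or generator might produce, StringBuilder is used as the output only for simplicity, and the toy parser ignores escaped braces and format specifiers.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical contract implemented by every generated closure type.
public interface ILocalizableInterpolatedString {
  string StringTableKey { get; }
  void EmitValue(StringBuilder output, string key);
}

// Hypothetical pre-parsed template: a sequence of literal runs and named holes.
public sealed class InterpolationFormat {
  private readonly List<(string Text, bool IsHole)> segments = new();

  public static InterpolationFormat Parse(string template) {
    var result = new InterpolationFormat();
    int pos = 0;
    while (pos < template.Length) {
      int open = template.IndexOf('{', pos);
      if (open < 0) {
        // No more holes: the rest of the template is literal text.
        result.segments.Add((template.Substring(pos), false));
        break;
      }
      if (open > pos)
        result.segments.Add((template.Substring(pos, open - pos), false));
      int close = template.IndexOf('}', open + 1);
      if (close < 0)
        throw new FormatException("Unterminated '{' in template.");
      result.segments.Add((template.Substring(open + 1, close - open - 1), true));
      pos = close + 1;
    }
    return result;
  }

  // The struct constraint lets the runtime specialize this method per closure
  // type, so EmitValue is called without boxing the closure.
  public void Format<TLiteral>(StringBuilder output, in TLiteral literal)
    where TLiteral : struct, ILocalizableInterpolatedString {
    foreach (var (text, isHole) in segments) {
      if (isHole)
        literal.EmitValue(output, text);
      else
        output.Append(text);
    }
  }
}

// Hand-written stand-in for the unnamed type behind $🌎"a's value is {a}."
public readonly struct TheValueOfALiteral : ILocalizableInterpolatedString {
  public readonly int a;
  public TheValueOfALiteral(int a) => this.a = a;
  public string StringTableKey => "TheValueOfA";
  public void EmitValue(StringBuilder output, string key) {
    if (key == "a")
      output.Append(a);
    else
      throw new ArgumentOutOfRangeException(nameof(key));
  }
}
```

Because holes are looked up by name, a translated template is free to move {a} anywhere in the sentence and the same closure still formats correctly.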

Generated literal types provide a default ToString implementation and implement the relevant interfaces like IFormattable or whatever's appropriate, which will look up the right format string in the appropriate string table based on the thread's current/default culture. This allows taking one of these literals and passing it to methods that only accept String quite easily (it could even expose an implicit conversion operator to string, but probably shouldn't.)
I don't know how to support ternaries or switch expressions in this proposed model. Maybe whatever source generator or compiler machinery handles this would be able to recognize common scenarios, and if it fails to reduce the ternary-or-switch to a single closure type, you would get an error much like you do for ambiguous implicitly-typed ternary expressions today. In my prototype the implementation of the StringTableKey getter on the interpolated literal type is itself a switch expression that selects a different key depending on the value inside the closure. For the onTurnText example from above, we might have something like:

using static System.Text.LocalizedInterpolation;
var onTurnText = Localized("OnTurnText", (onTurnIndex - currentTurnIndex) switch {
  0 => $🌎"{something} on this turn",
  1 => $🌎"{something} on next turn",
  _ => $🌎"{something} on turn {onTurnIndex}"
});

with corresponding string table entries for each arm:

<stringtable culture="unspecified">
  <!-- ... -->
  <string key="OnTurnText_0">{something} on this turn</string>
  <string key="OnTurnText_1">{something} on next turn</string>
  <string key="OnTurnText_default">{something} on turn {onTurnIndex}</string>
  <!-- ... -->
</stringtable>

and a generated key selector like:

readonly struct __OnTurnText : ILocalizableInterpolatedString {
  public readonly int __arm;
  // ...
  
  public string StringTableKey => __arm switch {
    0 => "OnTurnText_0",
    1 => "OnTurnText_1",
    _ => "OnTurnText_default"
  };
  
  // ...
}

Note that in this case it's fine for a given arm to not use all the values available in the closure, and the closure type for a switch with multiple arms needs to make sure it captures every interpolation expression that might appear in the string table.

Ideally the string table generation process would preserve context to ease the translation process and reduce ambiguity. For example, my prototype generates string table entries like the following for a switch-expression:

    <!-- (canUse.Result) == CanUseActionResult.NoValidTargets -->
    <Literal Name="EncounterLog_CanUseReasonText_38a11c2c4c7deca7">No valid targets</Literal>
    <!-- (canUse.Result) == CanUseActionResult.InsufficientStamina -->
    <String Name="EncounterLog_CanUseReasonText_ba2ece7361a3bd09">{cta.Caster} has insufficient Stamina</String>
    <!-- (canUse.Result) == CanUseActionResult.InsufficientInitiative -->
    <String Name="EncounterLog_CanUseReasonText_56b502b1bc548589">{cta.Caster} has insufficient Initiative</String>
    <!-- (canUse.Result) == CanUseActionResult.CostNotMet -->
    <String Name="EncounterLog_CanUseReasonText_fc699e9a2499f138">{cta.Caster} has insufficient resources</String>

Q&A

This section attempts to answer common questions and clarify some points.

Is the primary concern performance? Or developer experience?
- I believe the primary concern here is developer experience - if we get this right, any C# application that already uses interpolated strings can be localized with minimal effort, which will increase the number of localized applications out there in the community. Lots of software goes unlocalized because it takes a lot of work and expertise from the developer to get to the point where there are string tables to hand to a translator.
- Performance is essential for being able to say "and this won't make your application worse" - any interpolated strings in performance-critical code will still be efficient, so the developer can safely localize all their strings. But some amount of slowdown compared to existing interpolated strings is probably unavoidable. We just want to get close enough.
Is the text duplicated?
- No, the 'template' text from the interpolated literals lives only in the string table generated by the build process. The template text no longer lives in the assembly next to other string literals, though the invariant table (and other tables) could be resources embedded into the assembly.
Do we need the string key like TheValueOfA in your examples?
- No, in many cases an auto-generated key (probably a hash of the template) will suffice, and that should be what a user gets when they first transition their application to localized interpolated literals. It's just important to be able to assign semantic names to strings for localization purposes.
Can we inline the EmitValue method?
- I don't know of any easy way to do this. However, if the generated closure for each literal is a struct and EmitValue is defined in an interface, the emitter can accept the closure type as a generic argument, and invocations of EmitValue will not be virtual or require boxing.
Should the EmitValue key be a string? Won't that be expensive?
- It probably won't be expensive, and I consider the key type an implementation detail. In my prototype it hasn't been a problem, since the keys are interned strings and Roslyn already knows how to optimize a switch over string literals.
What happens if I edit the text in my C# source files and don't edit the string tables?
- The invariant table will be re-generated - you shouldn't ever have a reason to edit it.
- The tables for other languages will be out of date, which will result in either missing strings or incorrect localized text. I consider this okay, since it is a problem you would run into with other approaches too. The community can come up with tooling to make this problem easier to avoid (like string table diffing, etc.)
Why are synthetic localization keys important? Why not just use a hash?
- For most cases a hash of the invariant text would suffice as a localization key, because it uniquely identifies the string at its use location and it means that if the same interpolated literal appears in multiple places, it only needs to be translated once. This makes a hash a fantastic default when no synthetic key is provided.
- However, in practice the same interpolated string needs to be localized differently depending on context and depending on the semantic meaning of the interpolation expressions. Three simple examples:
  - For the toy example of $"Hello, {firstName} {lastName}!", in some cultures you would reverse the order of the first and last name. If the template is $"Hello, {0} {1}!" that semantic meaning is lost, and it is harder to tell whether it is {firstName} {lastName} or perhaps {prefix} {lastName}, i.e. "Mr. Smith".
  - For the toy example of $"Hello, {prefix} {lastName}!", the words and grammar used when localizing depend on context. In some cultures, you use different words and different grammar depending on the status of both the speaker and the listener. This also makes it valuable to keep the semantic meaning. (This also provides a use case for ternaries/switch expressions as described above.)
  - For a string like $"{count} item(s) found", in some cultures the word 'items' may be translated into one of many different words depending on the nature of the item being counted. Given that, it is valuable to be able to assign a semantic name to this string, and switch expressions could also be useful.
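
For the hash-based default key discussed in this Q&A, one possible derivation is sketched below. The choice of SHA-256 over the UTF-8 bytes of the invariant template is an assumption made for illustration, and the DefaultKeys helper is hypothetical; the proposal does not pin down a hash function.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class DefaultKeys {
  // Assumption: SHA-256 of the template's UTF-8 bytes, rendered as lowercase hex.
  public static string FromTemplate(string template) =>
    Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(template))).ToLowerInvariant();
}
```

Identical templates map to the same key no matter where they appear, which gives the translate-once behavior, but it also means two uses that need different translations collide. That collision is the case where an explicit semantic name is needed.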

Proposal In Action

Let's examine a sample scenario and walk through the process of how a developer would localize it and maintain it. Our hypothetical end user Sarah Developer is building some client software that updates a local file, and has started writing an error handler. When an error occurs while saving changes, she selects an appropriate error message to show to the user in a MessageBox based on the cause of the error:

var errorMessage = error.code switch {
  DbSaveError.RecordLocked => $"The record '{recordName}' is locked and cannot be edited.",
  DbSaveError.RecordInUse => $"The record '{recordName}' is in use by {error.inUseByUserName}.",
  DbSaveError.InvalidColumnData => $"The value for column '{error.columnName}' is invalid.",
  _ => $"An error occurred while saving: '{error.message}'."
};
MessageBox.Show(errorMessage);

Everything is going well, and then Sarah gets a request to prepare the error handler for localization so that the company's users can see these error messages in their preferred language. Using this proposal, her first step is to turn each of the interpolated string literals into localizable ones, and then update the use of the message based on its new type:

var errorMessage = error.code switch {
  DbSaveError.RecordLocked => $🌎"The record '{recordName}' is locked and cannot be edited.",
  DbSaveError.RecordInUse => $🌎"The record '{recordName}' is in use by {error.inUseByUserName}.",
  DbSaveError.InvalidColumnData => $🌎"The value for column '{error.columnName}' is invalid.",
  _ => $🌎"An error occurred while saving: '{error.message}'."
};
MessageBox.Show(errorMessage.ToString());

At this point, the type of errorMessage has changed from string to what we'll call __interpolated_string_1, a struct. The switch expression now lowers to the creation of a struct-typed closure:

var errorMessage = new __interpolated_string_1 { 
  error_code = error.code, 
  recordName = recordName, 
  error_inUseByUserName = error.inUseByUserName, 
  error_columnName = error.columnName,
  error_message = error.message,
};
MessageBox.Show(errorMessage.ToString());

and the generated closure type contains a key selector that encodes the logic of the now-missing switch expression, along with an emitter method:

internal struct __interpolated_string_1 : ILocalizableInterpolatedString {
  // ... record fields elided ...
  readonly string ILocalizableInterpolatedString.StringTableKey => this.error_code switch {
    DbSaveError.RecordLocked => "654788b7aec8e0e90927bc88a3d4036ddd547f5bbfb0ff2c2444530fa420e72a",
    DbSaveError.RecordInUse => "40e41d4d7683111e3f0c663e9ce0e608d3dacb8ec739e9722bc67245abd4ed23",
    DbSaveError.InvalidColumnData => "1d184c13249805c575717196f581aafe0354d76bb86e7255ab2b9a09409232e8",
    _ => "1fc381273cd5ee4ac32fee9e249e5de19d8c3dc7c8b57ecf5468ae16991256c0"  
  };
  // ... implementation details elided ...
  readonly void ILocalizableInterpolatedString.EmitValue (StringBuilder output, string key) {
    // NOTE: output would not actually be a StringBuilder. The correct type is unclear
    switch (key) {
      case "error.code":
        output.Append(this.error_code);
        break;
      case "recordName":
        output.Append(this.recordName);
        break;
      // ... more cases elided ...
      default:
        throw new ArgumentOutOfRangeException(nameof(key));
    }
  }
}

Alongside all of this, some part of the build pipeline (either the compiler or an analyzer) has generated an invariant string table containing the literals that used to live with the rest of the assembly's string literals (this is probably an embedded resource), and it looks something like this:

<strings culture="invariant" there-would-be-more-attributes-here="but i omitted them">
  <!-- ... -->
  <template id="654788b7aec8e0e90927bc88a3d4036ddd547f5bbfb0ff2c2444530fa420e72a">
    <text>The record '{recordName}' is locked and cannot be edited.</text>
  </template>
  <template id="40e41d4d7683111e3f0c663e9ce0e608d3dacb8ec739e9722bc67245abd4ed23">
    <text>The record '{recordName}' is in use by {error.inUseByUserName}.</text>
  </template>
  <template id="1d184c13249805c575717196f581aafe0354d76bb86e7255ab2b9a09409232e8">
    <text>The value for column '{error.columnName}' is invalid.</text>
  </template>
  <template id="1fc381273cd5ee4ac32fee9e249e5de19d8c3dc7c8b57ecf5468ae16991256c0">
    <text>An error occurred while saving: '{error.message}'.</text>
  </template>
  <!-- ... -->
</strings>

At runtime, some part of the stack is responsible for loading this data - let's assume we're using .resx and ResourceManager. During this loading process the template strings could be validated in advance (instead of at time of first use), and we could pre-parse them using CompositeFormat for better performance - but those details aren't important right now. The lowered code above followed the creation of the closure with a call to ToString, in order to show the message in a MessageBox. An unoptimized ToString implementation might look something like this:

internal struct __interpolated_string_1 : ILocalizableInterpolatedString {
  // ... everything else elided ...
  public readonly override string ToString () {
    var key = this.StringTableKey;
    var template = __interpolated_strings_implementation_details.ResourceManager.GetString(key);
    var parsedTemplate = System.Text.LocalizedInterpolation.ParseTemplate(template);
    var builder = new StringBuilder();
    parsedTemplate.FormatInto<__interpolated_string_1>(in this, builder);
    return builder.ToString();
  }
}

At this point it's now possible to call ToString on our localized interpolation string closure and get a string out of it, and the temporary allocations seen in the sample code are all possible to optimize out. Let's move on to the important question: How do you localize it?

Our protagonist Sarah Developer has finished updating her code to use localized interpolation, and checked the compiler-generated .resx file into source control. (This duplication is unpleasant, but is already present if you look at how .resx localization works in a repository like dotnet/runtime. Critically, it is automatic.) She hands the .resx file off to the localization team to be localized.

The localization team immediately comes back and asks: "Why don't these strings have identifiers or descriptions?" A fantastic question. Sarah reads the relevant documentation and realizes that she can easily assign this interpolated string a name, and does so by updating the original code:

using static System.Text.LocalizedInterpolation;
var errorMessage = Localized("UserRecordSaveFailed", error.code switch {
  DbSaveError.RecordLocked => $🌎"The record '{recordName}' is locked and cannot be edited.",
  DbSaveError.RecordInUse => $🌎"The record '{recordName}' is in use by {error.inUseByUserName}.",
  DbSaveError.InvalidColumnData => $🌎"The value for column '{error.columnName}' is invalid.",
  _ => $🌎"An error occurred while saving: '{error.message}'."
});

This results in a new, clearer string table, something like the following:

<strings culture="invariant" there-would-be-more-attributes-here="but i omitted them">
  <!-- ... -->
  <template id="UserRecordSaveFailed_DbSaveError_RecordLocked">
    <text>The record '{recordName}' is locked and cannot be edited.</text>
  </template>
  <template id="UserRecordSaveFailed_DbSaveError_RecordInUse">
    <text>The record '{recordName}' is in use by {error.inUseByUserName}.</text>
  </template>
  <template id="UserRecordSaveFailed_DbSaveError_InvalidColumnData">
    <text>The value for column '{error.columnName}' is invalid.</text>
  </template>
  <template id="UserRecordSaveFailed_default">
    <text>An error occurred while saving: '{error.message}'.</text>
  </template>
  <!-- ... -->
</strings>

This string table can be handed off to the localization team who, consulting documentation, know that they can make a culture-specific version of it containing translated text. However, they ask Sarah for more detail on one of the messages. She obliges by updating her code once more, perhaps like this:

using static System.Text.LocalizedInterpolation;
var errorMessage = Localized("UserRecordSaveFailed", error.code switch {
  DbSaveError.RecordLocked => $🌎"The record '{recordName}' is locked and cannot be edited.",
  DbSaveError.RecordInUse => $🌎"The record '{recordName}' is in use by {error.inUseByUserName}.",
  DbSaveError.InvalidColumnData => $🌎"The value for column '{error.columnName}' is invalid.",
  _ => Localized(
    "UserRecordSaveFailed_UnknownError", 
    $🌎"An error occurred while saving: '{error.message}'.", 
    note: "This message is used for unknown errors and embeds the message produced by the database when the save operation failed."
  )
});

The Localized method has an optional parameter the developer can use to provide in-line commentary just like a code comment, and it flows through to the string table. Now the resulting invariant string table is truly ready for the localization team (I won't include it here again). Let's assume it works like existing resx string tables, so the invariant string is also there in the table the localizers create, so it looks something like this:

<strings culture="ja-JP" there-would-be-more-attributes-here="but i omitted them">
  <!-- ... -->
  <template id="UserRecordSaveFailed_DbSaveError_RecordLocked">
    <for>The record '{recordName}' is locked and cannot be edited.</for>
    <text state="translated">このレコード「{recordName}」は今ロックされているので、変更できません。</text>
  </template>
  <template id="UserRecordSaveFailed_DbSaveError_RecordInUse">
    <for>The record '{recordName}' is in use by {error.inUseByUserName}.</for>
    <text state="translated">このレコード「{recordName}」は、今「{error.inUseByUserName}」にアクセス中です。</text>
  </template>
  <template id="UserRecordSaveFailed_DbSaveError_InvalidColumnData">
    <for>The value for column '{error.columnName}' is invalid.</for>
    <text state="translated">この列内{error.columnName}の数値が不正です。</text>
  </template>
  <template id="UserRecordSaveFailed_UnknownError">
    <for>An error occurred while saving: '{error.message}'.</for>
    <note>This message is used for unknown errors and embeds the message produced by the database when the save operation failed.</note>
    <text state="translated">セーブ中にエラーが発生しました: 「{error.message}」。</text>
  </template>
  <!-- ... -->
</strings>

This new string table can get checked in to source control next to the auto-generated one, and at build time gets bundled up with all the application's other resources into a satellite assembly or embedded resource.

A week passes, and Sarah gets reports that the UnknownError message is occurring frequently for cases where a record was deleted while the user was editing it. She is asked to add a specialized error message for this scenario, so she updates the code to add a new switch arm. She also looks through her open issues and notices a request to revise one of the other error messages, and does so:

using static System.Text.LocalizedInterpolation;
var errorMessage = Localized("UserRecordSaveFailed", error.code switch {
  DbSaveError.RecordLocked => $🌎"The record '{recordName}' is locked and cannot be edited. Please contact your supervisor to unlock it.",
  DbSaveError.RecordInUse => $🌎"The record '{recordName}' is in use by {error.inUseByUserName}.",
  DbSaveError.RecordWasDeleted => $🌎"The record '{recordName}' has been deleted.",
  DbSaveError.InvalidColumnData => $🌎"The value for column '{error.columnName}' is invalid.",
  _ => Localized(
    "UserRecordSaveFailed_UnknownError", 
    $🌎"An error occurred while saving: '{error.message}'.", 
    note: "This message is used for unknown errors and embeds the message produced by the database when the save operation failed."
  )
});

This updates the invariant string table, and when she prepares to commit to revision control, the diff for the string table looks something like this:

  <template id="UserRecordSaveFailed_DbSaveError_RecordLocked">
-   <text>The record '{recordName}' is locked and cannot be edited.</text>
+   <text>The record '{recordName}' is locked and cannot be edited. Please contact your supervisor to unlock it.</text>
  </template>
  ...
  </template>
+ <template id="UserRecordSaveFailed_DbSaveError_RecordWasDeleted">
+   <text>The record '{recordName}' has been deleted.</text>
+ </template>
  <template id="UserRecordSaveFailed_DbSaveError_InvalidColumnData">

If Sarah's team uses automated localization tooling, it may automatically update all the other string tables based on this diff or file issue tickets. If not, she can glance at this diff and notify the localization team of the necessary changes.
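
A minimal sketch of what such tooling could check, assuming the hypothetical table layout used in the examples above (a strings root, template elements with an id attribute, text for the current wording, and for holding the invariant text a translation was made from). The StringTableDiff helper is illustrative, not an existing API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class StringTableDiff {
  // Returns the ids of templates that are missing from the localized table,
  // or whose <for> text no longer matches the invariant <text>.
  public static List<string> FindStale(XDocument invariant, XDocument localized) {
    var translatedFrom = localized.Root.Elements("template")
      .ToDictionary(t => (string)t.Attribute("id"), t => (string)t.Element("for"));

    var stale = new List<string>();
    foreach (var template in invariant.Root.Elements("template")) {
      var id = (string)template.Attribute("id");
      var text = (string)template.Element("text");
      if (!translatedFrom.TryGetValue(id, out var source) || source != text)
        stale.Add(id);
    }
    return stale;
  }
}
```

Run against the diff above, a check like this would flag the reworded RecordLocked entry and the new RecordWasDeleted entry, and leave the untouched entries alone.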

HaloFour · 2023-09-12T23:39:15Z

HaloFour
Sep 12, 2023

This seems like 99% tooling and 1% language request, which to me feels really awkward. Does that 1% need a language feature or can't it be built out of facilities already provided by the language, such as custom string interpolation handlers? I'd hate to admit it, but it also seems like interceptors paired with an analyzer and source generator could also accomplish this with existing syntax as well, including all of the tooling support around the resource files.

2 replies

kg Sep 12, 2023
Collaborator Author

I think almost all of it doesn't need language changes, but I wasn't able to come up with any workaround for the need to change how interpolated string literals work.

HaloFour Sep 13, 2023

I would expect that if Localized accepted a custom interpolated string handler as the second parameter that you'd already get very close to supporting what you want to do.

[InterpolatedStringHandler]
public ref struct LocalizedInterpolatedStringHandler {
    // members omitted
}

public static class LocalizedInterpolation {
    public static string Localized(string template, LocalizedInterpolatedStringHandler handler) { ... }
}

Then you could call it like this:

Localized("UserRecordSaveFailed", error.code switch {
  DbSaveError.RecordLocked => $"The record '{recordName}' is locked and cannot be edited. Please contact your supervisor to unlock it.",
  DbSaveError.RecordInUse => $"The record '{recordName}' is in use by {error.inUseByUserName}.",
  DbSaveError.RecordWasDeleted => $"The record '{recordName}' has been deleted.",
  DbSaveError.InvalidColumnData => $"The value for column '{error.columnName}' is invalid.",
  _ => $"default message"
});

The custom interpolator should be able to handle much of the rest. I'd be curious where that would hit a snag.

RikkiGibson · 2023-09-13T20:02:04Z

RikkiGibson
Sep 13, 2023
Maintainer

https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-8/#string-formatting feels relevant to this proposal.

0 replies

iam3yal · 2023-09-13T23:06:17Z

iam3yal
Sep 13, 2023

There are so many moving parts in this system, and it could be greatly simplified and done without language, compiler, or tooling changes, depending on what needs to be achieved here. Before discussing how and what needs to change, I really think it's better to have a general discussion about localization, where people share how they are solving it today in their products, how it's done on other platforms, and why existing solutions in .NET aren't sufficient. Once this topic has been researched and enough feedback gathered, we can talk about what should be done in the .NET ecosystem to make it better. IMO delving into details and possible implementation(s) now just makes it a futile discussion.

0 replies
