- Shane Carr - Google i18n (SFC), Co-Moderator
- Ben Allen - Igalia (BAN)
- Eemeli Aro - Mozilla (EAO)
- Richard Gibson - OpenJS Foundation (RGN)
- Philip Chimento - Igalia (PFC)
- Chris de Almeida - IBM (CDA)
- Henri Sivonen - Mozilla (HJS)
- Elango Cheran - Google (ECH)
- Zibi Braniecki - Mozilla (ZB)
- Briefly: Bob Jung and Nebojša Ćirić - Google
- Discussion Board
- Status Wiki -- please update!
- Abbreviations
- MDN Tracking
- Meeting Calendar
- Matrix
BAN: Aside from the 3 normative PRs merged after TG1, this has been a relatively quiet couple of weeks. If anyone can review my Test262 PRs, it would be wonderful.
PFC: I get pings for these notifications. It is also useful when proposal champions can do them.
SFC: Activity on implementing DurationFormat in SpiderMonkey. Intl.LocaleInfo is fairly ready and could be shipped soon; we just need to make sure engines ship it.
EAO: TPAC happened a month ago. The most significant thing with respect to TPAC for us: I presented a talk about MessageFormat v2 to the WebExtensions Community Group while at TPAC, and got some support for adopting it once the WebExtensions spec is finalized.
SFC: There was a proposal to add an AI-based translation API, with questions about how to do language mapping; there's an issue open for it. Do we (for example) accept translations from ja-JP to en-RU, even if we don't have data on en-RU specifically? ZB has been passionate about the whole language-matching space and what to do with subtags.
SFC: I have been talking with [Yulia] about input elements – useful for currency or measure types – and about collaborating with W3C on the input element. I agree that this is a direction we want to investigate.
HJS: The input element is more WHATWG than W3C, so WHATWG would be the right venue.
SFC: WHATWG was present at TPAC; what is their relationship with W3C?
HJS: The work happens in WHATWG; the specs afterward get published through W3C, but the discussion is in WHATWG.
SFC: We need to be more active in this area – I've been pushing for some time, it's not new. It would also be helpful to bring in the User Preferences work and the world locale. There is much room in this area to engage more. I may take a more active role in 2025, and I continue to encourage BAN and EAO to liaise more with these groups.
SFC: Numbering system PR from FYT (#929), will discuss when he joins meeting.
https://github.com/tc39/ecma402/projects/2
SFC: Intl.Segmenter is unique in that the segmenter service is about languages, but it's also about scripts, and then also about character properties rather than locale data. If you ask the Segmenter class "what are the available locales", "what are the supported locales" – well, that's a particularly hard problem for Segmenter. If you ask the Segmenter "do you support fr?" and your browser doesn't support French, the answer is "no, I don't". But for Segmenter, even if you don't have French built-in, you may still support French segmentation. The question Jedel is asking here is: should we make Intl.Segmenter return everything, or how should we consider making it behave? Anba says we should do it as in all other components, returning the locales that ICU/ICU4X is built with.
HJS: I don't recall what Safari does off the top of my head, but my understanding is that Firefox is doing what Chrome does, and what Chrome does is return all locales that ICU4C knows about – which pretty much means the answer doesn't have anything to do with segmentation. For example, it knows that sv-AX exists, but it doesn't consider fi-SE as existing; it reflects what CLDR has datetime formatting rules for. The main question is what the appetite is for taking webcompat risk by changing that. Least risky is to keep what Firefox/Chrome do, and what Safari possibly does. I don't have a particular appetite for pursuing something that takes risks; I don't know whether FYT does. My preference is to just write down what happens.
SFC: What we currently do is just wrong. ICU claims we support Cantonese even though the segmenter doesn't actually handle it: if you send it Han-script text, it will segment it as Mandarin. Anba points out that if you throw Klingon at it, it won't segment. Returning everything isn't what we want. One question is whether this should be a normative requirement. Do we give more concrete requirements for what browsers do here? What's a reasonable set of requirements? FYT might have some opinions here. My hope is that he'll join us in a few minutes.
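A sketch of the behaviour under discussion; the exact output depends on the engine and its ICU build:

```js
// What engines do today: answer from the general ICU locale list,
// not from segmentation-specific data.
Intl.Segmenter.supportedLocalesOf(["fr", "yue", "tlh"]);
// e.g. ["fr", "yue"] – Klingon (tlh) comes back unsupported, while
// "yue" is claimed even though Cantonese-specific dictionary data may
// be missing and Han-script text is segmented as Mandarin.
```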
HJS: On a general level, I think that if it turns out the major implementations don't have appetite to change this, the spec should say what the reality is, so that subsequent implementations don't have to find it out by trial and error. The spec should document what's necessary to implement. If we believe there's enough webcompat risk that FYT doesn't want to poke at this, we should write down what currently happens. If FYT is okay with poking at this, let's see what happens when he pokes at it and reports back. As for the Cantonese thing, that's a special case with some unfortunate history, so we shouldn't make too big a deal about that particular case. By history I mean the way things have happened with 402 issue reports, and moving those to ICU.
SFC: Is there something I can do to bring up more context? That's not enough to say we've had a productive discussion about it. Maybe ZB has some opinions on this part: we've had this supportedLocales feature in 402 for a while, and the question of the actual use case for it keeps coming up. If we can figure out the use cases, we can get a better answer about the right behaviour for this component. Currently every component returns the same set of supported locales, which is wrong. Why? Should segmenters return a different set? Why or why not?
ZB: My recollection is that the original motivation for this feature was introspection by the formatter. The assumption was that not everything would return the same list. At the time – and here is where I have to be careful about phrasing – there were concerns about the fingerprintability of a solution that lets you easily gather bits of entropy by asking what each component supports, one by one, in case each platform/browser/version returned different lists. I don't believe we consulted with privacy experts. In my opinion that makes it a heavily engineered solution for a problem that may not exist.
SFC: I think the design – here's a list, give me a subset of the list – has some advantages, since filtering a list is easier than producing one. To answer the fingerprinting question: the resolution that is currently okay with all the engines is that the fingerprinting surface for supportedLocales must be identical for a given platform/browser/version.
ZB: That’s an implementation detail. There’s no reason that the web browser wouldn’t use data available on the OS.
SFC: They currently don't. I'm frustrated that privacy people haven't been in the room with internationalization people.
ZB: I'm also concerned that when I reached out for feedback, privacy experts said "don't do anything fingerprintable." "Don't do this" isn't a response to "how do we do it right?" I don't think we can figure it out; we never got any useful feedback. My interpretation in this case is that since it's an auxiliary function, the value proposition is that you just return empty or und, and say you support everything. This is because – in theory – the result of this function should be an input to some selector, from which you select a language. So just make a selector with your selected locales, pass us a locale, and we'll do our job. I think there should be a different list per component, because there's different data.
EAO: And we should build in an expectation that the list may be different. Another point: supportedLocalesOf normalizes the input. If you ask for ru-SU, you'll get ru-RU back rather than what you put in as the input. So the output strings are not a subset of the input.
ZB: I wonder if it should be.
SFC: It’s normalized though, it’s not dependent on the locales.
ZB: But should it? If someone asks whether ru-SU is supported, the answer should be "yes it is." It shouldn't be "we support ru-RU" – a slap on the hand!
SFC: supportedLocales returns a normalized subset, while resolvedOptions will –
ZB: I don’t think it should normalize.
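A sketch of the normalization EAO describes, assuming an engine that applies full UTS #35 alias canonicalization, as the major browsers do:

```js
// SU is a deprecated region code; canonicalization maps it to RU,
// so the returned string is not literally what was passed in.
Intl.getCanonicalLocales("ru-SU");               // ["ru-RU"]
Intl.NumberFormat.supportedLocalesOf(["ru-SU"]); // ["ru-RU"], not ["ru-SU"]
```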
HJS: The discussion of how this shape came about – the idea was that intersecting a list would be better for privacy than the browser producing a list in return. That's beside the point for the question of what the use cases are here. It would be useful to talk about the use case – ignoring the specifics of API shape and privacy – what's the use case for querying a segmenter or a formatter or collator for supported locales? It looks to me like whatever the use case was, it was designed around use cases people had in mind for formatters. For APIs that take natural language input, like Segmenter and Collator, it's not a given that the same API design makes sense, especially for Segmenter, the assumption being that it always segments pretty much everything. To the extent that this is about Segmenter, do we have concrete use cases for the Segmenter case? If the answer is "it's just for API uniformity with formatters, and we don't have use cases for this", then we shouldn't overdesign this, in the kind of silly "do the thing with the DateTimeFormat data, even though it makes no sense for Segmenter" way.
HJS: If I try to come up with a use case for this, I'd come back to Cantonese. In theory, if future versions of browsers had dictionary or other support for segmenting Cantonese – the sort of Cantonese that's closer to writing spoken Cantonese, rather than writing it so the written form looks like Mandarin – then there is potentially a use case: being able to ask "hey, do you support Cantonese? If not, we'll use this polyfill." However, since there are browsers that already claim to support Cantonese, we've already ruined that use case. At this point it's useless to design more for a use case that's already broken.
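The polyfill-fallback pattern HJS describes might look like this sketch; as he notes, it is already broken by engines that claim "yue" support today:

```js
if (Intl.Segmenter.supportedLocalesOf(["yue"]).length === 0) {
  // Engine doesn't claim Cantonese: load a segmentation polyfill here.
} else {
  // Engine claims Cantonese – but today this claim may be hollow.
  const seg = new Intl.Segmenter("yue", { granularity: "word" });
  console.log([...seg.segment("廣東話")].map((s) => s.segment));
}
```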
BAN: This doesn't address the use case question. But about fingerprinting, there seems to be an important difference between active and passive fingerprinting. Applications that query for supportedLocalesOf –
ZB: The most generic use case I can come up with is a scenario where an application wants to decide what locale to use for output, based on an intersection of the available locales of DateTimeFormat, NumberFormat, PluralRules, Segmenter, and its own available localization resources. This is not a weird case. The reason we don't have it is that ECMA-402 doesn't provide useful information. What we get is "we have resources for twenty languages, we hope that datetimes are formatted properly"; there's no way to exclude the scenario where we have localization resources but the browser doesn't have information for [?], so it skips that. I agree with HJS: if all locales are handled by Segmenter, or by PluralRules, then no language should be rejected for lack of support from those two. I'm comfortable with saying that it should always return the canonicalized input list.
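A sketch of the generic use case ZB outlines; appLocales stands in for the app's own localization resources, and the intersection only becomes meaningful if components return component-specific answers rather than one shared list:

```js
const appLocales = ["en", "fi", "ta"];
const components = [
  Intl.DateTimeFormat,
  Intl.NumberFormat,
  Intl.PluralRules,
  Intl.Segmenter,
];
// Intersect what every component claims to support with the app's resources.
const usable = components
  .map((C) => new Set(C.supportedLocalesOf(appLocales)))
  .reduce((acc, s) => new Set([...acc].filter((loc) => s.has(loc))));
```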
SFC: We can still discuss what the correct behavior should be. The decision about whether we keep the current behaviour because of web compatibility can be separate from figuring out what the use case is.
HJS: To go back to the history of the Cantonese issue, I think we shouldn't overdesign around ICU4C/ICU4X not having a Cantonese dictionary. The most useful direction for energy would be figuring out how to source a Cantonese dictionary for ICU4C/ICU4X. If one can be sourced, and it can be merged into the current Mandarin + Japanese dictionary without making current segmentation of Mandarin or Japanese worse, then let it flow into ICU4X and don't worry about the polyfill thing – we'll have solved the problem in the engine itself.
SFC: This is enough for me to bring it back to the CLDR working group. Engines can tell us how to answer these questions, since it's not something ECMAScript can really do. What we can do is tell them the use cases, and it's enough to have this discussion.
SFC: When we come back, we’ll get into the 2025 planning.
BAN: We have been meeting once a month, averaging ~15 attendees. Lots of progress on DurationFormat and LocaleInfo, both at stage 3 and approaching stage 4, thanks to FYT. Thanks to EAO, we are making progress on MessageFormat. We also have proposals moving at Stage 1, for which SFC has been working on alignment: smart units, and more recently the Measure and conversion proposals.
Nebojsa: Important to list out the impact of these proposals, too.
ZB: Is there a related proposal to introduce typed values?
BAN: That's Measure, I think?
ZB: In ISO 80000, there is the International System of Quantities, implemented as javax.measure.
EAO: Do we also want precision, or support for currency, or are those implemented separately from this? Right now it looks like ISO 80000, but it may end up being different.
SFC: Intl.MessageFormat is an important thing to proceed with.
Potential Topics:
- Intl.MessageFormat
- Smart units
- Measure
- Sequence unit formatting: #398
- RBNF? #494
- Intl data loading? #434
- DisplayContext? #355
- Title casing? #294
ZB: This is 262, not 402.
BAN: The proposal has a localeConvert method, but aside from that it is entirely 262. The original goal was to give people in 262 a way to do conversions, in order to reduce the temptation to abuse i18n APIs.
SFC: Messages need to know currencies; the currency is carried in the message itself. The other use case we care about is what we provide developers if we include locale-aware unit formatting: if we don't export unit conversions to 262, people will abuse the 402 tools. We can bring the 262/402 topic back up later.
Nebojsa: In terms of priorities, adoption of MessageFormat is a P0 because it drives adoption.
ZB: Because MessageFormat has good positioning in the industry, we have a number of ecosystems that are blocking on adding a localization system until MessageFormat is ready – for example, React.
EAO: One part of this, which I mentioned at UTW, is that we currently have a single message format, but there are questions going beyond that involving message bundles. If we want the Web platform to adopt this, what do we offer that user base?
Nebojsa: Once you are done in December, the C and Java implementations will be synchronized with the MFv2 spec, which will really help us implement it in V8. Hopefully your proposal and the ICU4C implementation are in sync.
EAO: The JavaScript implementation is at that stage today already. It is 7 kB in total code size (minified and compressed). Another update: WebExtensions are going to support MessageFormat – I spoke with the WebExtensions Community Group at TPAC. For Intl.MessageFormat, the biggest blockers right now are getting acceptance within TG1 – getting agreement that MF2 is the thing we'll be focusing on – and allowing the Intl.MessageFormat proposal to advance beyond stage 1.
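For context, a minimal sketch of what using the stage 1 Intl.MessageFormat proposal looks like (available today via EAO's polyfill); the constructor signature and method names follow the current proposal draft and may change:

```js
// MF2 source text with a variable placeholder; "en" is the locale.
const mf = new Intl.MessageFormat("Hello, {$name}!", "en");
console.log(mf.format({ name: "TG2" })); // "Hello, TG2!"
```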
SFC: A little background: we’ve gotten pushback on TG1 because it’s such a new proposal that there’s some hesitancy: if we pull it into ECMAScript and no one uses it, we’re adding a lot of bloat to ECMAScript for something no one uses, and so we should wait. We have a chicken-and-egg problem. We have a couple of strategies: a user study that we’re running – there’s a professor from UCSC running this, determining whether MF2 satisfies requirements. So that’s one piece of evidence we’re trying to bring back. The other evidence is EAO’s work.
SFC: Once this lands in the ECMAScript standard, that would be a very good sign for MessageFormat moving forward.
Bob Jung: Another chicken-and-egg problem – until the tooling is improved, we can’t implement it.
EAO: I've put time into making sure you can plug-and-play this into different parts of the stack: how messages go to your translators, how your translators work with it – making it possible so that you don't need to adopt MessageFormat v2 everywhere, you can [do it piecemeal?].
Bob: Yeah, frameworks won't implement it until it's there.
EAO: You don’t need adoption everywhere to have utility somewhere.
Nebojsa: For example, we can replace MF1 with MF2 in the translation pipeline without using it elsewhere.
EAO: Easier for transforms and messages to be represented in different ways
ZB: Is ICU MessageFormat stable?
Nebojsa: ICU 77 is releasing it as draft in March. The spec closes in December, there will be some work, and then we're marking MessageFormat 2.0 as draft in ICU, to get people using it and filing bugs. No one looked at the tech preview, so we had no feedback from that. Let's make a draft; if there are major issues we can do something with the spec, but otherwise this is the way of deploying it. Once we set it to stable, two cycles later we will put it in Android.
ECH: "Draft" in ICU is like beta, but shouldn’t change unless major problems.
ZB: So this is coming in March?
Nebojsa: Yes, it starts with ICU 77.
EAO: One final thing: it would be important to get explicit buy-in from TG2 once we finalize the spec from the MessageFormat WG. TG2, as experts in the field, can say "this looks like the right thing for ECMAScript".
Nebojsa: I think that will come from the CLDR-TC decision.
EAO: Yes, though I'm also asking this body specifically to pencil in time for this.
SFC: To briefly pull four issues out of the backlog, I wanted to survey if anyone here would say that this is something we want to prioritize.
EAO: The challenge is that the data size for inflection data is megabytes per locale. For it to be available anywhere on the web platform, we need data loading to work.
Nebojsa: Units also come into play. CLDR has inflected units; I'm not sure how to use them. It's megabytes of lexicon, not megabytes of supporting code. The engine has enough data for testing only.
EAO: Inflection being ready is years away, but we've been talking about data loading for long enough that if data loading isn't sorted, inflection can't be sorted.
Nebojsa: I never got traction with the Chromium/ChromeOS team. They say "that's great, but you also have to deal with fonts and other resources (audio resources, etc.) to get it to work correctly, so loading this CLDR data is necessary but not sufficient."
ZB: In effect we are paving the way for how ECMAScript adds a new DSL with MF2. I expect there's going to be pushback on a schema for data loading in Intl. It's going to be challenging to solve problems this way.
Nebojsa: I like the idea, but for ranking purposes let’s see how many people are asking for it.
SFC: In terms of RBNF and title casing: RBNF is the other body of functionality that keeps being brought up – ordinal/cardinal formatting. My stance: if there are clients at our companies who want it, we should consider implementing it. I'd like to highlight RBNF as one of the things to watch for, since it will probably be shipping. In some cases it's required for DateTimeFormat in certain locales – "nth day of month", for example.
Nebojsa: Did we talk with implementers about adding new data sets? If there are 300 functions you have to create over and over again, that is a problem.
ZB: From where I sit, the direction is that Intl can take up to 10 MB – you tell me what you want to do with it. Do you want 50 extra locales, or 5 locales with RBNF? That's my concern – it seems like a non-critical utility for accessibility reasons. I feel uncomfortable telling users that they can't use some locales because we have RBNF. I understand that not everyone has the same challenge; V8 can handle much more size than Hermes/React Native.
Nebojsa: DisplayContext?
ZB: In many languages you display things differently depending on whether they're part of a sentence.
ZB: I think the inflection engine solves it.
SFC: It’s a question of when you are interpolating a date or unit, the way it renders will change, and you need to specify that.
ZB: Want to have Nebojsa’s name in 402.
ZB: I'm representing the side that says we should not get RBNF. There are cases where we want it, but I don't know how to solve it. We'll have cases where engines refuse to provide data for RBNF, but provide a shell with no data in it.
SFC: I’m not convinced by the thesis that RBNF is a large amount of data.
ZB: Maybe I’m wrong here.
SFC: RBNF for any particular locale is one data file. For some languages – Chinese, for example – this might be quite large, but for English it won’t be much at all.
EAO: If we were able to solve data loading, this would free up a lot of space for consideration of something like RBNF – if whatever size it is isn't multiplied by 100 or 200 locales.
ZB: Maybe. I don’t know.
SFC: When I’m thinking RBNF, I’m mainly thinking of ordinal and not spellout. Cardinal is more data, because that’s spellout. I’m not thinking of spellout, I’m thinking digits-ordinal.
EAO: That's somewhere on the order of, if you really minimize it, about eight characters per locale. Maybe less. We already have PluralRules support for ordinal selection, which gives us those categories.
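The existing ordinal-selection support EAO mentions, for reference – the per-locale suffix data really can be tiny on this order:

```js
const pr = new Intl.PluralRules("en", { type: "ordinal" });
const suffix = { one: "st", two: "nd", few: "rd", other: "th" };
const ordinal = (n) => `${n}${suffix[pr.select(n)]}`;
ordinal(1);  // "1st"
ordinal(22); // "22nd"
ordinal(13); // "13th"
```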
ZB: It differs greatly. Catalan is large, English is not.
SFC: There's a lot for spellout, but ordinal rules fit on one screen. When I say "RBNF" I'm lumping things together, but RBNF doesn't mean all types.
EAO: It’s misleading to talk about RBNF if we mean to talk about ordinal. RBNF is the method by which we’d support ordinal, but that’s an implementation detail.
ZB: Ordinal is safe. I wouldn’t mind that.
EAO: Gender and other dependencies that English doesn't have are going to be interesting to solve there.
SFC: Yes, I was just looking at the Spanish data. Algorithmic numbering systems are a thing as well. This can be a large chunk of data, but it covers all the algorithmic numbering systems.
SFC: Are these in common use? Tamil vs tamildec?
ZB: Not used; likewise Chinese numbers are not used. Old-fashioned.
SFC: Roman numerals are another example. But Hebrew numerals are widely used.
ZB: I’m coming at this from a specific angle, we have an organization scrutinizing every byte used for Intl.
EAO: Same with Mozilla – how much data are we adding, and do users use it? Supporting formatting in Roman numerals doesn't seem important.
SFC: There are some locales that use these for dates. It's an i18n correctness matter. I don't want to address it right now, but we have some goalposts and milestones now.
EAO: Official long-form Finnish does use ordinal formatting. So we already support this for some locales that use it, but not universally.
SFC: I've seen that in DateFormat – just a dot in the position. My core point, though, is that the argument needs to be made for why this matters. It sounds like you can make a decent case, but it's not something we can get away with not making the case for.
ZB: We could, if we wanted, flip the litmus test: do we know of any high-budget web project that rolled out client-side RBNF because they needed it so much that they were willing to pay for the payload? If so, we can say we want to level the playing field. But "someone might like to use Roman numerals" isn't enough. We didn't include it in HTML lists.
RGN: Actually, there are. There's a broad list of different counter styles available in CSS.
SFC: If ordinal is already in CSS, that’s a really good point.
RGN: See css-counter-styles v3, look at section 6.
https://w3c.github.io/csswg-drafts/css-counter-styles
SFC: Are they doing real RBNF for Armenian? That’s one we were just looking at. They even support Tamil numbering in CSS!
ZB: Can we coalesce this data between CSS and Intl?
SFC: They’re probably using ICU. This definitely addresses ZB’s concern – if it’s already in CSS, we don’t need to add it to the browser.
SFC: Let's revisit the list. Title casing is something people ask for, because English loves it but other languages not so much. I have some ideas based on the ICU4X implementation, which is much cleaner than the ICU4C one. Unicode gives us the data to title-case a segment without title-casing individual parts of it. [missed key words]. I'm probably not going to drive a proposal forward, but I would support it if someone else brought it forward. It's also something that would not add much, if any, data, since I believe CSS has title casing.
SFC: I did want to talk about Intl data loading. There’s another issue associated with it, 212. We’ve had this for a very long time. How do we load locales in browsers? There are three ways we could go about this:
- The status quo: if you want to add locales, don't use Intl; use a client-side solution, going back to the old pre-Intl days. No privacy problems, but not the best experience.
- An API in the browser, pleaseLoadLocaleX: an async function; once we have the locale, we use it. Best client experience, but gets much pushback from security people. Achieves the goal of a multi-lingual web, but has serious privacy issues.
- The other possibility, which comes from issue 212 mentioned above: can we have a data-loading API? RBNF is a good example: take an RBNF rule file, [ ], add it and do something with it. A standard language-pack format (English language pack, French, Klingon, Tamil). Plug it into the browser and say "here's the language pack". It could be cached in browser local storage or something like that. This basically pushes the privacy problems onto the client side: browsers don't have to worry about fingerprinting based on a locale being available or not. It pushes the problem elsewhere. Problem: how do you implement it? That problem is maybe moving toward having a solution, since ICU4X has dynamic data loading; if we move in this direction, ICU4X enables supporting it. Other question: what's the format of this data? We'd be adding yet another DSL, or at least a schema. We've already had pushback on the DSL in MessageFormat, and this is a much bigger one to worry about. But it does have advantages.
I think this is an important problem that’s core to our mission of establishing a multi-lingual web. There are big problems, but they are big problems we should consider tackling. The impact of this is very big – expanding the locales available.
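A purely hypothetical sketch of the third option – Intl.addLocaleData, the pack URL, and the pack format are all invented here for illustration; nothing like this is currently specified:

```js
// Hypothetical: the page supplies a standard-format language pack,
// pushing the locale-availability privacy question to the application.
const res = await fetch("/locale-packs/ta.langpack"); // invented URL and format
await Intl.addLocaleData(await res.arrayBuffer());    // invented API
console.log(new Intl.DateTimeFormat("ta").format(new Date()));
```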
HJS: Earlier you mentioned that you assume browsers are using ICU4C for Roman numerals in CSS. This is not the case for Gecko; instead, there is a numbering style system in the browser. The assumption was that since this is already in CSS, it wouldn't be extra binary size to also have it in ICU4X.
HJS: Gecko's implementation of the counter styles doesn't appear to go through ICU4C or ICU4X, so also doing Roman numerals in datetime formatting would be more code.
https://searchfox.org/mozilla-central/source/layout/style/res/counterstyles.css
SFC: Responding to that: even if there’s some other library that they’ve fine-tuned for this purpose, that same library could potentially be called by Intl.
ZB: Not a library, a CSS file.
SFC: Could this CSS file be turned into a library?
ZB: Maybe?
HJS: It seems like a special-purpose library for this; refactoring might make it usable without duplication. It does not seem to be the case in Gecko that there would be any sharing with ICU4C/ICU4X. Let's not make that assumption, since it doesn't currently hold.
EAO: Regarding data loading: one approach that I don't recall having been discussed is – well, so far we've been addressing this as a general-case question. We have the problem of wanting to use some specific functionality in a specific locale that requires a bunch of data; multiplied across all locales that's a lot, so we want to load the data on request for our specific locale. But another way to approach this: from the browser point of view, we're already broadcasting a subset of locales in the Accept-Language header. It might be conceivable for a browser to provide deeper functionality specifically for whatever it's currently broadcasting in Accept-Language – functionality it doesn't provide for other locales. For example, an inflection engine is going to need megabytes per locale; it might only be available for the locales the browser is already publicizing. When we end up rendering content, the availability of additional functionality for those locales would not increase fingerprintability. You don't even need a UI if you don't provide the functionality for other locales.
HJS: That makes sense when there is large data, like the inflection data. There's a danger of making things worse for things that have smaller data. My go-to example: the collation data for Cherokee is just script reordering – a little bit of data to put Cherokee first. It uses the root collation, just with Cherokee first. Chrome removed collation for Cherokee even though it's tiny. It's quite plausible that users who want Cherokee functionality are running en-US in general. Let's not throw away data that's tiny, that's already in CLDR, but that doesn't make it through some threshold of what look like the most popular locales. It makes sense to pursue this for large data; it could make things worse if applied to everything that's currently in there.
EAO: My response: if we go the route of making it more likely for some locale-specific data to not always be available, gated via Accept-Language, it becomes easier to consider dynamic data-loading APIs at the cost of a network request. In this particular case, we have a user who would like Cherokee collation but isn't broadcasting that they want Cherokee content. The cost is a small network request that needs to be made in order to get the functionality. And if the data size is small, there is little reason for browsers currently providing this data to roll it back.
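A sketch of the pattern being discussed, assuming supportedLocalesOf reflected actual data availability; the fallback endpoint and makePolyfillCollator are hypothetical:

```js
let compare;
if (Intl.Collator.supportedLocalesOf(["chr"]).length > 0) {
  // Built-in data is present (e.g. because "chr" is in Accept-Language).
  compare = new Intl.Collator("chr").compare;
} else {
  // Otherwise pay one small network request for the tailoring data.
  const res = await fetch("/intl-data/chr-collation.json"); // hypothetical
  compare = makePolyfillCollator(await res.json());         // hypothetical
}
```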
SFC: Another thing I like about this direction is that it integrates with the framework the Web platform has already laid out (Accept-Language). I still hear people say that it doesn't cover websites that localize based on geolocation or user account: if you don't have Tamil in your Accept-Language list, you'd need to add it there even if it's in your user account.
EAO: I would count it as a decent direction for us to align the web platform to actually use one source of truth about the locales of a user, rather than trusting "as a service" that we know better.
SFC: A good place to start – BAN rewrites pull request for the eighth time.
BAN: I think Accept-Language works well, despite attempts to think of fancier solutions that don't work as well.
EAO: There's a current proposal from the Chromium team to reduce Accept-Language, with a negotiation to get more. Safari only broadcasts one language in the Accept-Language header. Behaviour here is in flux – what Chrome and Firefox currently do is not the only way it will be done in the future.
HJS: Safari, I believe, can broadcast more than one; in what cases, it's based on – Apple language experts have determined what the allowed combinations are. It's not in the OSS WebKit code.
SFC: FYT has a document that he made a little while ago. Basically: if browsers want to restrict this, it doesn't change what we've just said. If a browser allows a user to select a locale beyond the top 50 for their UI, then this is a way to integrate it. This is at least a path. What I'd like, and a direction we can generally leverage, is for engines to be able to make this decision based on what is best for their users. Firefox/Safari/Chrome might make different decisions for their user bases. I'd like to rewrite the invariants in a way that allows that to happen.
EAO: And other implementations of JS, not just browsers. Which may have different restrictions.
BAN: All of the 7 versions [of the pull request] stated that this is only a concern for user agents that are browsers.
SFC: EAO, ZB, and BAN will discuss Measure/Smart Units/MessageFormat in subsequent meetings today. Wanted to know if others want to discuss issues from the priority issues list.
PFC: With my test262 hat on, I'd like to talk about guidelines for locale-sensitive testing in test262.
Slides https://ptomato.name/talks/tc39tg2-2024-10
PFC: I can introduce it briefly. We want ILD (implementation- and locale-dependent) behavior to be stable, so that sites don't break, but also updated, to follow changing cultural practices and our better understanding of them. These things are both good, but in direct tension. This feeds into test262: we want test coverage of these APIs, but we don't want the tests to break when we update ILD behavior. But we also want to be able to test things – it's not useful to say "well, the output could be anything."
PFC: So I have some questions to consider. Should it be a goal to cover every locale and every option? I don't think so. Extant tests rely on "golden output" and mini-implementations. Golden output: directly compare ILD output to known-good output. Simplest, but it often breaks, because even two different implementations don't have the same human interface guidelines, and their output differs at any given time. Mini-implementation: including essentially a polyfill in the test, implementing the spec as faithfully as possible. Problems: it's difficult to understand what's being tested, and it's unclear whether the polyfill or the implementation is the one with the bug.
PFC: Ideas on what to do instead: Stable substrings. More robust than golden output, but still sharing its disadvantages. We don't compare to the full output, but instead make sure that required words appear (i.e. if it's the 9th month displayed long, it should contain "October").
Comparative testing: each setting must produce a distinct output. The assumption doesn't hold in all cases, but it could be good for getting a lot of coverage.
Metamorphic testing: RGN noted this. You find invariant properties of outputs that must hold across multiple inputs. Example from the slides: get the day string and month name for a date, and assert that the full date string contains both. It can be difficult to guess what these metamorphic properties are – and you might get it wrong. I wanted to hear the thoughts of this group, and what guidelines might be useful.
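Sketches of the three strategies against Intl.DateTimeFormat (illustrative only, not actual test262 harness code):

```js
const assert = (cond) => { if (!cond) throw new Error("assertion failed"); };
const d = new Date(2024, 9, 15); // a date in October

// Stable substring: require a known word rather than the full golden string.
const full = new Intl.DateTimeFormat("en", { dateStyle: "full" }).format(d);
assert(full.includes("October"));

// Comparative: distinct settings should produce distinct output
// (an assumption that does not hold in every locale).
const short = new Intl.DateTimeFormat("en", { dateStyle: "short" }).format(d);
assert(full !== short);

// Metamorphic: the parts should appear in the whole – the invariant
// that inflection can break (see the Finnish example below).
const month = new Intl.DateTimeFormat("en", { month: "long" }).format(d);
assert(full.includes(month));
```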
SFC: I'm glad you've brought this forward, because it's really important. We've had an ad hoc approach in the past, but it's good to get these all written down. One topic I want to raise: ECH and I have been working separately on adding better test data to CLDR – i.e., "here's test data such that if you use CLDR v46 with DateTimeFormatting and specify these CLDR-conformant settings, here's the output string you should get." ECH has been doing a lot of work to add this to CLDR. This would resolve a lot of these problems, with some caveats. For example: if you call Intl.DateTimeFormat with these options, you should get the following string from CLDR. A problem, though, is that ECMAScript doesn't require CLDR, even though all browsers use it. This may be more useful than these weird invariants. You'd have to map each user agent to its version of CLDR, but if you do have this mapping, you should be able to use it.
EAO: One alternative for solving this issue: add a locale zxx as a way to indicate there is no locale data – a locale for which we'd include in 402 an explicit description of how it's formatted, in a way that does not depend on CLDR. It might then be possible to define test cases for behaviour that would be invariant across time and across CLDR/ICU changes, and to test behaviour within that locale more than is currently possible.
SFC: Another heuristic-based approach that I can throw into the mix: say we have a list of locales; you call supportedLocalesOf, and you make no assertions about supportedLocalesOf itself, but you loop through the locales and verify that the output strings differ from each other (?). The idea is that if a browser claims to support a distinct locale, it should not just return data from another one.
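A sketch of that heuristic; the pairwise-distinctness assumption won't hold universally, which is the open question:

```js
const d = new Date(2024, 9, 15);
const locales = ["en", "fr", "fi", "ta"];
const outputs = locales.map((loc) =>
  new Intl.DateTimeFormat(loc, { dateStyle: "full" }).format(d)
);
// If each claimed locale has its own data, outputs should be pairwise distinct.
console.log(new Set(outputs).size === outputs.length); // expect true
```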
EAO: On the metamorphic testing: the assumptions you presented don't hold for all locales. For example, in Finnish the current month, October, is [finnish word]. If you format the current date in Finnish using the "full" dateStyle, it needs to display [Finnish word]-da. This leaves the stem unchanged, but I'd be surprised if there aren't some locales where the full version of the date changes the stem, which would make it fail the metamorphic test you describe. It'll work in English.
PFC: Yes, that's exactly what we're trying to avoid.
ZB: Why would we care which locales a browser engine supports, for test262? For test262, what you care about is that the thing tested returns a string.
PFC: Yes, we could just say "the implementation returns an ILD string" – "does it return a String? Then our work is done!" But we also want to make it as useful as possible for browser implementers to share as many tests as possible across implementations, in order to get the most coverage possible, so that we don't have the situation where everyone writes their own tests and one implementation's tests cover corners that others don't. But the more useful we could make test262 tests for implementations without requiring anything that –
ZB: The case you're making is that, for some reason, browser vendors would prefer to test for the presence of "October" rather than just verifying that the DateTimeFormat returns a string.
PFC: I don't want to speak for browser implementers. I'm just speaking for test262 maintainers; that's what we prefer.
ZB: Useful for whom? I’m just questioning whether this is actually needed. I think that the expectation of browser vendors is that their API returns what the specification asks it to return, which is a String, and that’s it. I’d be surprised if – I wouldn’t expect test262 to verify that in German I am returning the German name of the month.
EAO: I'd like to push back on that a little bit. That follows from the approach we have in Firefox for testing localization in Fluent. In Firefox, for most of the places where we do localization with Fluent for the user interface, we don't test what the contents directly are, to avoid a dependency on the local context. Having the ability to test that setting this option with this value has this effect on the output does have value for the Intl formatter libraries: it validates that a new option, or a new value of an option, is working as expected, rather than testing it manually and then having an automated test that checks for something in that output.
SFC: I'll add on to that. I think it's quite useful to exercise these different code paths and verify that the code paths are doing the reasonable thing.
The question is: how do you test that they're doing the reasonable thing? Testing that they return a String does not do that. It's spec-adherent to return empty strings for all ILD behaviour, but that's not useful to the users of the Web. Users care that if they specify a locale with these options, they get the thing they want to see, not just an empty String. Test262 could take steps to make sure these APIs conform with the non-normative text.
ZB: Mental exercise: we want to write a test for a function that returns the price of a stock. How do you test this without knowing the price of the stock?
EAO: By defining a stock with a set price.
ZB: Then how do you verify that it does the proper thing for different options?
EAO: Define a table
ZB: I don't think we should treat this very differently. The examples in the slide deck are making assumptions that aren't upheld; we shouldn't encode those in test262. If we cannot do this without data, we have to have data. For that reason, the entity that has the data it tests against is a Web browser; test262 doesn't have this data. Either it has to be augmented, or it has to test against a set of data from CLDR.
PFC: The illustrations on the slides – I'm not particularly trying to push those particular ones. Is there any meta-information you can assume to be true that would help write tests for this kind of API? For the metamorphic test of a function that gives the price of a stock, it might be: "get the price for buying one share, get the price for two shares; the result should be twice the cost of buying one share." These are things you can do if you treat the function as a black box but still know certain assumable properties about what it's supposed to do. The answer might be that we can't figure out any such properties for the ILD strings. In that case, we do nothing – we say "these tests are not appropriate for test262".
ZB: There will be cases like this, like segmentation or collation. You have to assume that it's sorted somehow.
PFC: We're not looking for a comprehensive approach to test everything; we're looking for a way to test some things.
HJS: I think we shouldn't pretend that this is data-independent, since the implementations that implement the spec use CLDR data. So it's quite reasonable to think there's an expected output based on the current version of CLDR. The main problem is not letting the test suite freeze CLDR because the suite has assumptions based on one version of CLDR; there should be an assumption that if the data changes, the tests change as well. I know what the spec says in theory: it's ILD. But in practice, web developers expect that if something happens in one browser, then other browsers do the same thing. For the test suite to be useful for testing interoperability of implementations – to the level that implementations can be confident they're not getting webcompat problems from behaviour divergent from other implementations – we shouldn't be looking at the theory of this as ILD, but at the practice of "all these browsers use CLDR data." Maybe there are cases where ICU4C and ICU4X differ even with the same version of CLDR, and we'd have to make adjustments for that. We could say "CLDR collation for German does something; test that the collation when you use German phonebk does the things that are notable about phonebk."
SFC: The spec says this is ILD. But, as I've said before, many times, I do think it's the job of the specification to encode invariants that developers can expect when using these APIs. It does this in Temporal, and it's part of the eras and month codes proposal. For example, there could be an invariant – this is possible with semantic skeletons – that when you specify year/month vs year/month/day, the latter string is longer than the former. This invariant could be in the spec as a normative requirement. In terms of testing against CLDR, I still think it's useful for browsers to be able to test this way, even if test262 made it optional: test262 could list these as optional tests for browsers that implement CLDR. We could leverage ECH's work adding this data to CLDR.
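The invariant SFC describes, as a sketch (a possible normative requirement, not current spec text):

```js
const d = new Date(2024, 9, 15);
const ym = new Intl.DateTimeFormat("en", {
  year: "numeric", month: "long",
}).format(d);
const ymd = new Intl.DateTimeFormat("en", {
  year: "numeric", month: "long", day: "numeric",
}).format(d);
console.assert(ymd.length > ym.length); // adding a field lengthens the output
```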
RGN: Those are great points, SFC. I do want to strongly caution against using CLDR, because it's a moving target. Failure to avoid that has already resulted in a lot of pain in test262, for things like the whitespace character used to separate dates and times changing in the Swedish locale, which broke tests that assumed a particular whitespace character. It's bad practice in a general sense, but in a practical sense too: it minimizes the benefit of the tests and maximizes the burden.
RGN: We're not looking to validate the i18n in test262, per se. We're looking to validate proper conformance to the algorithms, which concerns the input options and the outputs. But it does matter that when you opt into the month, you get it; if you specify in NumberFormat that the sign is always present, you want to test for that. We want to make sure we're testing for failure to implement the algorithm rather than failure to use a specific version of CLDR. We want to know that the algorithms, or the handwavey prose, specify how the options interact – the locale and the other options – and that's where we get to more general testing based on invariants, like SFC said. Final point: we don't have to boil the ocean. If, for instance, the Finnish language has inflection that thwarts a test, that's fine; you just don't use Finnish for that test. You can still get cross-coverage by saying: for this test, you get German, Arabic, and Chinese, but I need to skip Hungarian. So we can selectively apply these to get good value for effort, which is probably worth doing.
ZB (via chat): does test262 have a concept of a warning? "The string returned doesn't seem to contain the German name of the month October, based on CLDR 33"