Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

misc(scripts): improve collision check in collect-strings #12697

Merged
merged 4 commits into from
Jun 25, 2021

Conversation

connorjclark
Copy link
Collaborator

ref #12690 (comment)

Moves the collision tracking code to after all the individual directories have been processed (instead of doing each independently). Result is all strings are checked for collisions together, instead of just amongst those in the same directory.

Also changed the assertion code to use jest expect and compare against an inline array of expected collisions, so we can easily see what changed.

@connorjclark connorjclark requested a review from a team as a code owner June 24, 2021 22:43
@connorjclark connorjclark requested review from adamraine and removed request for a team June 24, 2021 22:43
@google-cla google-cla bot added the cla: yes label Jun 24, 2021
if (seenId && strings[seenId]) {
if (!strings[seenId].meaning) {
strings[seenId].meaning = strings[seenId].description;
collisionStrings.push(ctc.message);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this at @patrickhulce's suggestion. tbh I don't understand what's going on here. the old code was doing collisions++ in both places. should I just do one collisionStrings.push(ctc.message); here (keep L626?)

Copy link
Member

@brendankenny brendankenny Jun 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this at @patrickhulce's suggestion. tbh I don't understand what's going on here. the old code was doing collisions++ in both places. should I just do one collisionStrings.push(ctc.message); here (keep L626?)

it's the difference between the number of strings with collisions (which is the current count) vs the number of pairs of strings that collide with each other (which is collisionStrings.length without this line)

}

try {
expect(collisionStrings).toEqual([
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

want to sort, so we're not dependent on order messing with this?

/**
* @param {Record<string, CtcMessage>} strings
*/
function checkForCollisions(strings) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code has a few purposes:

  • mutate ctc messages that collide to add meaning set to description
  • throw if the meaning of any collisions also collides
  • throw if the number of collisions has changed

I think checkForCollisions captures the second two but doesn't capture the first so well. It's also kinda not doing a great job at the second bullet either. Maybe we should rewrite this altogether?

Proposal:

  • group by message
  • set meaning on all groups of length > 1
  • assert the equality of all groups of length > 1 with previous run
  • throw if the meaning set size is less than the group length for any given group

I know that's more than you probably bargained for though, so I'm happy to take it and switch roles on this PR if you prefer :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ha, I was sniped by this to do some fancy stuff on top of the base fix in #12698, so we can have our pick of approaches...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that's more than you probably bargained for though

Yeah... I'd prefer to defer to you or brendan for all that. Just keep the expect thing and LGTM :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done :)

@google-cla
Copy link

google-cla bot commented Jun 25, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added cla: no and removed cla: yes labels Jun 25, 2021
Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was pulling the existing strings from git show master:lighthouse-core/lib/i18n/locales/en-US.json in #12698 and deriving the diff dynamically so we didn't have to maintain this list, which worked really well and simply except a single step of re-extracting @examples to turn the lhl file back into a ctc file. As soon as I started writing a regex (or changing convertMessageToCtc) it wasn't simpler any more :)

This LGTM!

@@ -19,7 +19,7 @@ const UIStrings = {
/** Title of an accessibility audit that evaluates if progressbar HTML elements do not have accessible names. This title is descriptive of the failing state and is shown to users when there is a failure that needs to be addressed. */
failureTitle: 'ARIA `progressbar` elements do not have accessible names.',
/** Description of a Lighthouse audit that tells the user *why* they should try to pass. This is displayed after a user expands the section to see more. No character length limits. 'Learn More' becomes link text to additional documentation. */
description: 'When an element doesn\'t have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn more](https://web.dev/aria-name/).',
description: 'When a `progressbar` element doesn\'t have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn more](https://web.dev/aria-name/).',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you figure out why this was previously passing/not a TC issue?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this was previously passing

yep! this was the bug I was alluding to in #12697 (comment) where the previous code only ever checked for meaning collisions with the first occurrence of a message and not between all occurrences of the message.

not a TC issue

No clue. If I had to guess, it would be that they haven't needed to be translated yet at the same time so they weren't colliding in TC yet? But I have no clue why this has worked so far given the way TC has been explained to me.

const collidingMessages = allCollisions.map(collision => collision[1].message).sort();

try {
expect(collidingMessages).toEqual([
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another option is strings and counts per string, since e.g. these first a11y strings are never going to reduce in number (they're distinct as LHL messages but not CTC ones), and neither will the stack pack ones without some major rearchitecting, but it might also be good to wait to see how this works over the next few string changes to see if it would be actually beneficial to tweak this at all

@patrickhulce
Copy link
Collaborator

@googlebot I consent.

@google-cla google-cla bot added cla: yes and removed cla: no labels Jun 25, 2021
@connorjclark connorjclark merged commit d8d1d39 into master Jun 25, 2021
@connorjclark connorjclark deleted the fix-collisions-check branch June 25, 2021 22:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants