Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(valid-langs): deprecate validLangs, add isValidLangs, reduce file size #2527

Merged
merged 9 commits into from
Sep 25, 2020

Conversation

straker
Copy link
Contributor

@straker straker commented Sep 21, 2020

Reduce the minified file size of the valid-langs.js file by 17 kB gzipped. I did this by transforming the data into a trie and stored as a nested array of integers. The comment in the file explains the technique and gives the code in case we need to update the list of valid langs.

Deprecated the valdLangs function as there's no good reason to need the entire list when just knowing if the string itself is valid. With a trie, it's O(n) (where n is the length of the string [2-3]) to know if the string is valid whereas generating the list of valid langs and then seeing if the string is included in that list is O(2m) (where m is the number of all valid langs [8000])

Before axe-core size:

valid-lang.js: 47.87 kB (gzipped: 19.75 kB)
axe.min.js: 582 kB (gzipped: 144 kB)

After axe-core size:

valid-lang.js: 23.27 kB (gzipped: 2.72 kB)
axe.min.js: 558 kB (gzipped: 124 kB)

Part of issue: #2357

Reviewer checks

Required fields, to be filled out by PR reviewer(s)

  • Follows the commit message policy, appropriate for next version
  • Code is reviewed for security

@straker straker requested a review from a team as a code owner September 21, 2020 16:27
Copy link
Member

@stephenmathieson stephenmathieson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazingly clever. Super good work, Steve! The drop in file size is great.

However, I wonder if it's "too clever" in its current state. For example, how do we add a new language? How do we remove one? Should we write little "utility programs" for managing the list of langs?

@straker
Copy link
Contributor Author

straker commented Sep 21, 2020

I thought about an automated script, but then I figured it was not needed. The reason being is that the valid list of ISO 639-1/2 codes should not change (or change very rarely), so we shouldn't have a need to update the list. However, I did leave the entire code at the bottom in a comment so if we do need to update/remove an item we can.

The other option is I take this out and make it it's own npm module that axe-core imports :)

@stephenmathieson
Copy link
Member

The reason being is that the valid list of ISO 639-1/2 codes should not change (or change very rarely), so we shouldn't have a need to update the list.

I see. Probably not a valid concern then 😄

However, I did leave the entire code at the bottom in a comment so if we do need to update/remove an item we can.

Good point. This is probably enough for now, as long as another maintainer is comfortable updating the list based on your comments.

The other option is I take this out and make it it's own npm module that axe-core imports :)

I was going to suggest this, but figured you wouldn't like that. If you think the list of languages would be helpful outside of axe-core, then IMO it'd be great to have it be its own package.

lib/core/utils/valid-langs.js Outdated Show resolved Hide resolved
lib/core/utils/valid-langs.js Outdated Show resolved Hide resolved
@dylanb
Copy link
Contributor

dylanb commented Sep 22, 2020

In reading the code, I see that this does not affect the ability to pass in a custom list of valid languages - great!!

Nice work!

straker and others added 3 commits September 22, 2020 08:43
lang += '`';
}
for (let i = 0; i <= lang.length - 1; i++) {
const index = lang.charCodeAt(i) - 96;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yesterday morning I didn't know about charCodeAt. Last night when working on a personal project, I actually needed it to deal with emoji chars. :) Funny how these things can go.

@straker straker merged commit 8a699ec into develop Sep 25, 2020
@straker straker deleted the compress-langs branch September 25, 2020 16:28
lang += '`';
}
for (let i = 0; i <= lang.length - 1; i++) {
const index = lang.charCodeAt(i) - 96;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious.. what does the 96 here refer to?


(awesome PR. so pumped for this.)

I had also found this regex-based approach last week: https://github.com/SafetyCulture/bcp47/blob/develop/src/index.js but it's more of a "does it look roughly like a bcp47 code" as opposed to being specific.

Copy link
Contributor Author

@straker straker Sep 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the 96 gets a 1 index on the letter a and a 0 index on letter ` (a.charCodeAt(0) = 97)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants