Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up tests and other perf improvements #381

Merged
merged 20 commits into from
Sep 15, 2023
Merged

Conversation

GioSensation
Copy link
Member

@GioSensation GioSensation commented Sep 11, 2023

Reviewer: @shakyShane
Asana: https://app.asana.com/0/1198964220583541/1203781196578953/f

Description

A few improvements to speed up test execution and performance in general. In summary:

  • Mock getComputedStyles in jsdom (60% test speed up 🎉).
  • Memoize isCCForm: we were calling this for every single input field instead of per-form. This sped things up further, not only in the tests, but also in the real world. Most pages wouldn't see much difference because we have bailouts when there are tons of fields, but in test environments like in fix(perf): avoid checking for CC if we don't need to #258 (with bailouts switched off) this yielded a massive 80+% reduction in execution time.
  • Disconnect main MutationObserver when we reach the failsafe threshold. Even when we stop adding forms we still do some work, especially in pages like Asana where there's no form so all fields would trigger the expensive search for a form container. This change disconnects the observer altogether, so we no longer incur that extra cost after bailing.

Steps to test

All test suites passing, but much faster! 🎉

Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
@GioSensation GioSensation self-assigned this Sep 11, 2023
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>

# Conflicts:
#	src/Scanner.js
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
@@ -57,13 +57,8 @@ class ExtensionInterface extends InterfacePrototype {
return null
}

removeAutofillUIFromPage () {
super.removeAutofillUIFromPage()
this.activeForm?.removeAllDecorations()
Copy link
Member Author

@GioSensation GioSensation Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I centralized removing all decoration and everything else, so we don't need this override anymore. Find it in the scanner changes below.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

love it.

@@ -128,6 +128,9 @@ class Form {
get isHybrid () {
return this.formAnalyzer.isHybrid
}
get isCCForm () {
return this.formAnalyzer.isCCForm()
}
Copy link
Member Author

@GioSensation GioSensation Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The isCCForm was in the matching class, and was being run on every single field. FormAnalyzer is a more fitting place and allows us to run it once per form, rather than once per field. Massive performance improvement on pages with tons of fields.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

excellent

Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
* @returns {boolean}
*/
isCCForm () {
if (this._isCCForm !== undefined) return this._isCCForm
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only change to this method, apart from being moved here. Just memoizing the result. When the form needs to be rescanned it follows the same flow as isLogin and isSignup. The analyzer is destroyed and recreated, so form data is kept up to date when needed and only when needed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very nice!

} = testCase

const testTextString = expectedFailures.length > 0
? `should contain ${expectedFailures.length} known failure(s): ${JSON.stringify(expectedFailures)}`
: `should NOT contain failures`

it(testTextString, () => {
const testContent = fs.readFileSync(path.resolve(__dirname, './test-cases', html), 'utf-8')
it.concurrent(testTextString, async () => {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This didn't yield any performance gains for some reason. I've even tried chunking the testCases and running them in parallel, but nothing seemed to matter much 🤷‍♂️. Ran out of time for this scope and the gains are already pretty significant.

src/Scanner.js Outdated Show resolved Hide resolved
src/Scanner.js Outdated Show resolved Hide resolved
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
* @returns {boolean}
*/
function shouldLog () {
return readDebugSetting('ddg-autofill-debug')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is part of the cherry-picking I mentioned above.

@GioSensation GioSensation marked this pull request as ready for review September 12, 2023 08:27
Copy link
Collaborator

@shakyShane shakyShane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No additional comments other than to say thanks, this is amazing work :)

Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
Signed-off-by: Emanuele Feliziani <feliziani.emanuele@gmail.com>
return (string || '')
if (!string) return ''

return (string)
.replace(/\n/g, ' ')
.replace(/\s{2,}/g, ' ').trim()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we call this a lot of times and pretty often with null, I'm short circuiting early instead of running two replaces and a trim on an empty string every time.

this.findEligibleInputs(document)
window.performance?.mark?.('scanner:init:end')
window.performance?.mark?.('initial_scanner:init:end')
logPerformance('initial_scanner')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created this logPerformance little utility to log performance metrics of every scan, not just the initial one. Works the exact same way.

// Use input.form unless it encloses most of the DOM
// In that case we proceed to identify more precise wrappers
if (
this.forms.has(input.form) || // If we've added the form we've already checked that it's not a page wrapper
Copy link
Member Author

@GioSensation GioSensation Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By adding this little check we avoid the much more expensive isFormLikelyToBeUsedAsPageWrapper check. This significantly speeds up forms with a lot of input fields. For example, further massive gain on your test page with 1000 fields, down to ~70ms on our browser on my machine (together with the other changes in the last few commits).

const previouslyFoundParent = [...this.forms.keys()].find((form) => form.contains(parentForm))
if (parentForm instanceof HTMLFormElement && this.forms.has(parentForm)) {
// We've met the form, add the input
this.forms.get(parentForm)?.addInput(input)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return early if we've already met the form, so we avoid all the looping below.

// Check if the forms we've seen are either disconnected,
// or are parent/child of the currently-found form
let previouslyFoundParent, childForm
for (const [formEl] of this.forms) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In production we had 2 destructuring + loops for this stuff. Most importantly, we were not deleting disconnected forms, and since we didn't disconnect the MutationObserver this caused a leak of resources and longer and longer loops over time on certain occasions like on Asana. The loop is now faster, runs only once with no destructuring and we also remove the disconnected forms. Plus of course now we disconnect the observer as well. 💪


if (el.type === 'image') {
return removeExcessWhitespace(el.alt || el.value || el.title || el.name)
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just trying to shave off everything possible from this. This shows up a lot in the profile, just because it runs so often. Here I've just hoisted that tiny array and consolidated those two instanceof HTMLInputElement checks into one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants