Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix scanner performance regression + added test #612

Closed

Conversation

shakyShane
Copy link
Collaborator

@shakyShane shakyShane commented Jul 21, 2024

Reviewer: @dbajpeyi
Asana: https://app.asana.com/0/0/1207840563867787/f

Description

A previous PR caused the Form instance to be created & destroyed for every input found. In the example test case I added, this could be upto 60 times causing a large performance regression.

Steps to test

An integration test was added. To verify, you can run this branch locally and remove my change in Scanner.js -> then the integration test will fail.

NOTE There are some failing tests for input classifications - I think they'll need updating, but I havn't personally touched that part of the codebase for a long time, so I'd appreciate someone taking a look to see if the failures are valid.

perf PR

@shakyShane
Copy link
Collaborator Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @shakyShane and the rest of your teammates on Graphite Graphite

@shakyShane shakyShane marked this pull request as ready for review July 21, 2024 11:22
@shakyShane shakyShane force-pushed the 07-21-fix_scanner_performance_regression_added_test branch from 0614c8c to d58b22a Compare July 21, 2024 11:43
Comment on lines +173 to +181

// Check for forms with too few 'known' inputs
// In the case where the only inputs are all 'unknown', we can destroy the form and stop tracking it.
for (let formInstance of this.forms.values()) {
if (formInstance.hasOnlyUnknownFields) {
formInstance.destroy()
}
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done after adding all inputs, because only then can you determine whether it's truly useless or not.

if (!f.isDestroyed) {
this.forms.set(parentForm, f)
}
this.forms.set(parentForm, new Form(parentForm, input, this.device, this.matching, this.shouldAutoprompt))
Copy link
Collaborator Author

@shakyShane shakyShane Jul 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always set the form instance here, because this is done within the addInput loop. The form may be deemed useless later, but for now we always create it to ensure future inputs detect and re-use it.

Copy link
Collaborator

@dbajpeyi dbajpeyi Jul 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would still fill up the forms set (max 30). The side-effect of this being that if there's actual "useful" form later, it won't be added to the forms set and will remain unanalysed. We probably want to increase max size of forms set?

Comment on lines +126 to +134
get hasOnlyUnknownFields () {
const numKnownInputs = this.inputs.all.size - this.inputs.unknown.size

if (numKnownInputs === 0) {
return true
}
return false
}

Copy link
Collaborator Author

@shakyShane shakyShane Jul 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same logic as before, but now it's moved so that the scanner can ask the question when it chooses.

const perfPage = scannerPerf(page)

await perfPage.navigate(constants.forms['www_ulisboa_pt_login.html'])
await perfPage.validateInitialScanPerf(80)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this 80 is another magic number, but during testing it's more than enough to ensure another regression would be difficult. on my Laptop this metric was over 300ms, so having a low number like 80 should be sufficient

@shakyShane shakyShane requested review from dbajpeyi and removed request for sjbarag July 21, 2024 12:26
@dbajpeyi
Copy link
Collaborator

dbajpeyi commented Jul 23, 2024

@shakyShane Checking the input classifier tests

  1. This sadly makes autofill not work on https://www.scanmalta.com/shop/, which was originally fixed in scanner: discard Form instances with no classified inputs #574. Inputs don't get detected.
  2. The number of false positives increased in submit button statistics (back to what was before the Sean's PR) which is reflected in the unit test failure.

Having said that, the overall idea looks to be good, so I am not sure what might have gone wrong. Taking a look.

*
*/
test('Large form without eligible inputs', async ({page}) => {
test.setTimeout(5000)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, this just fails early. It times out at 5 seconds instead of the usual 30 seconds.

@shakyShane
Copy link
Collaborator Author

@dbajpeyi shall we close this PR out now? given that you have a different direction now?

@shakyShane shakyShane closed this Jul 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants