I want to scan a whole site, but only get ~200 results #84

Closed
mgifford opened this issue May 10, 2023 · 4 comments

@mgifford

Details

I've got a few big sites I'd like to scan, but it keeps stopping about 200 or so pages in.

Is there a way to override that? It's still a good measure, but would be useful to be able to scan the whole site if needed.

Maybe I'm just missing something in the config.

@tuminzee
Contributor

Can you explain this with an example? I'm not able to understand the problem.
Unlighthouse is designed to generate reports for all routes.

@mgifford
Author

mgifford commented May 10, 2023

So I run this:

npx unlighthouse --site https://example.com/eng/index.html

It all churns along as you'd expect until I get:

✔ Completed runLighthouseTask for /eng/declaration/wwl-cna/c15/index.html. [Score: 0.94 Samples: 1 100% complete]
✔ Unlighthouse has finished scanning https://example.com/eng/index.html/: 200 routes in 431s.

That happens for basically any site I crawl.

So I get 200 or so scans, but the site is much bigger.

harlan-zw added a commit that referenced this issue May 10, 2023
@harlan-zw
Owner

harlan-zw commented May 10, 2023

There is a config option for the maximum number of routes to scan, scanner.maxRoutes, which is set to 200 by default.

This limit was implemented because the stability of the worker and the UI starts degrading at around this number of routes, and it's quite easy for a site scan to end up queueing thousands of routes.

I've pushed up a warning that will be triggered when you hit the limit, to give better visibility. It will be available in v0.6.0, which will be released soon.

You can read more about how the large sites are handled on this page.
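To illustrate why a scan stops at the cap even though more links keep turning up, here is a minimal sketch of a capped crawl queue. It is a hypothetical model, not Unlighthouse's actual code: `crawlWithCap`, `LinkGraph`, and the toy site are made up for the example, and only the idea of a `maxRoutes` limit comes from the comment above.

```typescript
// Hypothetical sketch: NOT Unlighthouse's implementation.
// Shows how a route cap ends a crawl while undiscovered pages remain.

type LinkGraph = Record<string, string[]>;

function crawlWithCap(graph: LinkGraph, start: string, maxRoutes: number): string[] {
  const queued = new Set<string>([start]); // every route ever accepted
  const pending: string[] = [start];       // routes still waiting to be scanned
  const scanned: string[] = [];

  while (pending.length > 0) {
    const route = pending.shift() as string;
    scanned.push(route);
    for (const link of graph[route] ?? []) {
      // Once the cap is reached, newly discovered routes are dropped.
      if (queued.size >= maxRoutes) break;
      if (!queued.has(link)) {
        queued.add(link);
        pending.push(link);
      }
    }
  }
  return scanned;
}

// Toy site: each of 1000 pages links to two further pages.
const site: LinkGraph = {};
for (let i = 0; i < 1000; i++) {
  site[`/p${i}`] = [`/p${2 * i + 1}`, `/p${2 * i + 2}`];
}

const capped = crawlWithCap(site, "/p0", 200);
console.log(capped.length); // prints 200
```

In this model the toy site has far more than 200 reachable pages, but the scan still reports exactly 200 routes, which matches the behaviour described in the log output earlier in the thread.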

@harlan-zw harlan-zw closed this as not planned Mar 3, 2024
@mgifford
Author

mgifford commented Jul 5, 2024

So @harlan-zw, this should work to scan 500 URLs instead of the default 200? It would also take 2 samples rather than just one.

Assuming this is saved as unlighthouse.config.ts in the directory where you execute the script:

export default {
  scanner: {
    // run lighthouse for each URL 2 times
    samples: 2,
    // increase the maximum number of routes - https://unlighthouse.dev/api/config#scannermaxroutes
    maxRoutes: 500,
  },
  debug: true,
}
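
For a rough sense of what this config would cost in scan time, the figures from the earlier log (200 routes in 431s with 1 sample) can be extrapolated. This is only a back-of-envelope sketch: `estimateScanSeconds` is a hypothetical helper, and it assumes scan time grows linearly with routes times samples, which nothing in this thread confirms.

```typescript
// Back-of-envelope estimate only. Assumes scan time scales linearly with
// routes x samples; real scans may behave differently.
function estimateScanSeconds(
  observedSeconds: number,
  observedRoutes: number,
  observedSamples: number,
  plannedRoutes: number,
  plannedSamples: number,
): number {
  const secondsPerRun = observedSeconds / (observedRoutes * observedSamples);
  return secondsPerRun * plannedRoutes * plannedSamples;
}

// Observed run from the log: 200 routes in 431s with 1 sample.
// Planned run from the config: 500 routes with 2 samples.
const estimate = estimateScanSeconds(431, 200, 1, 500, 2);
console.log(`${Math.round(estimate)}s (~${Math.round(estimate / 60)} min)`);
```

Under that linear assumption, 500 routes at 2 samples each works out to about five times the original run.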
