-
Notifications
You must be signed in to change notification settings - Fork 699
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Request Queue v2 #1975
feat: Request Queue v2 #1975
Conversation
5bb58a9
to
3bc18db
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a few initial comments. it would be good to use a common predecessor, as now it's hard to see what actually changed in a 1000 LoC file that is mostly a copy of the v1 approach.
packages/puppeteer-crawler/src/internals/enqueue-links/click-elements.ts
Outdated
Show resolved
Hide resolved
@drobnikj can you please take a look as well @vladfrangu I also see the puppeteer e2e tests failed on platform due to a chrome version mismatch, any idea why is that? maybe its just flakyness (since the scheduled run passed just fine), but it feels pretty weird |
We will also surely need some unit tests for this, I don't want you to rewrite all existing tests to support v2, but with e2e tests we won't have such a confidence (as they won't run on each PR). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, that is a lot of code, looks good as a good start, on top of the comments I miss these parts from spec.:
- If the actor is migrating(??aborting/exiting/failing), it should unlock all requests which locked
- Ensuring that the queue is empty should wait for the lockSecs after it is empty to potential request lock can be expired.
fc0de1d
to
a1b3e95
Compare
Ready for another pair of reviews! Still need to add some tests for this, but I'm more confident it should just work |
18d4395
to
415b0f5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The structure of the code is much easier to read 👍
I'm still not sure if the local cache would not mess queue in parallel execution, can we do a simple test where we run multiple scrapers on one large site with shared request queue?
c2fd6ec
to
08cceb5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request is neither linked to an issue or epic nor labeled as adhoc!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request is neither linked to an issue or epic nor labeled as adhoc!
E2E re-re-re-running: https://github.com/apify/crawlee/actions/runs/6224902038 |
9f45080
to
a8879f6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request is neither linked to an issue or epic nor labeled as adhoc!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request is neither linked to an issue or epic nor labeled as adhoc!
FYI for the PR toolkit to pass, every PR needs to have an estimate and a correct label (or be connected to an issue that has those - the link needs to be set via zenhub, its unfortunately not enough to provide the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request is neither linked to an issue or epic nor labeled as adhoc!
Co-authored-by: Martin Adámek <banan23@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just nits, I would like to test it once the testing version will be publish npm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just nits, I would like to test it once the testing version will be publish npm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is good to go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
End of an era, really 😄
Looks good to me, let's see how it runs
Closes #1365