This issue was moved to a discussion.


How to block media? #627

Closed
zhk0603 opened this issue Oct 29, 2024 · 4 comments
Labels
t-tooling Issues with this label are in the ownership of the tooling team.

Comments

@zhk0603

zhk0603 commented Oct 29, 2024

When using the Playwright crawler, how can I block resources such as stylesheets, fonts, images, and media?

@github-actions bot added the t-tooling label Oct 29, 2024
@janbuchar
Collaborator

Hello, and thank you for your interest in Crawlee! Do I understand correctly that you want to prevent images and similar resources from loading, perhaps to save bandwidth and make scraping faster?

@zhk0603
Author

zhk0603 commented Oct 29, 2024

yes

@janbuchar
Collaborator

Once pre-navigation hooks are implemented (#427), I would utilize those to call page.route and set up blocking as you like. As of now, I'm not aware of an easy way to do this.
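For illustration, here is a minimal sketch of the page.route approach mentioned above. It uses plain Playwright for Python rather than any Crawlee API, since the pre-navigation hooks (#427) are described as not yet implemented. The names BLOCKED_RESOURCE_TYPES, block_media, fetch_title, and main, as well as the example.com URL, are hypothetical and chosen only for this example.

```python
import asyncio

# Resource types listed in the question. Playwright exposes the type of each
# request as a string through request.resource_type.
BLOCKED_RESOURCE_TYPES = {"stylesheet", "font", "image", "media"}


async def block_media(route):
    """Abort requests for blocked resource types and let everything else through."""
    if route.request.resource_type in BLOCKED_RESOURCE_TYPES:
        await route.abort()
    else:
        await route.continue_()


async def fetch_title(url: str) -> str:
    """Open `url` in plain Playwright with the blocking handler installed."""
    # Imported lazily so block_media can be reused without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        # Match every request; block_media decides which ones to abort.
        await page.route("**/*", block_media)
        await page.goto(url)
        title = await page.title()
        await browser.close()
        return title


def main() -> None:
    # Placeholder URL, not taken from the thread.
    print(asyncio.run(fetch_title("https://example.com")))
```

Calling main() requires Playwright and a Chromium build to be installed. Once Crawlee offers pre-navigation hooks, the same page.route call could presumably be made from such a hook, but that wiring is an assumption and is not shown here.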

@B4nan
Member

B4nan commented Oct 29, 2024

One note about this: request interception (page.route) means the browser's resource cache cannot be reused. You can use this approach to save bandwidth, but it will most likely result in worse overall performance.

@apify locked and limited conversation to collaborators Oct 29, 2024
@vdusek converted this issue into discussion #629 Oct 29, 2024
