Mantisus / crawlee-python Public

forked from apify/crawlee-python

Crawlee: a web scraping and browser automation library for Python for building reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Supports both headful and headless modes, with proxy rotation.

apify.github.io/crawlee-python/

Apache-2.0 license

1 star, 390 forks

Releases

No releases published

Packages

No packages published

Languages

Python 89.4%
JavaScript 6.2%
CSS 4.0%
Other 0.4%