ChatGPT Friendly Crawl

Prerequisites

Python 3.11+ is required. Install the dependencies with:

pip install aiohttp pyppeteer

Usage

Before running the crawler, set these environment variables:

CHATGPT_CRAWL_VAR_START_URL: Starting URL for the crawl.
CHATGPT_CRAWL_VAR_DEPTH: Maximum crawl depth.
CHATGPT_CRAWL_VAR_MAX_PAGES: Maximum number of pages to fetch.
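
As an illustration of how a script could consume these three variables, here is a minimal sketch. The `CrawlConfig` class and `load_config` function are hypothetical names chosen for this example and are not taken from chatgpt_crawl.py; only the variable names come from the list above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

PREFIX = "CHATGPT_CRAWL_VAR_"


@dataclass(frozen=True)
class CrawlConfig:
    """Hypothetical container for the three crawl settings."""
    start_url: str
    depth: int
    max_pages: int


def load_config(env: Optional[Mapping[str, str]] = None) -> CrawlConfig:
    """Read the crawl settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    names = ("START_URL", "DEPTH", "MAX_PAGES")
    # Report every missing or empty variable at once.
    missing = [PREFIX + n for n in names if not env.get(PREFIX + n)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return CrawlConfig(
        start_url=env[PREFIX + "START_URL"],
        depth=int(env[PREFIX + "DEPTH"]),          # int() rejects non-numeric input
        max_pages=int(env[PREFIX + "MAX_PAGES"]),
    )
```

Failing early with a single message that names every missing variable tends to be easier to debug than a KeyError raised partway through a crawl.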

General form (replace $target_url, $depth_number, and $max_pages_number with your own values):

export CHATGPT_CRAWL_VAR_START_URL=$target_url && \
export CHATGPT_CRAWL_VAR_DEPTH=$depth_number && \
export CHATGPT_CRAWL_VAR_MAX_PAGES=$max_pages_number && \
python ./chatgpt_crawl.py

Example:

export CHATGPT_CRAWL_VAR_START_URL=https://www.google.com && \
export CHATGPT_CRAWL_VAR_DEPTH=2 && \
export CHATGPT_CRAWL_VAR_MAX_PAGES=100 && \
python ./chatgpt_crawl.py
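
To show how a depth limit and a page limit can interact, here is a minimal breadth-first sketch. It is not the logic of chatgpt_crawl.py. It assumes the start page counts as depth 0 and that a page counts toward the limit when it is visited; the `crawl` function and its `get_links` callback are hypothetical names used only for illustration.

```python
from collections import deque
from typing import Callable, Iterable, List


def crawl(
    start_url: str,
    max_depth: int,
    max_pages: int,
    get_links: Callable[[str], Iterable[str]],
) -> List[str]:
    """Return URLs in visit order, bounded by max_depth and max_pages."""
    visited: List[str] = []
    seen = {start_url}                 # URLs already queued, to avoid repeats
    queue = deque([(start_url, 0)])    # (url, depth); start page is depth 0

    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue                   # do not follow links beyond the depth limit
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

Passing the link extractor in as a callback keeps the traversal testable without network access, for example with a small in-memory link graph:

```python
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d", "a"], "d": ["e"]}
pages = crawl("a", max_depth=1, max_pages=10, get_links=lambda u: graph.get(u, []))
# pages == ["a", "b", "c"]
```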

Benefits of Using `https://r.jina.ai` API

Fetching pages through the https://r.jina.ai API hands retrieval off to a hosted service. The crawler can scale and stay reliable without you having to run and maintain fetching infrastructure yourself.
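
As a rough sketch of what calling the API looks like: the Reader endpoint is used by prefixing the target URL with https://r.jina.ai/. The example below uses only the standard library (`urllib`) as a stand-in for the aiohttp client the project depends on; `reader_url` and `fetch_readable` are hypothetical helper names, and the timeout value is an assumption.

```python
import urllib.request

READER_PREFIX = "https://r.jina.ai/"


def reader_url(target_url: str) -> str:
    """Build the Reader URL by prefixing the target URL."""
    return READER_PREFIX + target_url


def fetch_readable(target_url: str, timeout: float = 30.0) -> str:
    """Fetch the reader-processed text of a page (performs a network request)."""
    with urllib.request.urlopen(reader_url(target_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

For example, `reader_url("https://example.com")` yields `https://r.jina.ai/https://example.com`. In an async crawler the same URL would typically be requested through an aiohttp session rather than a blocking call.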

Wrap-up

This "ChatGPT Friendly Crawl" combines modern async patterns with a robust API to streamline data collection, making it an efficient tool for scalable web scraping.
