First git clone https://github.com/spider-rs/spider.git
and cd spider
. Use the release flag for the best performance --release
when running the examples below.
Simple concurrent crawl Simple.
cargo run --example example
Subscribe to realtime changes Subscribe.
cargo run --example subscribe
Live handle index mutation example Callback.
cargo run --example callback
Enable log output Debug.
cargo run --example debug
Scrape the webpage with and gather html Scrape.
cargo run --example scrape
Scrape and download the html file to fs Download HTML. *Note: Enable feature flag [full_resources] to gather all files like css, jss, and etc.
cargo run --example download
Scrape and download html to react components and store to fs Download to React Component.
cargo run --example download_to_react
Crawl the page and output the links via Serde.
cargo run --example serde --features serde
Crawl links with a budget of amount of pages allowed Budget.
cargo run --example budget
Crawl links at a given cron time Cron.
cargo run --example cron
Crawl links with chrome headless rendering Chrome.
cargo run --example chrome --features chrome
Crawl links with chrome headed rendering Chrome.
cargo run --example chrome --features chrome_headed
Crawl links with chrome headless rendering remote connections Chrome.
cargo run --example chrome_remote --features chrome
Crawl links with view port configuration Chrome Viewport.
cargo run --example chrome_viewport --features chrome
Take a screenshot of a page during crawl Chrome Screenshot.
cargo run --example chrome_screenshot --features="spider/sync spider/chrome spider/chrome_store_page"
Crawl links with smart mode detection. Runs HTTP by default until Chrome Rendering is needed. Smart.
cargo run --example smart --features smart
Use different encodings for the page. Encoding.
cargo run --example encoding --features encoding
Use advanced configuration re-use. Advanced Configuration.
cargo run --example cache_chrome_hybrid --features="spider/sync spider/chrome spider/cache_chrome_hybrid"
Use chrome hybrid caching. Chrome Cache Hybrid.
cargo run --example advanced_configuration
Use URL globbing for a domain. URL Globbing.
cargo run --example glob --features glob
Use URL globbing for a domain and subdomains. URL Globbing Subdomains.
cargo run --example url_glob_subdomains --features glob
Downloading files in a subscription. Subscribe Download.
cargo run --example subscribe_download
Add links to gather mid crawl. Queue.
cargo run --example queue
Use OpenAI to get custom Javascript to run in a browser. OpenAI. Make sure to set OPENAI_API_KEY=$MY_KEY as an env variable or pass it in before the script.
cargo run --example openai
or
OPENAI_API_KEY=replace_me_with_key cargo run --example openai
or setting multiple actions to drive the browser
OPENAI_API_KEY=replace_me_with_key cargo run --example openai_multi
or to get custom data from the GPT with JS scripts if needed.
OPENAI_API_KEY=replace_me_with_key cargo run --example openai_extra