Skip to content

Navigation Menu

Explore
By company size
By use case
By industry
View all solutions
Topics
- AI
- DevOps
- Security
- Software Development
- View all
Explore
- GitHub Sponsors
  Fund open source developers
- The ReadME Project
  GitHub community articles
Repositories
- Enterprise platform
  AI-powered developer platform
Available add-ons
Pricing

Search code, repositories, users, issues, pull requests...

Search

Clear

Search syntax tips

Provide feedback

We read every piece of feedback, and take your input very seriously.

Include my email address so I can be contacted

Saved searches

Use saved searches to filter your results more quickly

Name

Query

To see all available qualifiers, see our documentation.

You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session.

Dismiss alert

spider-rs / spider Public

Notifications You must be signed in to change notification settings
Fork 109
Star 1.2k

Code
Issues
Pull requests
Discussions
Actions
Projects
Security
Insights

Additional navigation options

Code
Issues
Pull requests
Discussions
Actions
Projects
Security
Insights

Releases: spider-rs/spider

Releases · spider-rs/spider

v2.21.31

18 Dec 11:53

j-mendez

Compare

Choose a tag to compare

Loading

v2.21.31 Latest

Latest

Whats Changed

Fix http crawling past first page

Full Changelog: v2.21.27...v2.21.31

Assets 2

Loading

All reactions

v2.21.27

10 Dec 12:21

j-mendez

Compare

Choose a tag to compare

Loading

v2.21.27

Whats Changed

add balance feature flag to switch to global semaphores.
add remote_addr feature flag to get the page remote address / ip.
add re-usable chrome client and stream ws
fix chrome inline page navigations
fix constraining html only pages

Full Changelog: v2.20.6...v2.21.27

Assets 2

Loading

All reactions

v2.20.6

05 Dec 14:11

j-mendez

Compare

Choose a tag to compare

Loading

v2.20.6

Whats Changed

fix chrome initial page return links
add hydration scrips ignore for next.js and astro
add base script targets for smart mode
add custom domain layer interception for giants
add interception analytics and ads blocking
fix chrome page timeout bytes transferring

Full Changelog: v2.16.0...v2.20.6

Assets 2

Loading

All reactions

v2.16.0

04 Dec 18:11

j-mendez

Compare

Choose a tag to compare

Loading

v2.16.0

Whats Changed

Chrome crawls now get the total bytes used over the network.
Improved ignore list for unwanted crawling request for chrome interception.

Full Changelog: v2.15.0...v2.16.0

Assets 2

Loading

All reactions

v2.15.0

04 Dec 14:37

j-mendez

Compare

Choose a tag to compare

Loading

v2.15.0

Whats Changed

Major possible performance increase for chrome crawling blocking extra unwanted XHR request and scripts.

perf(chrome): add xhr interception

Full Changelog: v2.14.0...v2.15.0

Assets 2

Loading

All reactions

v2.14.0

30 Nov 11:18

j-mendez

Compare

Choose a tag to compare

Loading

v2.14.0

Release Notes

Features

feat(transform): add transform_content_send for async streaming.

Improvements

chore(interning): add optional string-interning.
chore(website): fix crawl, establish domain removal [#233].
chore(transform): add streaming markdown/commonmark transforming.
chore(transform): add streaming text transforming.
chore(chrome): add request interception analytics ignore.

Bug Fixes

chore(page): fix URL encode handling mismatch.
chore(transform): fix repeated text streaming.
chore(page): fix page link return with full URLs.
chore(website): fix crawl delay handling.
perf(website): reduce extra context switching on crawls.

Thank you for the help @Revertron!

Full Changelog: v2.13.78...v2.14.0

Contributors

Revertron

Assets 2

Loading

Revertron reacted with hooray emoji

All reactions

🎉 1 reaction

1 person reacted

v2.13.78

27 Nov 15:54

j-mendez

Compare

Choose a tag to compare

Loading

v2.13.78

Whats Changed

Fix infinite loop with backoff Gateway retries
Fix limit handling break

Full Changelog: v2.13.64...v2.13.78

Assets 2

Loading

All reactions

v2.13.64

07 Nov 21:24

j-mendez

Compare

Choose a tag to compare

Loading

v2.13.64

Whats Changed

Major fixes for critical bugs that can hang the process.

perf reduce cpu usage for streaming rewriter
fix hang on iteration streaming
fix chrome connection hang
fix cache backend default build
fix domain absolute link join
fix shutdown break loop
add ignore protocol list

Full Changelog: v2.12.12...v2.13.64

Assets 2

Loading

All reactions

v2.12.12

05 Nov 03:09

j-mendez

Compare

Choose a tag to compare

Loading

v2.12.12

Fix smart mode re-rendering and performance

fix smart mode re-rendering inline js detection
perf improve smart mode parsing
fix encoding smart mode html
add pin html pre-parsing
add chrome status code check for performing full actions

Full Changelog: v2.11.20...v2.12.12

Assets 2

Loading

All reactions

v2.11.20

31 Oct 20:28

j-mendez

Compare

Choose a tag to compare

Loading

v2.11.20

Whats Changed

Major performance improvement on crawling processing pending tasks concurrently. Now you can get all Next.js SSG pages on initial crawl for websites that do not expose links and have dynamic event listeners for routing.

fix loop blocking tasks
improve crawl performance processing tasks concurrent
fix page absolute link joining
add wait_for_dom to target element updates chrome
add alert polyfill blocking prevention
add missing chrome navigate request timeout for http future
add ignore assets when crawling http
add with_block_assets builder config for Server response non html
perf(chrome): add skip other resources
feat(page): add nextjs build ssg path handling

Full Changelog: v2.11.0...v2.11.20

Assets 2

Loading

All reactions

Previous 1 2 3 4 5 … 11 12 Next

Previous Next

Footer

© 2024 GitHub, Inc.

Footer navigation

Terms
Privacy
Security
Status
Docs
Contact

You can’t perform that action at this time.