This repository has been archived by the owner on Aug 11, 2023. It is now read-only.
v0.2.0 - Sitemap enhancements
Adds some sitemap enhancements, most notably sitemap index support.
- Feature: Add support for sitemap index detection. If a sitemap index is detected, the package will recursively gather the URLs listed in each sitemap in your sitemap index and include them in requests. If a standard sitemap file is passed, only that sitemap will be processed.
- Bug fix: Add a dummy user-agent to requests to avoid being blocked by firewalls (e.g. Cloudflare returns a 403 if it detects the default requests user-agent).
- Code quality: Use the truthy/falsy return value when validating the sitemap URL extension rather than an explicit `is True` comparison.
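
The behaviour described in these notes can be illustrated roughly as follows. This is a minimal sketch, not the package's actual code: the names `collect_urls`, `fetch_with_requests`, `has_xml_extension`, and the `HEADERS` value are all hypothetical, and it assumes sitemaps follow the standard `sitemapindex` / `urlset` layout. The fetch function is passed in as a parameter so the recursion can be exercised without network access.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List

# Hypothetical header value; the release notes do not say which
# user-agent string the package actually sends.
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; sitemap-example/0.1)"}


def fetch_with_requests(url: str) -> bytes:
    """Download a sitemap with a non-default user-agent (assumed helper)."""
    import requests  # imported lazily so the rest of the sketch runs without it

    response = requests.get(url, headers=HEADERS, timeout=10)
    response.raise_for_status()
    return response.content


def has_xml_extension(url: str) -> bool:
    """Return the boolean from endswith directly; callers test it for
    truthiness instead of comparing it with `is True`."""
    return url.lower().endswith(".xml")


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix, e.g. '{ns}urlset' -> 'urlset'."""
    return tag.rsplit("}", 1)[-1]


def collect_urls(sitemap_url: str, fetch: Callable[[str], bytes]) -> List[str]:
    """Return page URLs from a sitemap, following sitemap indexes recursively."""
    root = ET.fromstring(fetch(sitemap_url))

    # Read <loc> only from the direct entries (<sitemap> or <url>) of the root.
    locs = [
        field.text.strip()
        for entry in root
        for field in entry
        if local_name(field.tag) == "loc" and field.text
    ]

    if local_name(root.tag) == "sitemapindex":
        # Each <loc> points at another sitemap: gather its URLs in turn.
        urls: List[str] = []
        for child_sitemap in locs:
            urls.extend(collect_urls(child_sitemap, fetch))
        return urls

    # A standard sitemap: the <loc> values are the page URLs themselves.
    return locs
```

With a real site, the call would look something like `collect_urls("https://example.com/sitemap.xml", fetch_with_requests)` after checking `if has_xml_extension(url):`. The sketch does not guard against sitemap indexes that reference each other in a cycle.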