planet_n_spider v0.0.4 / dependency updates and CI/CD migration to Poetry v2.0 #125
This PR includes the following changes:
- `planet_n_spider` v0.0.4: hide cookie banners of the "Simple Cookie Control" WordPress plugin
- `spider`-classes (e.g., utility classes like the `es_connector`, `license_mapper` and other modules) with loguru to increase the readability and helpfulness of crawler logs, especially when encountering non-`Scrapy`-related log messages
- `scrapy.Spider.logger` (see Scrapy Docs: Logging from Spiders)
- `pyproject.toml` file necessary (see: pyproject.toml specification)
- `poetry` installation to v2.0 the next time you open the project in your IDE! (restarting your IDE might be necessary afterwards)
- `browserless` API endpoint (see: `converter/web_tools.py`) for the most recent `browserless` image in combination with `playwright v1.49.1`
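
For readers unfamiliar with the `pyproject.toml` layout that Poetry v2.0 supports, the following is a minimal sketch rather than this repository's actual file. The project name, description, Python range, and the `scrapy` and `loguru` version ranges are placeholder assumptions. Only the `playwright` pin is taken from the description above.

```toml
# Hypothetical sketch: every value is a placeholder except the playwright pin.
[project]
name = "example-crawler"              # assumed name, not the real project name
version = "0.1.0"                     # assumed version
description = "Example Scrapy-based crawler project"
requires-python = ">=3.12,<4.0"       # assumed Python range
dependencies = [
    "scrapy (>=2.12,<3.0)",           # assumed version range
    "loguru (>=0.7,<1.0)",            # assumed version range
    "playwright (==1.49.1)",          # version named in this PR description
]

[tool.poetry]
# Assumption: the project is run as an application and not published as a package.
package-mode = false

[build-system]
requires = ["poetry-core>=2.0.0,<3.0.0"]
build-backend = "poetry.core.masonry.api"
```

With Poetry v2.0, project metadata and dependencies can be declared in the standard `[project]` table, while Poetry-specific settings stay under `[tool.poetry]`. After updating the local Poetry installation, running `poetry check` is one way to validate the file.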