Implement respectRobotsTxtFile crawler option #1144
Labels: product roadmap (issues synchronized to product roadmap), t-tooling (issues owned by the tooling team)
This option automatically fetches the robots.txt file based on the current request and adheres to the disallow directives.

The JS version was implemented via the following PRs:
- respectRobotsTxtFile crawler option crawlee#2910
- onSkippedRequest option crawlee#2916
- RobotsFile to RobotsTxtFile crawlee#2913

We will first need to implement the RobotsTxtFile and Sitemap classes.
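
To make the intended behavior concrete, here is a minimal sketch of how an option like respectRobotsTxtFile could work: derive the robots.txt URL from the current request, parse the rules, and skip disallowed URLs while reporting them to a callback. It uses only Python's standard urllib.robotparser. The helper names (robots_txt_url, parse_robots_txt, filter_allowed) and the on_skipped parameter are hypothetical illustrations, not the Crawlee API or the planned RobotsTxtFile class. The rules are parsed from text that is assumed to be already downloaded, so the sketch makes no network calls.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_txt_url(request_url: str) -> str:
    """Return the robots.txt URL for the origin of request_url (hypothetical helper)."""
    parts = urlsplit(request_url)
    return urlunsplit((parts.scheme, parts.netloc, '/robots.txt', '', ''))


def parse_robots_txt(content: str) -> RobotFileParser:
    """Parse robots.txt text that was already fetched (no network access here)."""
    parser = RobotFileParser()
    parser.parse(content.splitlines())
    return parser


def filter_allowed(urls, parser, user_agent='*', on_skipped=None):
    """Keep URLs the rules allow; pass each disallowed URL to on_skipped, if given."""
    allowed = []
    for url in urls:
        if parser.can_fetch(user_agent, url):
            allowed.append(url)
        elif on_skipped is not None:
            # Assumed to play a role similar to an onSkippedRequest callback.
            on_skipped(url)
    return allowed


if __name__ == '__main__':
    rules = parse_robots_txt('User-agent: *\nDisallow: /private/\n')
    skipped = []
    kept = filter_allowed(
        ['https://example.com/', 'https://example.com/private/page'],
        rules,
        on_skipped=skipped.append,
    )
    print(robots_txt_url('https://example.com/a/b?x=1'))
    print(kept, skipped)
```

A real implementation would also need to fetch the file over HTTP, cache one parsed file per origin, and decide what to do when robots.txt is missing or unreachable; none of that is shown here.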