robots.allowed returns false for other sites (and domains) #110
Comments
I can certainly understand the argument for not wanting it to return False. I have mixed feelings about what the behavior should be. On the one hand, whenever we've used this, we generally are using it through the …
What's the workaround for this? Many websites have robots.txt rules only for the 2nd-level domain, which means that links on "www.domain.com" are also reported as forbidden by those rules even though they're not. For example:
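A rough sketch of that scenario, assuming the Robots.fetch / allowed API shown in reppy's README (the domain and user agent here are placeholders):

```python
from reppy.robots import Robots

# robots.txt lives on the bare 2nd-level domain...
robots = Robots.fetch('http://domain.com/robots.txt')

# ...but the links being checked are on the www subdomain.
# Because the host differs from the one the robots.txt was
# fetched for, allowed() comes back False even though no rule
# actually disallows the path.
print(robots.allowed('http://www.domain.com/some/page', 'my-user-agent'))
```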
I'm thinking of removing …
Hi,
Let's take a look at the following example from Google:
For instance, when asked if any page on http://other.example.com/ is allowed, reppy returns False. It should either return True or potentially throw an exception, but definitely not False. Returning False is incorrect because robots.txt is not a whitelist.
Here is an example:
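A minimal reproduction sketch of the reported behaviour, assuming the Robots.parse / allowed API from reppy's README (the robots.txt content and user agent are illustrative):

```python
from reppy.robots import Robots

# Parse a robots.txt scoped to example.com that disallows nothing.
robots = Robots.parse('http://example.com/robots.txt', '''
User-agent: *
Disallow:
''')

# Ask about a page on a different host. robots.txt is not a
# whitelist, so this should not be treated as disallowed, yet
# the reported behaviour is that it returns False.
print(robots.allowed('http://other.example.com/', 'my-user-agent'))
```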