We set HTTPERROR_ALLOW_ALL = True. If we had left it as False (the default), HttpErrorMiddleware would have raised an HttpError exception, which subclasses IgnoreRequest, a special exception class that Scrapy ignores. That middleware also implements process_spider_exception to handle that exception, and to log and count the HTTP errors.
Assuming we can write a new spider middleware to handle the HttpError exception first, we can have it return FileError items instead. That way, we can remove all the @handle_http_error decorators.
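A minimal sketch of what such a middleware might look like. The class name HttpErrorToFileErrorMiddleware, the FileError stand-in and its url/errors fields are assumptions for illustration, not the project's actual code. The guarded import only exists so the sketch runs without Scrapy installed. How the middleware is registered and ordered relative to HttpErrorMiddleware is not shown here and would need to be checked against the Scrapy documentation.

```python
from dataclasses import dataclass

try:
    from scrapy.spidermiddlewares.httperror import HttpError
except ImportError:
    # Stand-in so the sketch runs without Scrapy installed.
    class HttpError(Exception):
        def __init__(self, response, *args):
            self.response = response
            super().__init__(*args)


@dataclass
class FileError:
    """Hypothetical stand-in for the project's FileError item."""
    url: str
    errors: dict


class HttpErrorToFileErrorMiddleware:
    """Hypothetical spider middleware: turn HttpError into a FileError item."""

    def process_spider_exception(self, response, exception, spider):
        if isinstance(exception, HttpError):
            # Returning an iterable tells Scrapy the exception was handled.
            return [FileError(url=response.url, errors={"http_code": response.status})]
        # Returning None lets other middlewares handle the exception.
        return None
```

With something like this in place, callbacks would only ever receive successful responses, which is why the decorators could be dropped.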
Some spiders handle HTTP errors in special ways. For those spiders, the handle_httpstatus_list spider attribute can be set, as documented for HttpErrorMiddleware. These include spiders that use:
is_http_success
response.status
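As a rough illustration of the handle_httpstatus_list approach, the sketch below lets 404 responses reach the callback, which then inspects response.status itself. The spider name, the choice of 404 and the yielded dictionaries are assumptions made up for the example. The guarded import only exists so the sketch runs without Scrapy installed.

```python
try:
    from scrapy import Spider
except ImportError:
    # Stand-in so the sketch runs without Scrapy installed.
    Spider = object


class ExampleSpider(Spider):
    """Hypothetical spider that handles one HTTP error status itself."""

    name = "example"
    # HttpErrorMiddleware passes responses with these statuses to callbacks,
    # in addition to successful (2xx) responses.
    handle_httpstatus_list = [404]

    def parse(self, response):
        if response.status == 404:
            # Special handling: treat a missing page as a known condition.
            yield {"url": response.url, "missing": True}
        else:
            yield {"url": response.url, "missing": False}
```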
Since we want to use handle_http_error on some request callbacks but not on all of them, I think it's simplest to leave it as a decorator. For example, the Paraguay spiders use handle_http_error for data requests, but handle errors manually for access token requests.
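To make the per-callback choice concrete, here is a simplified sketch of a decorator in this style. It is not the project's actual handle_http_error implementation: the body of the decorator, the spider class, its callback names and the yielded dictionaries are all assumptions for illustration.

```python
from functools import wraps


def handle_http_error(callback):
    """Hypothetical sketch: yield an error record instead of calling the callback."""

    @wraps(callback)
    def wrapper(self, response, **kwargs):
        if 200 <= response.status < 300:
            yield from callback(self, response, **kwargs)
        else:
            # Stand-in for building a FileError item from the response.
            yield {"url": response.url, "errors": {"http_code": response.status}}

    return wrapper


class ParaguayLikeSpider:
    """Hypothetical spider mixing decorated and manually handled callbacks."""

    @handle_http_error
    def parse_data(self, response):
        # Only reached for successful responses.
        yield {"url": response.url, "data": response.text}

    def parse_access_token(self, response):
        # No decorator: this callback decides what to do with an error itself.
        if response.status != 200:
            yield {"retry_token": True}
        else:
            yield {"token": response.text}
```

Keeping the decorator opt-in means the access token callback stays free to retry or fail in its own way.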