Yielding requests with callbacks #12

Open
pawelmhm opened this issue Dec 28, 2017 · 0 comments


Since version 3.0 there are restrictions when a Request yielded from the generator has a callback or errback set. Why is that? What was the reason for this change?

I have some spiders like this:

# -*- coding: utf-8 -*-
import json
import scrapy
from inline_requests import inline_requests


class ToScrapeCSSSpider(scrapy.Spider):
    name = "toscrape-css"
    start_urls = [
        'http://quotes.toscrape.com/',
    ]

    @inline_requests
    def parse(self, response):
        # Request without a callback: the response is sent back into this generator.
        some_data = yield scrapy.Request('http://httpbin.org/headers')
        print(json.loads(some_data.body))
        next_page_url = response.css("li.next > a::attr(href)").extract_first()
        if next_page_url is not None:
            # Request with an explicit callback: this is the one that triggers the warning.
            yield scrapy.Request(response.urljoin(next_page_url), callback=self.parse_page)

    def parse_page(self, response):
        print(response.url)
        print("hello")

This still works fine, but prints warnings:

2017-12-28 12:33:04 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://httpbin.org/headers> (referer: http://quotes.toscrape.com/)
{u'headers': {u'Accept-Language': u'en', u'Accept-Encoding': u'gzip,deflate,br', u'Host': u'httpbin.org', u'Accept': u'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', u'User-Agent': u'Scrapy/1.4.0 (+http://scrapy.org)', u'Connection': u'close', u'Referer': u'http://quotes.toscrape.com/'}}
2017-12-28 12:33:05 [py.warnings] WARNING: /home/pawel/.virtualenvs/scrapy/local/lib/python2.7/site-packages/inline_requests/generator.py:59: UserWarning: Got a request with callback set, bypassing the generator wrapper. Generator may not be able to resume. <GET http://quotes.toscrape.com/page/2/>
  "be able to resume. %s" % ret)

2017-12-28 12:33:05 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://quotes.toscrape.com/page/2/> (referer: http://httpbin.org/headers)
http://quotes.toscrape.com/page/2/
hello

What can happen if the generator is not able to resume? Is there a way to preserve the pre-3.0 behavior and suppress the warnings?
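
For the warning part only, a possible stopgap is sketched below. It assumes the message goes through Python's standard `warnings` module, which the `py.warnings` logger and the `UserWarning` category in the log above suggest. It only hides the warning and does not change how the generator wrapper treats the request, so it is not a way to restore the old behavior. The names `CALLBACK_WARNING_PREFIX`, `silence_callback_warning` and `count_emitted` are made up for this illustration and are not part of inline_requests.

```python
import warnings

# Start of the warning text shown in the log above. filterwarnings()
# treats this as a regex matched against the beginning of the message.
CALLBACK_WARNING_PREFIX = r"Got a request with callback set"


def silence_callback_warning():
    """Ignore only the 'callback set' UserWarning; other warnings still show."""
    warnings.filterwarnings(
        "ignore",
        message=CALLBACK_WARNING_PREFIX,
        category=UserWarning,
    )


def count_emitted(messages):
    """Hypothetical demo helper: emit each message as a UserWarning and
    return how many of them got past the filter."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # start from "show everything"
        silence_callback_warning()       # then hide just the one message
        for text in messages:
            warnings.warn(text, UserWarning)
        return len(caught)
```

In a real project, `silence_callback_warning()` would presumably be called once before the crawl starts, for example at the top of the spider module.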

@rmax
