Error when crawling Weibo on the dev branch #39

Open
MythHack opened this issue Dec 11, 2014 · 2 comments

@MythHack

When crawling Weibo, line 176 of parsers.py raises an error and I don't know why:
mblog.created = parse(div.select('a.S_link2.WB_time')[0]['title'])

Here is the error output:
D:\cola\contrib\weibo>init.py
D:\cola\cola\core\opener.py:108: UserWarning: gzip transfer encoding is experimental!
self.browser.set_handle_gzip(True)
start to process priority: 0
process bundle from priority 0
get 3211200050 url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=1&uid=3211200050&end_id=3786010796038435&_t=0&_k=1418233717575000&__rnd=1418233835289&pagebar=0&max_id=3751405376185938&page=1
D:\cola\cola\core\opener.py:108: UserWarning: gzip transfer encoding is experimental!
self.browser.set_handle_gzip(True)
start to process priority: 0
process bundle from priority 0
get 1898353550 url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=1&uid=1898353550&end_id=3786306393521083&_t=0&_k=1418233717932000&__rnd=1418233844764&pagebar=0&max_id=3778673397938545&page=1
Error when handle bundle: 3211200050, url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=1&uid=3211200050&end_id=3786010796038435&_t=0&_k=1418233717575000&__rnd=1418233835289&pagebar=0&max_id=3751405376185938&page=1
list index out of range
Traceback (most recent call last):
File "D:\cola\cola\job\executor.py", line 519, in _parse_with_process_exception
res = self._parse(parser_cls, options, bundle, url)
File "D:\cola\cola\job\executor.py", line 442, in _parse
**options).parse()
File "D:\cola\contrib\weibo\parsers.py", line 177, in parse
mblog.created = parse(div.select('a.S_link2.WB_time')[0]['title'])
IndexError: list index out of range
Error when handle bundle: 1898353550, url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=1&uid=1898353550&end_id=3786306393521083&_t=0&_k=1418233717932000&__rnd=1418233844764&pagebar=0&max_id=3778673397938545&page=1
list index out of range
Traceback (most recent call last):
File "D:\cola\cola\job\executor.py", line 519, in _parse_with_process_exception
res = self._parse(parser_cls, options, bundle, url)
File "D:\cola\cola\job\executor.py", line 442, in _parse
**options).parse()
File "D:\cola\contrib\weibo\parsers.py", line 177, in parse
mblog.created = parse(div.select('a.S_link2.WB_time')[0]['title'])
IndexError: list index out of range
get 3211200050 url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=1&uid=3211200050&end_id=3786010796038435&_t=0&_k=1418233717575000&__rnd=1418233835289&pagebar=0&max_id=3751405376185938&page=1
get 1898353550 url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=1&uid=1898353550&end_id=3786306393521083&_t=0&_k=1418233717932000&__rnd=1418233844764&pagebar=0&max_id=3778673397938545&page=1
Error when handle bundle: 3211200050, url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=1&uid=3211200050&end_id=3786010796038435&_t=0&_k=1418233717575000&__rnd=1418233835289&pagebar=0&max_id=3751405376185938&page=1
list index out of range
Traceback (most recent call last):
File "D:\cola\cola\job\executor.py", line 519, in _parse_with_process_exception
res = self._parse(parser_cls, options, bundle, url)
File "D:\cola\cola\job\executor.py", line 442, in _parse
**options).parse()
File "D:\cola\contrib\weibo\parsers.py", line 177, in parse
mblog.created = parse(div.select('a.S_link2.WB_time')[0]['title'])
IndexError: list index out of range
Error when handle bundle: 1898353550, url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=1&uid=1898353550&end_id=3786306393521083&_t=0&_k=1418233717932000&__rnd=1418233844764&pagebar=0&max_id=3778673397938545&page=1
list index out of range
Traceback (most recent call last):
File "D:\cola\cola\job\executor.py", line 519, in _parse_with_process_exception
res = self._parse(parser_cls, options, bundle, url)
File "D:\cola\cola\job\executor.py", line 442, in _parse
**options).parse()
File "D:\cola\contrib\weibo\parsers.py", line 177, in parse
mblog.created = parse(div.select('a.S_link2.WB_time')[0]['title'])
IndexError: list index out of range
get 3211200050 url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=1&uid=3211200050&end_id=3786010796038435&_t=0&_k=1418233717575000&__rnd=1418233835289&pagebar=0&max_id=3751405376185938&page=1
get 1898353550 url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=1&uid=1898353550&end_id=3786306393521083&_t=0&_k=1418233717932000&__rnd=1418233844764&pagebar=0&max_id=3778673397938545&page=1
Error when handle bundle: 3211200050, url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=1&uid=3211200050&end_id=3786010796038435&_t=0&_k=1418233717575000&__rnd=1418233835289&pagebar=0&max_id=3751405376185938&page=1
list index out of range
Traceback (most recent call last):
File "D:\cola\cola\job\executor.py", line 519, in _parse_with_process_exception
res = self._parse(parser_cls, options, bundle, url)
File "D:\cola\cola\job\executor.py", line 442, in _parse
**options).parse()
File "D:\cola\contrib\weibo\parsers.py", line 177, in parse
mblog.created = parse(div.select('a.S_link2.WB_time')[0]['title'])
IndexError: list index out of range
get 1898353550 url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=1&uid=1898353550&end_id=3786306393521083&_t=0&_k=1418233717932000&__rnd=1418234623885&pagebar=1&max_id=3734740953372321&page=1
get 1898353550 url: http://weibo.com/aj/mblog/mbloglist?count=50&pre_page=1&uid=1898353550&end_id=3786306393521083&_t=0&_k=1418233717932000&__rnd=1418234624721&page=2
get 1898353550 url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=2&uid=1898353550&end_id=3786306393521083&_t=0&_k=1418233717932000&__rnd=1418234625316&pagebar=0&max_id=3656888933158899&page=2
Error when handle bundle: 1898353550, url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=2&uid=1898353550&end_id=3786306393521083&_t=0&_k=1418233717932000&__rnd=1418234625316&pagebar=0&max_id=3656888933158899&page=2
list index out of range
Traceback (most recent call last):
File "D:\cola\cola\job\executor.py", line 519, in _parse_with_process_exception
res = self._parse(parser_cls, options, bundle, url)
File "D:\cola\cola\job\executor.py", line 442, in _parse
**options).parse()
File "D:\cola\contrib\weibo\parsers.py", line 177, in parse
mblog.created = parse(div.select('a.S_link2.WB_time')[0]['title'])
IndexError: list index out of range
get 3211200050 url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=1&uid=3211200050&end_id=3786010796038435&_t=0&_k=1418233717575000&__rnd=1418233835289&pagebar=0&max_id=3751405376185938&page=1
Error when handle bundle: 3211200050, url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=1&uid=3211200050&end_id=3786010796038435&_t=0&_k=1418233717575000&__rnd=1418233835289&pagebar=0&max_id=3751405376185938&page=1
list index out of range
Traceback (most recent call last):
File "D:\cola\cola\job\executor.py", line 519, in _parse_with_process_exception
res = self._parse(parser_cls, options, bundle, url)
File "D:\cola\cola\job\executor.py", line 442, in _parse
**options).parse()
File "D:\cola\contrib\weibo\parsers.py", line 177, in parse
mblog.created = parse(div.select('a.S_link2.WB_time')[0]['title'])
IndexError: list index out of range
get 1898353550 url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=2&uid=1898353550&end_id=3786306393521083&_t=0&_k=1418233717932000&__rnd=1418234625316&pagebar=0&max_id=3656888933158899&page=2
get 3211200050 url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=1&uid=3211200050&end_id=3786010796038435&_t=0&_k=1418233717575000&__rnd=1418233835289&pagebar=0&max_id=3751405376185938&page=1
Error when handle bundle: 3211200050, url: http://weibo.com/aj/mblog/mbloglist?count=15&pre_page=1&uid=3211200050&end_id=3786010796038435&_t=0&_k=1418233717575000&__rnd=1418233835289&pagebar=0&max_id=3751405376185938&page=1
list index out of range
Traceback (most recent call last):
File "D:\cola\cola\job\executor.py", line 519, in _parse_with_process_exception
res = self._parse(parser_cls, options, bundle, url)
File "D:\cola\cola\job\executor.py", line 442, in _parse
**options).parse()
File "D:\cola\contrib\weibo\parsers.py", line 177, in parse
mblog.created = parse(div.select('a.S_link2.WB_time')[0]['title'])
IndexError: list index out of range

@qinxuye qinxuye added the bug label Dec 11, 2014
@qinxuye qinxuye self-assigned this Dec 11, 2014
@qinxuye
Owner

qinxuye commented Dec 11, 2014

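This is a long-standing legacy issue, and it cannot be reproduced with some accounts. If I can't reproduce the problem, I'll need you to provide the relevant raw files or similar material.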

@MythHack
Author

Sure, no problem :)
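
For anyone hitting the same traceback: the IndexError means that div.select('a.S_link2.WB_time') returned an empty list for that post, so indexing [0] fails. Why the element is missing is not established in this thread. Below is a minimal sketch of a defensive lookup, not the project's actual fix. The helper name extract_created_title, the logger name, the FakeDiv stand-in and the sample title string are all hypothetical; only the selector comes from the traceback.

```python
import logging

# Hypothetical logger name, used only in this sketch.
logger = logging.getLogger("weibo.parsers.sketch")

# CSS selector copied from the failing line in the traceback.
TIME_SELECTOR = 'a.S_link2.WB_time'


def extract_created_title(div, url=None):
    """Return the `title` attribute of the timestamp link, or None if absent.

    `div` is any object exposing a BeautifulSoup-style `select(css)` method.
    """
    matches = div.select(TIME_SELECTOR)
    if not matches:
        # This is the case that raises IndexError in the original line.
        logger.warning("no %s element found (url=%s)", TIME_SELECTOR, url)
        return None
    # .get() returns None instead of raising if `title` is missing.
    return matches[0].get('title')


class FakeDiv(object):
    """Stand-in for a parsed <div>, only to make the sketch runnable."""

    def __init__(self, matches):
        self._matches = matches

    def select(self, css):
        return self._matches


if __name__ == "__main__":
    # Empty match list: the helper returns None instead of raising.
    print(extract_created_title(FakeDiv([])))
    # One match with a made-up title value.
    print(extract_created_title(FakeDiv([{'title': '2014-12-11 10:00'}])))
```

A caller would skip the post or leave mblog.created unset when the helper returns None, and only pass the string to parse(...) otherwise. Logging the URL also makes it easier to save the raw response that the maintainer asks for above.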
