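MongoDB only created the Fans and Follows collections, and crawling keeps showing 302 with no data scraped. Login reports success and the cookie is obtained. Could anyone help? Many thanks! #65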

Open
pythonmanGo opened this issue Nov 7, 2017 · 3 comments

Comments

@pythonmanGo

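MongoDB only created the Fans and Follows collections, and every crawl request shows a 302 with no data scraped. Login reports success and the cookie is obtained successfully. Could anyone help? Many thanks!

Login output: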
2017-11-07 10:45:58 [Sina_spider1.cookies] WARNING: Get Cookie Success!( Account:我是马赛克 )
2017-11-07 10:45:58 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): login.sina.com.cn
2017-11-07 10:45:58 [urllib3.connectionpool] DEBUG: https://login.sina.com.cn:443 "POST /sso/login.php?client=ssologin.js(v1.4.18) HTTP/1.1" 200 None
2017-11-07 10:45:58 [Sina_spider1.cookies] WARNING: Get Cookie Success!( Account:我是马赛克 )

Output while crawling:
2017-11-07 10:46:00 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://weibo.cn/5235640836/follow> from <GET http://weibo.cn/5235640836/follow>
2017-11-07 10:46:40 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://weibo.cn/5235640836/fans> from <GET http://weibo.cn/5235640836/fans>
2017-11-07 10:46:59 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)

@zhanghanbin3159

You just need to change every http in the spider to https.
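As a rough illustration of that change, here is a minimal sketch that upgrades the two URLs from the log to https before they are requested. The helper name `force_https` and the `start_urls` list are hypothetical and not taken from the project, and the sketch assumes Python 3 (`urllib.parse`), whereas the setup in this thread appears to use Python 2.7.

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(url):
    """Return url with an http scheme upgraded to https.

    URLs that already use https (or any other scheme) are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


# The user ID below is the one that appears in the crawl log above.
uid = "5235640836"

# Hypothetical start URLs; the real spider may build its requests differently.
start_urls = [
    force_https("http://weibo.cn/%s/follow" % uid),
    force_https("http://weibo.cn/%s/fans" % uid),
]
```

Requesting the https URLs directly avoids the http-to-https 302 hop shown in the log; whether that alone makes pages come back may still depend on the cookie being accepted.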

@pythonmanGo
Author

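1. Changed every http in the spider to https;
2. Modified the getcookie function: browser = webdriver.PhantomJS(executable_path=r"D:\java\Python27\我是路径马赛克\phantomjs.exe")
3. Cookie retrieval works: Get Cookies Finish!( Num:1)
4. Environment: Windows 7 64-bit, 4 GB RAM

However, after the cookie is obtained, a system error dialog pops up: python.exe has stopped working
The error details are as follows: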

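Problem signature:
Problem Event Name: BEX
Application Name: python.exe
Application Version: 0.0.0.0
Application Timestamp: 4c303241
Fault Module Name: MSVCR90.dll
Fault Module Version: 9.0.30729.6161
Fault Module Timestamp: 4dace5b9
Exception Offset: 00066d03
Exception Code: c0000417
Exception Data: 00000000
OS Version: 6.1.7601.2.1.0.256.1
Locale ID: 2052
Additional Information 1: abf7
Additional Information 2: abf7f34af3b04ddccc0d33fe401c1c02
Additional Information 3: 79a5
Additional Information 4: 79a5afb460eb4649151b9562e857bf2f

The program itself did not report any error. Any advice on how to handle this would be appreciated.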

@zhanghanbin3159

I'm not using browser = webdriver.PhantomJS; I use the Firefox driver instead.
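For anyone wanting to try the same swap, a minimal sketch of fetching cookies through Selenium's Firefox driver might look like the following. The function names `cookies_to_dict` and `get_cookies_with_firefox`, the `login_url` parameter, and the fixed wait are all hypothetical; the sketch assumes the `selenium` package is installed and geckodriver is on the PATH, and it leaves out the actual login steps (filling in the account and password), which depend on the project's getcookie code.

```python
import time


def cookies_to_dict(cookie_list):
    """Flatten Selenium's list of cookie dicts into a simple name -> value mapping."""
    return {cookie["name"]: cookie["value"] for cookie in cookie_list}


def get_cookies_with_firefox(login_url, wait_seconds=5):
    """Open login_url in Firefox and return the browser's cookies as a dict.

    Assumption: this only loads the page; the login form handling from the
    project's own getcookie function would go where the comment indicates.
    """
    # Imported here so the helper above stays usable without Selenium installed.
    from selenium import webdriver

    browser = webdriver.Firefox()  # replaces webdriver.PhantomJS(...)
    try:
        browser.get(login_url)
        # ... fill in the login form and submit it here ...
        time.sleep(wait_seconds)  # crude wait for the page to settle
        return cookies_to_dict(browser.get_cookies())
    finally:
        browser.quit()  # always close the browser, even on errors
```

Whether switching drivers avoids the MSVCR90.dll crash reported above is not confirmed in this thread; it is only what worked for the commenter.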


2 participants