Scraping fails due to metadata changes #29
Comments
Thanks for the report (I haven't used infoqscraper for a while). I'm kind of busy these days, but I will fix that.
No problem, I have a 'fix' (just removed the unused metadata fields) on my fork (https://github.com/andreweacott/infoqscraper/tree/bugfix/resolve_scraper_failure), but I've not been able to get the tests to complete, so I didn't want to raise a PR. The fixed app works for me though.
Even with the fixed fork by @andreweacott I keep getting the following error:
This happens with both older videos (like the one above) and new ones (e.g. work-purpose). I'm using Gentoo Linux with RTMPDump 2.4 (version dated 2016/12/10) and Python 2.7.15 / 3.6.5 (not sure which one this program runs on). The outcome seems a little weird, as rtmpdump's source only ever seems to exit with 0, 1, 2 or 3 (i.e. one of the RD_* constants). The command infoqscraper runs is rtmpdump -q -e -r rtmpe://video.infoq.com/cfx/st/ -y mp4:presentations/qcon08-howbigismybus.mp4 -o temp_video.avi, which I can only get to return 1: it fails to get the last keyframe and closes the connection. If I omit the -e flag, it connects and handshakes but then invariably segfaults with a resulting exit code of 139.

Sorry for just dumping this here, but there's no issue tracker for the fork and this is my first time using rtmpdump directly. Do you have any idea whether my problem is in infoqscraper, in my version of rtmpdump itself, or perhaps some misunderstanding on my part?
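For what it's worth, the exit statuses mentioned in the comment above can be decoded with a few lines of Python 3. This is only a rough sketch: the RD_* names are how I recall them from rtmpdump.c and should be checked against the actual source, and describe_exit is a made-up helper, not part of infoqscraper. A status above 128 is the shell's way of reporting death by signal (128 + signal number), which is why a segfault shows up as 139.

```python
import signal

# RD_* names as recalled from rtmpdump.c; treat this mapping as an assumption.
RD_CODES = {0: "RD_SUCCESS", 1: "RD_FAILED", 2: "RD_INCOMPLETE", 3: "RD_NO_CONNECT"}


def describe_exit(status):
    """Describe a shell-style exit status or a subprocess returncode."""
    if status < 0:
        # Python's subprocess reports death by signal N as -N.
        return "killed by " + signal.Signals(-status).name
    if status > 128:
        # Shells report death by signal N as 128 + N.
        return "killed by " + signal.Signals(status - 128).name
    return RD_CODES.get(status, "unknown exit code %d" % status)


print(describe_exit(139))
print(describe_exit(1))
```

So 1 would be an ordinary failure reported by rtmpdump itself, while 139 means the process was killed by a signal and never got to return one of its own codes.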
Found in version 0.1.5
As of March 2019, scraping presentations no longer works due to format changes in the presentation HTML page.
In fact, the fields that scrap.py is looking for are metadata and are not used by the main application. Removing them allows presentations to be grabbed correctly.
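An alternative to deleting the fields outright would be to treat them as optional, so a page layout change only costs the metadata and not the whole download. The snippet below is a minimal sketch of that idea and not the actual scrap.py code: collect_fields, the field names, and the dict standing in for a parsed page are all hypothetical.

```python
def collect_fields(page, required, optional):
    """Run each extractor on `page`; only failures in `required` are fatal."""
    result = {}
    for name, extract in required.items():
        # Errors here propagate: without these fields nothing can be downloaded.
        result[name] = extract(page)
    for name, extract in optional.items():
        try:
            result[name] = extract(page)
        except (KeyError, AttributeError, IndexError, ValueError):
            # Metadata the page no longer provides is recorded as missing.
            result[name] = None
    return result


# A plain dict stands in for the parsed presentation page (hypothetical data).
page = {"video_path": "example.mp4"}
info = collect_fields(
    page,
    required={"video_path": lambda p: p["video_path"]},
    optional={"bio": lambda p: p["bio"]},
)
print(info)
```

With this split, a missing "bio" ends up as None while a missing "video_path" still raises, which matches the observation that only the unused metadata fields were breaking the scrape.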