This repository has been archived by the owner on Feb 23, 2023. It is now read-only.

Praw rescrape of entire dataset

@elleobrien elleobrien released this 20 Feb 22:53
· 3 commits to master since this release

In response to the discovery that pushshift.io returned unrepresentative scores for posts created during several months in 2018-19, the entire dataset has been rescraped using praw to get the scores. The rescrape turned up ~30K new data points with scores >= 3!
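
A rescrape along these lines could be sketched as follows. This is only an illustration and not the script used for this release: the function name `refresh_scores` and its parameters are hypothetical, and `reddit` is assumed to be an authenticated `praw.Reddit` instance (or any object exposing the same `info(fullnames=...)` method).

```python
from typing import Dict, Iterable, List


def refresh_scores(reddit, post_ids: Iterable[str], min_score: int = 3) -> Dict[str, int]:
    """Look up current scores for submission IDs and keep those >= min_score.

    Hypothetical helper: `reddit` is assumed to be a praw.Reddit instance,
    and `post_ids` the base-36 submission IDs already in the dataset.
    """
    # Reddit "fullnames" for submissions are the ID prefixed with "t3_".
    fullnames: List[str] = [f"t3_{pid}" for pid in post_ids]

    kept: Dict[str, int] = {}
    # praw's Reddit.info() yields the matching submission objects.
    for submission in reddit.info(fullnames=fullnames):
        if submission.score >= min_score:
            kept[submission.id] = submission.score
    return kept
```

The threshold of 3 mirrors the score cutoff mentioned above; IDs that Reddit no longer returns (for example, deleted posts) simply do not appear in the result.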

For more, see issue #1.