Feature Request: add BDfR as a new extractor for archiving Reddit content #778
Labels: good first ticket; help wanted; size: medium; status: idea-phase (work is tentatively approved and is being planned / laid out, but is not ready to be implemented yet); touches: configuration; touches: dependencies/packaging (issues or changes that add/remove/affect dependencies); touches: docs
Discussed in #754
Originally posted by BlipRanger May 24, 2021
Just wanted to make a quick mention of BDfR as a cool project that might make for a good starting point for the unrolling of Reddit comments/posts as mentioned in the roadmap. They currently support grabbing a variety of media types from the post, as well as the comments/text in a separate JSON file. I've been working on an add-on for it lately and I think it's a pretty great project with well-maintained code. If nothing else, they have really good examples of working with Reddit data, which could be useful! Just wanted to bring that to your attention!
I'd love to add BDfR as an extractor for Reddit content (and something similar for Twitter too, see #345), but I'm somewhat swamped with work and travel for the near future.
If you @BlipRanger or anyone else wants to add it as an extractor (matching the style of our other extractors, e.g. archivebox/extractors/media.py is a great example to copy), I'd be happy to review PRs! We have some good instructions for contributing a new extractor and getting started with ArchiveBox development in general.
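For anyone picking this up, here is a rough sketch of the shape such an extractor might take. It is not the project's actual extractor interface: the names `should_save_reddit`, `build_bdfr_cmd`, and `save_reddit` are hypothetical, the URL pattern is an assumption about which links count as Reddit content, and the `archive` subcommand and `--link` option should be checked against the BDfR version you have installed. A real PR should copy the signatures, config handling, and result type used in archivebox/extractors/media.py.

```python
import re
import subprocess
import sys
from pathlib import Path
from typing import List

# Assumption: treat reddit.com (any subdomain) and redd.it short links as Reddit content.
REDDIT_URL = re.compile(
    r"^https?://([a-z0-9-]+\.)*(reddit\.com|redd\.it)(/|$)",
    re.IGNORECASE,
)


def should_save_reddit(url: str) -> bool:
    """Hypothetical gate: only run the extractor on Reddit URLs."""
    return bool(REDDIT_URL.match(url))


def build_bdfr_cmd(url: str, out_dir: Path) -> List[str]:
    """Build a BDfR command line for a single post.

    Assumes BDfR is installed in the current Python environment and that
    `archive` with `--link` is available in that version.
    """
    return [sys.executable, "-m", "bdfr", "archive", str(out_dir), "--link", url]


def save_reddit(url: str, out_dir: Path, timeout: int = 120) -> bool:
    """Run BDfR for one URL and report whether the process exited cleanly."""
    if not should_save_reddit(url):
        return False
    out_dir.mkdir(parents=True, exist_ok=True)
    try:
        result = subprocess.run(
            build_bdfr_cmd(url, out_dir),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        # Treat a timeout or a missing interpreter/binary as a failed extraction.
        return False
    return result.returncode == 0
```

Keeping the URL check and the command construction separate from the subprocess call means both can be unit tested without BDfR installed or network access.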