Reddit Scraper

beautifulSoup_basicScraper

Experiment with BeautifulSoup to write a parser that will take headlines from the front page of Reuters and then create a word cloud from them.
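As a rough illustration of the parsing step, the sketch below pulls headline text out of an HTML string with BeautifulSoup. The sample markup, the `h3` tag, the `story-title` class, and the `extract_headlines` helper are all assumptions made for the example; the real Reuters front page uses its own structure, and the actual script may work differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real Reuters front page uses different tags and classes.
SAMPLE_HTML = """
<html><body>
  <h3 class="story-title">  Markets rally as
      rates hold </h3>
  <h3 class="story-title">Storm disrupts travel</h3>
  <h3 class="other">Not a headline</h3>
</body></html>
"""


def extract_headlines(html, tag="h3", css_class="story-title"):
    """Return the text of every matching element, with whitespace collapsed."""
    soup = BeautifulSoup(html, "html.parser")
    headlines = []
    for node in soup.find_all(tag, class_=css_class):
        # split() + join collapses runs of spaces and newlines left by page formatting
        text = " ".join(node.get_text().split())
        if text:
            headlines.append(text)
    return headlines


headlines = extract_headlines(SAMPLE_HTML)
print(headlines)
```

Collapsing whitespace at extraction time is one possible way to approach the excessive-whitespace issue; sections that do not appear at all would need their own selectors.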

TODO:

Fix handling of special sections (they currently do not appear, or have excessive whitespace, due to formatting issues)
Implementation of word cloud processor
Output of word cloud for potential follow-on processing
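One possible shape for the word cloud processor is a plain word-frequency count, serialized so that later steps can reuse it. This is a minimal sketch using only the standard library; the stop-word set and the function names `word_frequencies` and `frequencies_to_json` are illustrative assumptions, not part of the project.

```python
import json
import re
from collections import Counter

# Small illustrative stop-word set; a real run would likely use a fuller list.
STOP_WORDS = {"a", "an", "and", "as", "in", "of", "on", "the", "to"}


def word_frequencies(headlines):
    """Count lowercase words across all headlines, skipping stop words."""
    counts = Counter()
    for headline in headlines:
        # Letters and apostrophes only, so punctuation and digits are dropped
        for word in re.findall(r"[a-z']+", headline.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts


def frequencies_to_json(counts):
    """Serialize the counts, most frequent first, for follow-on processing."""
    return json.dumps(dict(counts.most_common()), indent=2)


freqs = word_frequencies(["Markets rally as rates hold", "Rates rise in the markets"])
print(frequencies_to_json(freqs))
```

A word-to-count mapping like this is also the kind of input that the third-party `wordcloud` package accepts through `WordCloud.generate_from_frequencies`, should that package be used for rendering.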

reddit_newsAnalyzer

Experiment using NLTK and the PRAW Reddit API to analyze the sentiment of the top 1000 posts from an arbitrary subreddit.
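The labeling step might be sketched as below. To keep the example runnable without NLTK data or Reddit credentials, it takes the scoring function as a parameter and uses a toy scorer; `label_titles`, `toy_score`, `TOY_SCORES`, and the 0.2 threshold are assumptions for illustration. In a real run, the titles could come from PRAW (for example, `submission.title` for each item of `reddit.subreddit(name).top(limit=1000)`) and the score could be the `compound` value returned by NLTK's `SentimentIntensityAnalyzer.polarity_scores`, though the README does not confirm which scorer the script uses.

```python
def label_titles(titles, score_fn, threshold=0.2):
    """Split titles into positive, negative, and neutral lists.

    score_fn maps a string to a float in [-1, 1]. The threshold is an
    assumed cut-off, not a value taken from the project.
    """
    labeled = {"positive": [], "negative": [], "neutral": []}
    for title in titles:
        score = score_fn(title)
        if score > threshold:
            labeled["positive"].append(title)
        elif score < -threshold:
            labeled["negative"].append(title)
        else:
            labeled["neutral"].append(title)
    return labeled


# Stand-in scorer so the sketch runs on its own; not a real sentiment model.
TOY_SCORES = {"great": 0.8, "terrible": -0.8}


def toy_score(text):
    """Average the toy word scores over all words in the text."""
    words = text.lower().split()
    return sum(TOY_SCORES.get(w, 0.0) for w in words) / max(len(words), 1)


labeled = label_titles(
    ["Great news today", "Terrible outage reported", "Weekly discussion thread"],
    toy_score,
)
print(labeled)
```

Passing the scorer in as an argument keeps the labeling logic independent of any one sentiment library.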

TODO:

Automatic report generation
Implement word cloud for top words from negative and positive set
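Automatic report generation could be approached by filling an HTML template with summary counts. The sketch below uses `string.Template` from the standard library; the inline template, its placeholder names, and the `render_report` helper are hypothetical, since the placeholders actually used in `reportTemplate.html` are not documented here.

```python
import html
from string import Template

# Hypothetical template; the placeholders in reportTemplate.html are not known.
REPORT_TEMPLATE = Template(
    "<html><body>\n"
    "<h1>Sentiment report for r/$subreddit</h1>\n"
    "<p>Positive: $positive, negative: $negative, neutral: $neutral</p>\n"
    "</body></html>\n"
)


def render_report(subreddit, labeled):
    """Fill the template with the number of posts in each sentiment group.

    labeled is expected to map "positive", "negative", and "neutral"
    to lists of post titles.
    """
    counts = {label: len(items) for label, items in labeled.items()}
    # Escape the subreddit name so it cannot inject markup into the report
    return REPORT_TEMPLATE.substitute(subreddit=html.escape(subreddit), **counts)


report = render_report(
    "news",
    {"positive": ["a", "b"], "negative": ["c"], "neutral": []},
)
print(report)
```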

reddit_bayes_textClassifier

Follow-on to the news analyzer that uses a naive Bayes algorithm to classify posts from a subreddit and then predict the sentiment of test data.
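To make the classification idea concrete, here is a from-scratch multinomial naive Bayes sketch with add-one smoothing, written with the standard library only. It is not the project's implementation, which may rely on a library classifier; the `NaiveBayes` class, the whitespace tokenizer, and the four training sentences are made up for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes text classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, samples):
        """samples is an iterable of (text, label) pairs."""
        for text, label in samples:
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocab.update(words)

    def predict(self, text):
        """Return the label with the highest log posterior for the text."""
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior: share of training documents carrying this label
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Add-one smoothing keeps unseen words from zeroing the product
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = NaiveBayes()
model.train([
    ("great win for the team", "pos"),
    ("love this great result", "pos"),
    ("terrible loss for the team", "neg"),
    ("hate this awful result", "neg"),
])
print(model.predict("great result"), model.predict("awful loss"))
```

Working in log space avoids numeric underflow when many word probabilities are multiplied together. With so few training samples the predictions are only illustrative.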

TODO:

Improve accuracy of model
Increase training data set
Implement GPU acceleration

Text classifier scripts made by following and modifying tutorials from https://www.learndatasci.com/tutorials

Repository files:

.gitignore
README.md
beautifulSoup_basicScraper.py
reddit_bayes_textClassifier.py
reddit_dataRetrieval.py
reddit_newsAnalyzer.py
reportTemplate.html

mashe0742/Reddit-Scraper
