The Anatomy of a Goodreads Book Rating: Book Feature to Rating Analysis

mbdelaresma/book-feature-to-rating-analysis

Data Mining and Wrangling Course Submission

See full report HERE

Executive Summary

Goodreads is a leading book review website trusted by millions of users worldwide. In this project, our group analyzed book features and their relationship with book ratings using various data mining, wrangling, and visualization techniques. Results show that the most likely predictors of a book's rating are its number of pages and its number of ratings. Insights on other feature interactions also emerged: e-books are more prevalent in the romance genre and scarce among children’s books and comics, and reading times are shorter for the e-book format. The data also supported the common notions that reading time is longer for books with more pages and that text reviews occur more often than non-text reviews.
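As a rough illustration of how a feature-to-rating relationship like the one above could be checked, the sketch below ranks numeric book features by the absolute value of their Pearson correlation with the rating. It is not the method used in the report. The column names (`num_pages`, `ratings_count`, `average_rating`) and the helper `rank_features` are assumptions made for this example, and the rows are made-up placeholders rather than Goodreads data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def rank_features(books, features, target="average_rating"):
    """Return (feature, correlation) pairs, strongest absolute correlation first.

    `books` is a list of dicts; the key names are hypothetical.
    """
    ratings = [book[target] for book in books]
    scores = {
        feature: pearson([book[feature] for book in books], ratings)
        for feature in features
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up placeholder rows, NOT taken from the Goodreads dataset.
toy_books = [
    {"num_pages": 120, "ratings_count": 50, "average_rating": 3.6},
    {"num_pages": 310, "ratings_count": 900, "average_rating": 3.9},
    {"num_pages": 450, "ratings_count": 300, "average_rating": 4.1},
    {"num_pages": 800, "ratings_count": 1500, "average_rating": 4.3},
]

for feature, r in rank_features(toy_books, ["num_pages", "ratings_count"]):
    print(f"{feature}: r = {r:.2f}")
```

Correlation only measures linear association, so a ranking like this is a first screening step; it does not by itself establish that a feature predicts ratings.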

Since the dataset contains a limited set of book features, future studies should also extract reviewer profiles, such as age, gender, and other demographic and psychometric information. For the currently available dataset, applying the methodology to a larger portion of the database would help validate this study and yield more accurate results. Machine learning algorithms could also be explored.

Contributors

dela Resma, Marvee

Ginez, Zhoya

Inocencio, Ken

Nepomuceno, Colleen

Piquero, Geran

Punzalan, Paolo
