sahanbull/PEEKC-Dataset

PEEKC: A Large Dataset of Learner Engagement with Educational Videos

Summary

In this work, we release a large, novel dataset of learners engaging with educational videos in the wild. The dataset, named Personalised Educational Engagement with Knowledge Components (PEEKC), is one of the first publicly available datasets that address personalised educational engagement. Educational recommenders have received far less attention than e-commerce and entertainment recommenders, even though efficient personalised learning systems could improve learning gains significantly. One of the main obstacles to advancing this research direction is the scarcity of large, publicly available datasets. In the PEEKC dataset, each educational video lecture is associated with Wikipedia concepts related to the lecture material, which provides a humanly intuitive taxonomy. We believe that granular learner engagement signals, combined with rich content representations, will pave the way to powerful personalisation algorithms that will revolutionise educational and informational recommendation systems. Towards this goal, we 1) construct a novel dataset from a popular video lecture repository, 2) identify a set of benchmark algorithms to model engagement, and 3) run extensive experiments on the PEEKC dataset to demonstrate its value. Our experiments with the dataset show promise for building powerful informational recommender systems.

Key Statistics

Events

  • Number of Events in the Training Data: 203,590
  • Number of Events in the Test Data: 86,945
  • Total Number of Events in the Dataset: 290,535

Users

  • Number of Learners in the Training Data: 14,050
  • Number of Learners in the Test Data: 5,969
  • Total Number of Learners in the Dataset: 20,019

Lecture Videos

  • Number of Unique Lecture Videos in the Training Data: 6,835
  • Number of Unique Lecture Videos in the Test Data: 4,409
  • Total Number of Unique Lecture Videos in the Dataset: 7,999
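The event and learner counts add up exactly across the two splits, while the lecture video counts do not (6,835 + 4,409 exceeds 7,999). This would follow if some videos appear in both splits. The sketch below checks that arithmetic using only the figures listed above. It assumes each "Total" counts distinct items across both splits; the dictionary layout and the `implied_overlap` helper are illustrative and not part of the dataset or its tooling.

```python
# Counts copied from the Key Statistics section of this README.
stats = {
    "events": {"train": 203_590, "test": 86_945, "total": 290_535},
    "learners": {"train": 14_050, "test": 5_969, "total": 20_019},
    "videos": {"train": 6_835, "test": 4_409, "total": 7_999},
}


def implied_overlap(train: int, test: int, total: int) -> int:
    """Items shared by both splits, by inclusion-exclusion.

    Assumes `total` is the number of distinct items in the union of
    the training and test splits (an assumption, not stated in the README).
    """
    return train + test - total


# Hypothetical helper output: overlap implied by the published counts.
overlaps = {name: implied_overlap(**counts) for name, counts in stats.items()}

for name, shared in overlaps.items():
    print(f"{name}: {shared:,} shared between training and test data")
```

Under that assumption, the published counts imply no events or learners shared between the splits and 3,245 lecture videos that appear in both.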

Contact

For more information: Sahan Bulathwela (m.bulathwela@ucl.ac.uk)

About

To make the peek dataset available
