A chat dataset containing 250 subsets of messages with topical labels.

mechanicpanic/Situation_Dataset

Situation Dataset: freeCodeCamp data

This is a chat dataset based on the freeCodeCamp dataset, containing 250 subsets of messages that I call "situations". A situation is a subset of messages that revolves around a single event, both temporally and thematically. There are six topic labels. The subsets vary in length and can have gaps relative to the original dataset.
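To make the idea of a situation with gaps concrete, here is a minimal Python sketch. It does not reflect the actual file format of this dataset: the `Situation` class, its field names, the label `"topic_a"`, and the index values are all hypothetical and chosen only for illustration. It assumes each message can be identified by its position in the original chat log.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Situation:
    """Hypothetical container: one topic label plus the positions
    of the situation's messages in the original chat log."""
    label: str
    message_indices: List[int]

    def gaps(self) -> List[int]:
        """Return original-log positions that fall between the first and
        last message of the situation but are not part of it."""
        if not self.message_indices:
            return []
        present = set(self.message_indices)
        first, last = min(present), max(present)
        return [i for i in range(first, last + 1) if i not in present]


# Illustrative values only, not taken from the dataset.
example = Situation(label="topic_a", message_indices=[10, 11, 13, 16])
print(len(example.message_indices))  # 4
print(example.gaps())                # [12, 14, 15]
```

Under this assumed representation, a situation without gaps is simply a contiguous run of messages, and `gaps()` returns an empty list.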

The data was manually annotated with my own software (CCA), both to demonstrate the annotator's functionality and to support further research on topic segmentation and chat untangling.
