alessandrocubic/Unipi-Data-Mining-project-AY-22-23

Data Mining project on Twitter Datasets

Project for the Data Mining course @ University of Pisa

Authors: Alessandro Bucci, Alberto Marinelli, Giacomo Cignoni

Abstract

This project analyzes two datasets extracted from the Twitter platform, one describing users and one containing tweets. The analysis relies on data mining tools and is divided into four tasks:

Task 1: Data Understanding and Preparation

Task 1.1: Data Understanding, explore the dataset with analytical tools and evaluate data quality, the distribution of variables, and pairwise correlations

Task 1.2: Data Preparation, improve the quality of the data and extract new features from the tweets that describe each user and their behaviour

Task 2: Clustering Analysis, explore the dataset by applying various clustering techniques to the user profiles

Task 3: Predictive Analysis, predict for each user a binary label that indicates whether the user is a bot or a genuine user

Task 4: Time Series Analysis, analyze the time series extracted in the year 2019
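
As an illustration of the kind of feature extraction described in Task 1.2, the sketch below aggregates tweet-level records into one row of features per user. It is a minimal example under stated assumptions, not the project's actual code: the field names (`user_id`, `text`, `retweet_count`), the sample records, and the helper `extract_user_features` are all hypothetical and may not match the real dataset schema.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical tweet records. The field names are assumptions made for
# this sketch, not the actual schema of the project's datasets.
tweets = [
    {"user_id": 1, "text": "hello world", "retweet_count": 3},
    {"user_id": 1, "text": "data mining", "retweet_count": 1},
    {"user_id": 2, "text": "buy now!!!", "retweet_count": 0},
]


def extract_user_features(tweets):
    """Aggregate tweet-level records into one feature dict per user."""
    # Group the tweets by the user who posted them.
    by_user = defaultdict(list)
    for tweet in tweets:
        by_user[tweet["user_id"]].append(tweet)

    # Compute simple per-user summary statistics.
    features = {}
    for user_id, user_tweets in by_user.items():
        features[user_id] = {
            "n_tweets": len(user_tweets),
            "avg_text_length": mean(len(t["text"]) for t in user_tweets),
            "avg_retweets": mean(t["retweet_count"] for t in user_tweets),
        }
    return features


user_features = extract_user_features(tweets)
```

Per-user aggregates of this kind could then serve as input to the clustering and classification tasks, since both operate on users rather than on individual tweets.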

About

Data Mining project University of Pisa. AY 22/23
