This repository is for the final project of Coursera's Getting and Cleaning Data Course (Course 3 of the Data Science Specialization).
- data (directory) - directory that holds all the given data (created when the program runs).
- README.md - this readme file.
- run_analisys.R - script that manipulates the data and generates the tidy data.
- tidy.txt - tidy data, final output of the assignment.
- codebook.md - codebook of the tidy data.
- In an R environment, just execute source("run_analisys.R").
- The script will download and unzip the data (if necessary).
- The script will generate two dataframes: all.data and tidy.
- Dataframe all.data contains the merged data (with mean and std measurements) of train and test.
- Dataframe tidy contains the average of each variable for each activity and each subject.
- Dataframe tidy will be written to a file called "tidy.txt".
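
The last step in the list above, writing the tidy dataframe to a text file, can be sketched as follows. This is a minimal sketch and not the script's actual code: tidy.sketch, out.file and tidy.check are hypothetical names, a temporary file stands in for "tidy.txt", and row.names = FALSE is an assumption about how the file is written.

```r
# Hypothetical stand-in for the tidy dataframe (values are made up).
tidy.sketch <- data.frame(
  subject = c(1, 2),
  activity = c("WALKING", "SITTING"),
  mean.x = c(0.25, 0.5)
)

# A temporary file stands in for "tidy.txt".
out.file <- tempfile(fileext = ".txt")

# Assumption: the output is written without row names.
write.table(tidy.sketch, out.file, row.names = FALSE)

# Read the file back to inspect the result.
tidy.check <- read.table(out.file, header = TRUE)
```

Reading the file back with header = TRUE is a quick way to confirm that the column names survived the round trip.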
- Calls download.file.and.unzip
- Creates all.data by calling create.all.data
- Creates tidy by calling create.tidy
Function that extracts the given zip file (assignment input). If the file does not exist, it will be downloaded first.
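
A minimal sketch of this download-if-missing pattern is shown below. It is not the script's implementation: download.and.unzip.sketch and its arguments (url, zip.file, data.dir) are hypothetical, and skipping the extraction when the target directory already exists is an assumption.

```r
# Hypothetical helper: url, zip.file and data.dir are placeholders,
# not the values used by the real script.
download.and.unzip.sketch <- function(url, zip.file, data.dir) {
  # Download only when the zip file is not already on disk.
  if (!file.exists(zip.file)) {
    download.file(url, destfile = zip.file, mode = "wb")
  }
  # Assumption: extract only when the target directory does not exist yet.
  if (!dir.exists(data.dir)) {
    unzip(zip.file, exdir = data.dir)
  }
  # Return the data directory invisibly so callers can reuse the path.
  invisible(data.dir)
}
```

Using mode = "wb" keeps the zip file intact on platforms that distinguish text and binary downloads.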
Function that returns the dataframe that maps activity codes to activity labels.
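
As a rough illustration of how such a lookup table can be used, the sketch below translates numeric activity codes into labels. The table contents, the column names code and label, and all object names are assumptions made for the example.

```r
# Illustrative lookup table; the codes and labels here are assumptions.
activity.labels.sketch <- data.frame(
  code = 1:3,
  label = c("WALKING", "SITTING", "STANDING"),
  stringsAsFactors = FALSE
)

# Hypothetical activity codes as they might appear in the raw data.
activity.codes <- c(2, 1, 1, 3)

# match() finds the row of each code in the lookup table.
activity.names <- activity.labels.sketch$label[
  match(activity.codes, activity.labels.sketch$code)
]
```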
Function that returns a vector of all features in dataset, including the ones we don't want.
Function that reads train and test data (measurements, activities, subjects) and returns the data frame.
- To return the train data frame, call read.data with the dataType argument equal to "train".
- To return the test data frame, call read.data with the dataType argument equal to "test".
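
The idea behind a dataType argument can be sketched as below. This is a hedged illustration only: read.data.sketch and base.dir are hypothetical, and the file layout (a sub-directory per data type holding X_, y_ and subject_ files) is an assumption about how the input data is organised.

```r
# Hypothetical sketch; assumes files such as <base.dir>/train/X_train.txt,
# <base.dir>/train/y_train.txt and <base.dir>/train/subject_train.txt.
read.data.sketch <- function(dataType, base.dir) {
  # Build the path of one input file from its prefix and the data type.
  path.for <- function(prefix) {
    file.path(base.dir, dataType, paste0(prefix, "_", dataType, ".txt"))
  }
  measurements <- read.table(path.for("X"))
  activities <- read.table(path.for("y"), col.names = "activity")
  subjects <- read.table(path.for("subject"), col.names = "subject")
  # Put the identifying columns first, followed by the measurements.
  cbind(subjects, activities, measurements)
}
```

Calling read.data.sketch("train", base.dir) and read.data.sketch("test", base.dir) would then give two data frames with the same columns.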
Function that orchestrates previous functions and returns a merged data frame of both train and test data.
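
The merge step can be sketched with two small made-up data frames. The column names, the values and the "mean|std" pattern used to keep the wanted measurements are assumptions for illustration, not the script's actual code.

```r
# Made-up train and test data frames with identical columns.
train.sketch <- data.frame(
  subject = c(1, 1), activity = c("WALKING", "SITTING"),
  "tBodyAcc-mean()-X" = c(0.1, 0.2), "tBodyAcc-std()-X" = c(0.3, 0.4),
  "tBodyAcc-max()-X" = c(0.9, 0.8), check.names = FALSE
)
test.sketch <- data.frame(
  subject = c(2, 2), activity = c("WALKING", "SITTING"),
  "tBodyAcc-mean()-X" = c(0.5, 0.6), "tBodyAcc-std()-X" = c(0.7, 0.8),
  "tBodyAcc-max()-X" = c(0.7, 0.6), check.names = FALSE
)

# Stack the rows of both sets into one data frame.
merged.sketch <- rbind(train.sketch, test.sketch)

# Keep the identifying columns plus the mean and std measurements.
# Assumption: wanted features are the ones whose name contains "mean" or "std".
keep <- names(merged.sketch) %in% c("subject", "activity") |
  grepl("mean|std", names(merged.sketch))
all.data.sketch <- merged.sketch[, keep]
```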
Function that transforms the "all.data" dataframe into another dataframe with the average of each variable for each activity and each subject.
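
One way to compute such per-group averages in base R is aggregate(). The sketch below uses made-up data and hypothetical names (all.data.sketch, tidy.sketch, mean.x, std.x). It shows the technique and does not claim to be how the script does it.

```r
# Made-up stand-in for the merged data.
all.data.sketch <- data.frame(
  subject = c(1, 1, 1, 2, 2),
  activity = c("WALKING", "WALKING", "SITTING", "WALKING", "WALKING"),
  mean.x = c(0.2, 0.4, 0.1, 0.6, 0.8),
  std.x = c(1, 3, 5, 2, 4)
)

# Every column except the identifiers is a measurement to average.
measure.cols <- setdiff(names(all.data.sketch), c("subject", "activity"))

# One row per (subject, activity) pair, holding the mean of each measurement.
tidy.sketch <- aggregate(
  all.data.sketch[measure.cols],
  by = list(
    subject = all.data.sketch$subject,
    activity = all.data.sketch$activity
  ),
  FUN = mean
)
```

With this input, tidy.sketch has three rows, one for each subject and activity combination that occurs in the data.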