AppleHealthAnalysis

Introduction

'AppleHealthAnalysis' is an R package to import and analyse data generated by the iOS Apple Health app. This includes data entered by the user, data from iOS apps, and any data generated by an Apple Watch.

The main function of the package is to parse the exported XML file and convert the data into an R data frame. From there the user can use R's built-in functions to analyse and plot the data. There will be some built-in analysis functions for convenience; however, with this much data, users can use their imagination to find ways to analyse their own data.
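To give a feel for what "parse the XML file and convert it into a data frame" involves, here is a minimal sketch using the xml2 package. It is not the code behind this package's import function; the inline sample XML, the attribute names (type, unit, value, startDate) and the date format are assumptions about what an Apple Health export looks like.

```r
library(xml2)

# Hypothetical, hand-written sample standing in for a real export.xml
sample_xml <- '<HealthData>
  <Record type="HKQuantityTypeIdentifierStepCount" unit="count" value="120" startDate="2017-01-01 08:00:00 +0000"/>
  <Record type="HKQuantityTypeIdentifierStepCount" unit="count" value="340" startDate="2017-01-01 12:30:00 +0000"/>
  <Record type="HKQuantityTypeIdentifierHeartRate" unit="count/min" value="72" startDate="2017-01-01 12:31:00 +0000"/>
</HealthData>'

# read_xml() accepts a file path or a literal XML string
doc <- read_xml(sample_xml)

# Select every Record element, wherever it sits in the document
records <- xml_find_all(doc, "//Record")

# Pull each attribute out as a vector and assemble a data frame
records_df <- data.frame(
  type      = xml_attr(records, "type"),
  unit      = xml_attr(records, "unit"),
  value     = as.numeric(xml_attr(records, "value")),
  startDate = as.POSIXct(xml_attr(records, "startDate"),
                         format = "%Y-%m-%d %H:%M:%S %z", tz = "UTC"),
  stringsAsFactors = FALSE
)

str(records_df)
```

For a real export you would pass the path to export.xml to read_xml() instead of the sample string.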

Things that are good to know

I'm a fan of Hadley Wickham's tidyverse, and dplyr in particular, but I still have a way to go before I use them to their full potential. I've kept the imported data as data frames; changing them to tibbles may make things more efficient in the future.

Memory requirements

On modern computers there should not be much of a problem using the package to analyse Apple Health data. However, as the amount of data in an individual's Apple Health app grows over time, this may cause problems with memory allocation in R, especially on computers that still have 2/4/8 GB of RAM. As far as I understand, R needs to hold all data in memory, although there may be clever people who can get around this. I think the Microsoft R implementation may be able to do disk-based streaming of data.

As I was developing the package, I noted the following file and object sizes at each step of the XML import (and this is with a fairly small Apple Health data set!):

14 MB - Apple Health exported ZIP file
28.2 MB - unzipped XML file (plus a separate CDA XML file: not sure what this does)
46 MB - XML extract containing the XML "Records" elements
109.3 MB - R list of the "Records" elements (the ones we are interested in)
3.2 MB - the resulting exported XLSX file

So the main memory requirements are in the XML extraction. I think dplyr's piping system is the most efficient way of getting the data out of the XML file, but suggestions on how to reduce resource requirements would be gratefully received.
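If you want to take similar measurements on your own data, base R's object.size() reports how much memory an object occupies. The sketch below uses made-up records (the field names and values are hypothetical) to compare a list with one element per record against the equivalent data frame. It only illustrates how to measure; it does not describe how the package stores its data internally.

```r
# Hypothetical stand-in for parsed records: one small named list per record
n <- 10000
records_list <- lapply(seq_len(n), function(i) {
  list(type  = "HKQuantityTypeIdentifierStepCount",
       unit  = "count",
       value = as.character(i))
})

# The same information held column-wise in a data frame
records_df <- data.frame(
  type  = vapply(records_list, function(r) r$type, character(1)),
  unit  = vapply(records_list, function(r) r$unit, character(1)),
  value = as.numeric(vapply(records_list, function(r) r$value, character(1))),
  stringsAsFactors = FALSE
)

# object.size() returns bytes; format() converts to a readable unit
list_size <- object.size(records_list)
df_size   <- object.size(records_df)
print(format(list_size, units = "MB"))
print(format(df_size, units = "MB"))
```

Calling object.size() after each import step, and rm() followed by gc() on intermediates you no longer need, is one way to see where the memory goes.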

How to use

Download the GitHub repository and open the RStudio project. Then do the following:

# Get some Apple Health data and put the export.xml file into your working directory
library(devtools)
install_github("deepankardatta/AppleHealthAnalysis")
library(AppleHealthAnalysis)
health_data <- ah_import_xml("export.xml")
ah_shiny(health_data)
# Explore
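
As one idea for the "Explore" step, the sketch below totals step counts per day using only base R. The column names (type, startDate, value) and the example rows are assumptions made for illustration; check names(health_data) to see what the import actually returns before adapting it.

```r
# Hypothetical data frame standing in for imported Apple Health records
example_data <- data.frame(
  type = c(rep("HKQuantityTypeIdentifierStepCount", 4),
           "HKQuantityTypeIdentifierHeartRate"),
  startDate = c("2017-01-01 08:00:00 +0000", "2017-01-01 12:30:00 +0000",
                "2017-01-02 09:15:00 +0000", "2017-01-02 18:45:00 +0000",
                "2017-01-02 18:50:00 +0000"),
  value = c(120, 340, 200, 50, 72),
  stringsAsFactors = FALSE
)

# Keep only the step count records
steps <- example_data[example_data$type == "HKQuantityTypeIdentifierStepCount", ]

# Take the calendar date from the first 10 characters of the timestamp
steps$day <- as.Date(substr(steps$startDate, 1, 10))

# Sum the values for each day
daily_steps <- aggregate(value ~ day, data = steps, FUN = sum)
print(daily_steps)

# A quick base R plot of the daily totals
barplot(daily_steps$value, names.arg = format(daily_steps$day),
        ylab = "Steps", main = "Daily step count")
```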

Future developments

Get feedback on the code efficiency
Find someone to help make the code more efficient
If data parsing takes a long time think about a progress bar
Get some more built in reports
Work out if there is any more useful data in the exported XML files to use
Shiny dashboard
Data comparisons

Things that I have read to help make this package

I had a look at a few things to help make this package

