njtierney/numbat-data

A Realistic Guide to Making Data Available Alongside Code to Improve Reproducibility.

Abstract

Data makes science possible. Sharing data improves visibility and makes the research process transparent. This increases trust in the work and allows for independent reproduction of results. However, a large proportion of data from published research is available only to the original authors. Despite the obvious benefits of sharing data, and scientists advocating for its importance, most advice on sharing data discusses its broader benefits rather than the practical considerations of sharing. This paper provides practical, actionable advice on how to actually share data alongside research. The key message is that sharing data falls on a continuum, and entering it should come with minimal barriers.

Slides available here

Working paper available here

Take home messages

  • You don't have to do every single thing to publish your data
  • Take small steps - get the data somewhere first, add more detail as you go
  • Try and get a DOI from a service like Zenodo or Dryad

Thanks

  • Karthik Ram
  • Miles McBain
  • Anna Krystalli
  • Daniella Lowenberg
  • ACEMS International Mobility Programme
  • Helmsley Charitable Trust
  • Gordon and Betty Moore Foundation
  • Sloan Foundation

Resources

Colophon

Bio

Dr. Nicholas Tierney (PhD Statistics, BPsySci (Honours)) is a Lecturer in Business Analytics and Statistics at Monash University, working with Professors Dianne Cook and Rob Hyndman. His research aims to improve data analysis workflow and make data analysis more accessible. Crucial to this work is producing high quality software to accompany each research idea. Most recently, Nick's work has focussed on exploring longitudinal data (brolgar) and improving how we share data alongside research (ddd). Other work has focussed on exploring data with the R package visdat, and on creating analysis principles and tools to simplify working with, exploring, and modelling missing data with the package naniar. Nick has experience working with decision trees (treezy), optimisation (maxcovr), Bayesian Data Analysis, and MCMC diagnostics (mmcc).

Nick is a member of the rOpenSci collective, which works to make science open using R. He was the lead organiser for the rOpenSci ozunconf events from 2016 to 2018 (2016, 2017, 2018), and co-hosts the rstats podcast "Credibly Curious" with Dr. Saskia Freytag. Outside of research, Nick likes to hike, rock climb, make coffee, bake sourdough, (eventually) knit a hat, take photos, and explore new hobbies.

About

repo for my data talk at NUMBAT
