Book Skeleton

The aim of this file is to iterate and decide on:

  • A list of chapters
  • The scope of each chapter
  • The order of the chapters

Update from January Sprint

Completed Chapters

WIP

Needed

  • Collaborating (through GitHub/GitLab) - @pherterich and @rosiehigman
  • Case studies - @LouiseABowler
  • Checklist - @annakrystalli and @KirstieJane

Desired

  • Working Environment and notebooks - @LouiseABowler & @sgibson91
  • Coding Styles and Linting - @r-j-arnold
  • Deep Learning
  • Ethics
  • Credit for Reproducible Research
  • Scoping a data project - RSEs
  • Similarities and differences across data science disciplines
  • Visualisation

Ideas from the November Sprint

Why do reproducible research

  • For yourself
  • For everyone else

Git & Version Control

  • Version Control - what is it and why use it. (Or xx_final_Final.docx is version control, but you can do better.)
  • Your first project using Git
  • Accessing previous versions of your code
  • Branches and Merging
  • Semantic Versioning - Tags and releases (and how that fits in with traditional publishing DOIs)

Collaborating through GitHub

  • Commenting and documenting projects
  • How to write a good README
  • How to write good commit messages
  • Anatomy of a GitHub repo
  • Using pull requests
  • How to use cloud-based tools to power-up your repo (e.g. cloud-based CI, see below)

Testing for research

  • Why should you test your code?
  • What is a good test?
  • End-to-end testing vs. unit testing and other types

Data Management

  • Documenting Data
  • Open (FAIR) Standards

Reproducible Compute Environments

  • Packaging code for re-use (containers)
  • Managing Dependencies
  • Continuous Integration (and available tools)
  • Random Seeds & Dealing with Stochastic Simulations (and other sources of non-determinism, e.g. floating point errors)
  • Reproducible Research for HPC Projects

Getting Credit for Reproducible Research

Licensing for re-use

Definition of Reproducibility

  • How does reproducibility overlap with Open?

Reproducible as a team

  • Roles in a reproducible project
  • Maintaining reproducibility as a group

Barriers to Reproducibility

A reproducible paper

Your Working Environment

  • Introduction to the command line
  • Alternatives to the Command Line
  • Interactive Development Environments
  • Jupyter Notebooks (and common issues)

Ethics

  • Dealing with Sensitive Data
  • Getting Consent for Data Sharing
  • Ethics Approval @ the Turing

Scoping a data project

  • What makes a good data science project?