Use case: external scientist without data access needs results #233

Open
mih opened this issue Nov 2, 2019 · 2 comments · May be fixed by #235
Collaborator

mih commented Nov 2, 2019

This is a common problem for any data analysis involving personal information. Approach:

  • Build a dataset that implements the same structure (organization, and filenames if possible) but does not contain the actual problematic data. The files may be tracked but not available through the annex, or the dataset may have no relationship to the actual data at all, i.e. mock data or simulated data.
  • Provide this dataset publicly to aid development of analysis implementations.
  • Clearly describe how the mock dataset differs from the inaccessible real dataset.
  • External users are instructed to create a new dataset (to hold their code) that has the mock dataset as a subdataset.
  • External users submit their dataset. The subdataset is replaced with the real dataset (the actual version is tracked), the code is executed (after having been reviewed), and the results are captured in the submitted dataset.
  • Results are pushed back to the external users (or deposited in an accessible place for them to pull); the local data remains local and unavailable.
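
The approach depends on the mock dataset having the same organization and filenames as the real one, so that code developed against the mock still finds its inputs after the swap. Below is a minimal sketch of how a data holder might check this before executing submitted code. It assumes both datasets are checked out as plain directories. The helper names `layout` and `compare_layouts` and the example file names are hypothetical and not part of DataLad.

```python
import tempfile
from pathlib import Path


def layout(root: Path) -> set[str]:
    """Relative paths of all files under root, ignoring Git internals.

    Dangling symlinks are counted as files, on the assumption that annexed
    content which is tracked but not available locally appears that way.
    """
    paths = set()
    for p in root.rglob("*"):
        rel = p.relative_to(root)
        if ".git" in rel.parts:
            continue
        if p.is_symlink() or p.is_file():
            paths.add(rel.as_posix())
    return paths


def compare_layouts(mock_root: Path, real_root: Path) -> dict[str, list[str]]:
    """Report files present in only one of the two datasets."""
    mock, real = layout(mock_root), layout(real_root)
    return {
        "missing_in_mock": sorted(real - mock),
        "extra_in_mock": sorted(mock - real),
    }


if __name__ == "__main__":
    # Hypothetical example: build two tiny directory trees and compare them.
    with tempfile.TemporaryDirectory() as tmp:
        mock, real = Path(tmp) / "mock", Path(tmp) / "real"
        for root, names in (
            (mock, ["sub-01/anat.nii.gz"]),
            (real, ["sub-01/anat.nii.gz", "sub-01/func.nii.gz"]),
        ):
            for name in names:
                f = root / name
                f.parent.mkdir(parents=True, exist_ok=True)
                f.write_text("placeholder")
        print(compare_layouts(mock, real))
```

An empty report in both directions suggests that submitted code will find the same paths in the real dataset. Any entries are candidates for the written description of how the mock differs from the real data.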
@adswa adswa linked a pull request Nov 4, 2019 that will close this issue
@adswa adswa added UX use-case and removed UX labels Nov 4, 2019
Contributor

adswa commented Dec 3, 2019

Note to self from mih's talk today: "bring the computation to the data"

Contributor

adswa commented Dec 19, 2019

I started a draft of this a while ago in #235. Let's actually do this with the studyforrest data at the start of 2020 :)
