Refactor the chapter on publishing #761

Merged
merged 17 commits into from
Oct 1, 2021

Contributor

@adswa adswa commented Sep 24, 2021

This work-in-progress PR starts a larger effort to restructure and extend the handbook's content on publishing datasets to third-party services.

In particular, it

  • separated service-specific walk-throughs into standalone sections. This affected the "intro" section in particular: its content on configuring Dropbox as a special remote, exporting dataset contents to Figshare, and using Git LFS has been moved into standalone sections of the chapter. At the moment, there are walk-throughs for Dropbox, S3, Figshare, Git LFS, and GIN. This new structure may also make it easier to add new services as walk-throughs.
  • improved the introduction to the topic of dataset publication. I have split it into different use cases depending on the capabilities of the third-party services employed. The intro lists them and links to the relevant walk-throughs later in the chapter.
  • added a standalone section on data privacy
  • extended the example on exporting datasets to Figshare with screenshots and code

There is still missing content and room for improvement:

  • The placeholder section on publishing datasets to git repository hosting services needs to be filled with the current commands and procedures to do this
  • The placeholder section on publishing datasets to git repository hosting services needs to document the new create-sibling-{github/gogs/gitlab...} command that will come in 0.16.0
  • The walk-throughs might benefit from a more generic setup, such as the one that @jsheunis has created in the S3 workflow, where the section is a complete standalone piece that includes generating the dataset to publish. If we adopt this approach for the other walk-throughs, too, their content becomes more accessible and no longer requires readers to have, e.g., a "DataLad-101" dataset from the previous chapters to publish.
  • The section on data privacy could be extended with more strategies, ideas, or examples. At the moment, it's a loose and hastily collected set of ideas. @mslw may have more content on this topic based on his work on the RDM module.

In particular, group general dataset sharing scenarios into subsections,
remove the Dropbox, Git LFS and Figshare walk-throughs to make them
standalone sections, and link relevant sections later in the chapter as
examples and further information for each of the dataset sharing
scenarios.

This commit also adds more of the figures created for dataset publishing
to the handbook.
…extend it

include Figshare walkthrough in toctree