Commit

add motto and improving bullets
brunj7 committed Apr 15, 2024
1 parent 4c6a077 commit f401f8a
Showing 1 changed file with 14 additions and 6 deletions.
20 changes: 14 additions & 6 deletions preserve.qmd
@@ -5,6 +5,11 @@ title: "Archiving and preserving your data"

As you finalize your project, an important task is to archive your data in a publicly available repository (barring exceptions for sensitive data or data covered by a non-disclosure agreement). A few steps help ensure that others can reuse your data, which in turn makes your work more reproducible.

Your general philosophy when preparing the preservation of your scientific products should be: ***Document what you used and preserve what you produced***

See below for more information.

<hr>

## What scientific products to preserve?

@@ -16,14 +21,14 @@ Here are a few questions to ask yourself to determine if you should refer in you

1. The **raw data is already publicly accessible**, and the hosting solution (website, FTP server, etc.) seems well maintained (ideally providing a recommended citation)

=> *Document the website or process you used to collect the data and the date you accessed or downloaded them. Also try to determine whether a specific version number is associated with the data you used.*

2. The raw data is **not** publicly accessible

Note that we are not talking about data under a non-disclosure agreement (NDA) here, but rather about data with an unclear reuse status or data obtained through direct contact with a person or an institution. For example, if the data you used were sent to you privately, we recommend that you:

- ask your contact person about the licensing status and whether they would be willing to let you share those data publicly. You might face resistance at first, so take the time to explain why you think sharing those data sets is valuable to your work.
- if, in the end, it is not possible to share the data, still describe them in your documentation and list the contact information (person or institution) for inquiries about the data set.
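
The documentation steps above could be captured in a small machine-readable record kept alongside your project. The following is only a minimal sketch, not a prescribed format: the `RawDataSource` class, its field names, and the example sources, URL, and contact address are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date
from typing import List, Optional


@dataclass
class RawDataSource:
    """One raw data set used in the project (hypothetical structure)."""

    name: str
    access: str                      # website URL or process used to obtain the data
    accessed_on: str                 # ISO date of access/download
    version: Optional[str] = None    # version number, if the provider gives one
    publicly_accessible: bool = True
    contact: Optional[str] = None    # person or institution to ask about the data


def provenance_to_json(sources: List[RawDataSource]) -> str:
    """Serialize the provenance records to a JSON string."""
    return json.dumps([asdict(s) for s in sources], indent=2)


# Hypothetical examples: one public data set, one received privately.
sources = [
    RawDataSource(
        name="Example public data set",
        access="https://example.org/data",
        accessed_on=date(2024, 4, 15).isoformat(),
        version="v2.1",
    ),
    RawDataSource(
        name="Example privately shared data set",
        access="Sent by email by the data owner",
        accessed_on=date(2024, 3, 1).isoformat(),
        publicly_accessible=False,
        contact="data-owner@example.org",
    ),
]

print(provenance_to_json(sources))
```

Saving this output as a file in your project (for example next to your README) keeps the access date, version, and contact information available when you later assemble your archive.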


### Intermediate data
@@ -44,6 +49,8 @@ Those services are often well-integrated with data repositories that link your c
We recommend including any data set used to produce the statistics, figures, maps, and other visualizations in your work, even if it was generated by scripts.


<hr>

## Choosing a data repository

OK, we know what we want to archive. Now let’s decide where we want to preserve things!
@@ -78,15 +85,16 @@ Note that data repositories often support a certain number of licenses, so this

If you want to know more about how to best license your data, click [here](https://www.library.ucsb.edu/sites/default/files/dls-n10-2021-licensing-navy_0.pdf).


## Documenting your work

To make archiving as efficient as possible, document your work as you progress through your project. If you do so, archiving your data will consist of collecting existing information about the various parts of your project rather than writing it from scratch a few months after you generated a given data set.

Add an image about the power of README

### Metadata

Metadata should describe your data with enough information that you could reuse them even if you knew nothing about this specific data set. It is sometimes defined as data about data. So what should you include? Here are some pointers:

- Describe the contents of data files. If you use complex jargon or concepts, make sure you refer to an external vocabulary or clearly define these terms as used in your project
- Keep data entry consistent