DataBrewer Recipes Repository.
- Free software: MIT license
- Documentation: https://databrewer.readthedocs.org.
- Project: https://github.com/rolando/databrewer
This is a collection of dataset recipes, that is, simple descriptions of where to find existing dataset archives.
The recipes themselves are licensed under the MIT license. Each dataset may have its own licensing and usage restrictions.
These recipes are used by the databrewer tool. See https://github.com/rolando/databrewer
You can contribute in several ways, for example:
- Requesting additions of new datasets.
- Reporting errors in existing datasets.
- Adding new recipes for interesting datasets.
- Improving existing recipes: better descriptions, keywords, fixing URLs, etc.
When writing a recipe, follow these conventions:
- The name fields must be all lowercase and separated by dashes (if needed).
- Brackets can be used to group subsets of files within the dataset.
- Single-file datasets can use the url field.
- If a dataset comes from a dataset repository or a single entity, a short prefix should be added to the name (e.g. fte-<name> for FiveThirtyEight datasets).
- If a dataset has a download page but is not available for direct downloading, the restricted field must be set to true.
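The naming rule above can be checked mechanically before submitting a recipe. The following is a minimal sketch and not part of the databrewer tool: the helper names is_valid_name and has_prefix are hypothetical, and allowing digits in names is an assumption. It only covers the lowercase-and-dashes rule and the source prefix, not brackets or other fields.

```python
import re

# Assumed pattern: groups of lowercase letters or digits joined by single dashes.
# Allowing digits is an assumption; the rule only says "lowercase" and "dashes".
NAME_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_name(name: str) -> bool:
    """Return True if the name is all lowercase and dash-separated."""
    return NAME_PATTERN.fullmatch(name) is not None


def has_prefix(name: str, prefix: str) -> bool:
    """Return True if a valid name starts with the given source prefix, e.g. 'fte'."""
    return is_valid_name(name) and name.startswith(prefix + "-")


if __name__ == "__main__":
    for candidate in ["fte-pulitzer", "uci-zoo", "UCI_Zoo", "fte--uber"]:
        print(candidate, is_valid_name(candidate))
```

With this sketch, fte-pulitzer and uci-zoo are accepted, while UCI_Zoo (uppercase, underscore) and fte--uber (doubled dash) are rejected.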
Example recipes:
- Single-file: fte-pulitzer.yaml
- Multiple-files: uci-zoo.yaml
- Multiple-files with subsets: fte-uber-tlc.yaml
- Multiple-files with subsets and dates: nyc-tlc-taxi.yaml
- Restricted downloads: kaggle-comp-titanic.yaml