Release/oedatamodel v1.0.0 #4

Closed
wants to merge 17 commits into from
54 changes: 53 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -1,2 +1,54 @@
# oedatamodel
A common open energy data model (oedatamodel) and datapackage format for energy and scenario data
A common open energy data model (oedatamodel) and datapackage format for energy and scenario data.

# Introduction
The oedatamodel is provided as a template data model in the form of an Entity Relationship Model (ERM). It is designed
for use on the Open Energy Platform (OEP), but it can also be used in common relational database systems.
In addition, we include a datapackage with every release.

Existing approaches and ideas such as the [IAMC data format](https://github.com/IAMconsortium/pyam#data-model) or [Do-a-thon: Towards a common data standard
for integrated assessment and energy systems modelling](https://forum.openmod-initiative.org/t/do-a-thon-towards-a-common-data-standard-for-integrated-assessment-and-energy-systems-modelling/1774/5) were adopted in the development process.

The latest version can be found in the folder oedatamodel/latest.
The version available there offers two ERMs. The ERM "oedatamodel.pdf" shows the data model
that can be implemented in a database (e.g. PostgreSQL). It provides tables, relations, column names and
data types. The ERM "oedatamodel-readable.pdf" is provided in addition and is designed
to simplify the editing of data by a user. It also provides tables, relations, column
names and data types, but the tables are in a less normalized format. We recommend this version of the
data model for implementations in, e.g., CSV tables: the human-readable format is convenient for editing
but is not optimal for technical use in a database.

In addition, the source files (file format: .er) from which the ERM PDFs are generated are
provided for each data model.

# Oedatamodel - Datapackage

We publish a data package for each release. The data package contains the file oedatamodel_datapackage.json
with example and template content, as well as CSV files representing the data model. The JSON
file follows the [oemetadata format](https://github.com/OpenEnergyPlatform/oemetadata), which is used to store and provide metadata for open data on the OEP.
This includes:

- A general description of the data
- List of contributors
- License information
- Information about sources, such as records or model frameworks, with license information for each source
- The datamodel with metadata for each field
- Table relations and key attributes to provide the information outside a database

The metadata string itself can be validated using the [Open Metadata Integration (omi)](https://github.com/OpenEnergyPlatform/omi) tool.
Oemetadata provides a [detailed description](https://github.com/OpenEnergyPlatform/oemetadata/blob/develop/metadata/latest/metadata_key_description.md) with examples for each key in the metadata string.
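
As a rough illustration of how this metadata can be used outside a database, the following Python sketch lists the resources of a datapackage together with their field names and primary keys. The inline JSON is a shortened, hypothetical excerpt that only mimics the structure of oedatamodel_datapackage.json, and `summarize_resources` is an illustrative helper, not part of oemetadata or omi.

```python
import json

# Shortened, hypothetical excerpt that mimics the structure of the datapackage JSON.
EXAMPLE_METADATA = """
{
  "name": "example",
  "resources": [
    {
      "name": "scenario",
      "path": "scenario.csv",
      "schema": {
        "fields": [
          {"name": "scenario_id", "type": "bigint"},
          {"name": "scenario", "type": "text"}
        ],
        "primaryKey": ["scenario_id"]
      }
    }
  ]
}
"""


def summarize_resources(metadata: dict) -> list:
    """Return one small summary dict per resource in the metadata."""
    summary = []
    for resource in metadata.get("resources", []):
        schema = resource.get("schema", {})
        summary.append(
            {
                "name": resource.get("name"),
                "path": resource.get("path"),
                "fields": [field["name"] for field in schema.get("fields", [])],
                "primary_key": schema.get("primaryKey", []),
            }
        )
    return summary


if __name__ == "__main__":
    for entry in summarize_resources(json.loads(EXAMPLE_METADATA)):
        print(entry)
```

To inspect a real release, the string passed to `json.loads` would be replaced by the contents of the datapackage JSON file.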

# Edit the Entity Relationship Model

To generate an ERM we use the [erd tool](https://github.com/BurntSushi/erd). The [er or erd](https://github.com/BurntSushi/erd#the-er-file-format) file format offers a simple syntax and
can be created and saved using a standard text editor.

To generate the ERM, e.g. in .pdf format, the erd tool must be installed. For
detailed instructions, please see the [package description](https://github.com/BurntSushi/erd#installation).

After successful installation, open a terminal (Windows: CMD) and use the `cd` command (e.g. `cd path`)
to navigate to the folder where the .er/.erd file is stored. Then run the following command
in the terminal to generate the ERM:

`erd -i oedatamodel.er -o oedatamodel.pdf`
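
If several .er files need to be rendered at once, the command above can be wrapped in a small script. The following Python sketch assumes that `erd` is installed and available on the PATH; the helper names `erd_command` and `render_all` are hypothetical and only use the `-i` and `-o` options shown above.

```python
import shutil
import subprocess
from pathlib import Path


def erd_command(er_file: Path) -> list:
    """Build the erd call that renders one .er file to a PDF with the same base name."""
    return ["erd", "-i", str(er_file), "-o", str(er_file.with_suffix(".pdf"))]


def render_all(folder: Path) -> list:
    """Render every .er file in `folder` and return the requested PDF paths."""
    if shutil.which("erd") is None:
        raise RuntimeError("erd executable not found on PATH")
    rendered = []
    for er_file in sorted(folder.glob("*.er")):
        # check=True raises CalledProcessError if erd reports a failure.
        subprocess.run(erd_command(er_file), check=True)
        rendered.append(er_file.with_suffix(".pdf"))
    return rendered


if __name__ == "__main__":
    print(erd_command(Path("oedatamodel.er")))
```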

8 changes: 8 additions & 0 deletions oedatamodel/README.md
@@ -0,0 +1,8 @@
# Oedatamodel release structure

We use the terms latest and future to describe a release. The folder latest contains the current stable release.
It also contains a subfolder named after the version number of the current release. The folder
future contains the latest release; if there are changes that should be published in a future release,
a new version of the data model is created in the future folder.

Releases are stored in the release history folder. A user can continue to use older versions if necessary.
50 changes: 50 additions & 0 deletions oedatamodel/latest/v100/OEDataModel-concrete.er
@@ -0,0 +1,50 @@
[Scenario] {bgcolor: "#c1d6c1"}
*'"scenario id" (int)'
'scenario (text)'
'region (text(json))'
'year (int)'
'source (text)'
'comment (text)'


[Scalar] {bgcolor: "#b9d3eb"}
*'"scalar id" (int)'
+'"scenario id" (int)'
'region (text (json))'
'"input energy vector" (text)'
'"output energy vector" (text)'
'"parameter name" (text)'
'"technology" (text)'
'"technology_type" (text)'
'unit (text)'
'tags (json/hstore)'
'method (json/hstore)'
'source (text)'
'comment (text)'
'value (decimal/float)'

Scenario 1--* Scalar

[Timeseries] {bgcolor: "#b9d3eb"}
*'"timeseries id" (int)'
+'"scenario id" (int)'
'region (text (json))'
'"input energy vector" (text)'
'"output energy vector" (text)'
'"parameter name" (text)'
'"technology" (text)'
'"technology_type" (text)'
'unit (text)'
'tags (json/hstore)'
'method (json/hstore)'
'source (text)'
'comment (text)'
'"timeindex start" (timestamp)'
'"timeindex stop" (timestamp)'
'"timeindex resolution" (interval)'
'series ([decimal/float])'

Scenario 1--* Timeseries



Binary file added oedatamodel/latest/v100/OEDataModel-concrete.pdf
Binary file not shown.
41 changes: 41 additions & 0 deletions oedatamodel/latest/v100/OEDataModel-normalization.er
@@ -0,0 +1,41 @@
[Scenario] {bgcolor: "#c1d6c1"}
*'"scenario id" (int)'
'scenario (text)'
'region (text(json))'
'year (int)'
'source (text)'
'comment (text)'

Scenario 1--* Data

[Data] {bgcolor: "#b9d3eb"}
*'"data id" (int)'
+'"scenario id" (int)'
'region (text (json))'
'"input energy vector" (text)'
'"output energy vector" (text)'
'"parameter name" (text)'
'"technology" (text)'
'"technology_type" (text)'
'unit (text)'
'tags (json/hstore)'
'method (json/hstore)'
'source (text)'
'comment (text)'
'type (text ("scalar" | "timeseries"))'

Data 1--1 Scalar

[Scalar] {bgcolor: "#b9d3eb"}
*+'"data id" (int)'
"value (decimal/float)"

Data 1--1 Timeseries

[Timeseries] {bgcolor: "#b9d3eb"}
*+'"data id" (int)'
'"timeindex start" (timestamp)'
'"timeindex stop" (timestamp)'
'"timeindex resolution" (interval)'
'series ([decimal/float])'

Binary file not shown.
24 changes: 24 additions & 0 deletions oedatamodel/latest/v100/README.md
@@ -0,0 +1,24 @@
# Oedatamodel v1.0.0 - Technical description

The oedatamodel release version 1.0.0 contains two data models as UML-ERM and the corresponding datapackage
for each data model. This section describes the technical aspects of each data model.

We have created two variants of the data model for different purposes. First, we needed a good solution for
applying the data model in a database environment. For this purpose we have created "OEDataModel-normalization".
This data model is developed as a [joined-table inheritance](https://docs.sqlalchemy.org/en/13/orm/inheritance.html#joined-table-inheritance) data model and is in a [normalized](https://en.wikipedia.org/wiki/Database_normalization#Example_of_a_step_by_step_normalization) state. By normalizing the
data model we eliminate redundancies (columns duplicated across tables) and ensure that the data model meets the general requirements for
data stored in a relational database system. Joined-table inheritance is our solution for
the redundancy in the data tables "timeseries" and "scalar". We introduced an aggregated "data" table to
the data model. The "data" table contains all fields that the "timeseries" and "scalar" tables have in common.
By introducing a shared primary key "data_id" in all data-related tables and a new field named "type" in the
aggregated "data" table, we can define the data type ("scalar" or "timeseries") for each row in the
table. When retrieving the data, SQL allows us to join the data tables with each other and create a readable
joined record.
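
The joined-table layout can be tried out with a minimal Python sketch that uses the standard-library `sqlite3` module. This is only an illustration under simplifying assumptions: the tables carry just a few of the columns from the ERM, the series is stored as a JSON-encoded string because SQLite has no array type, and all inserted values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE data (
        data_id INTEGER PRIMARY KEY,
        scenario_id INTEGER,
        parameter_name TEXT,
        unit TEXT,
        type TEXT CHECK (type IN ('scalar', 'timeseries'))
    );
    CREATE TABLE scalar (
        data_id INTEGER PRIMARY KEY REFERENCES data (data_id),
        value REAL
    );
    CREATE TABLE timeseries (
        data_id INTEGER PRIMARY KEY REFERENCES data (data_id),
        timeindex_start TEXT,
        series TEXT  -- JSON-encoded list; SQLite has no array type
    );
    """
)

# Made-up example rows: one scalar and one time series sharing the "data" table.
conn.execute("INSERT INTO data VALUES (1, 1, 'capacity', 'MW', 'scalar')")
conn.execute("INSERT INTO scalar VALUES (1, 42.0)")
conn.execute("INSERT INTO data VALUES (2, 1, 'demand', 'MW', 'timeseries')")
conn.execute("INSERT INTO timeseries VALUES (2, '2020-01-01T00:00:00', '[1.0, 2.0]')")

# Join the shared fields with the scalar-specific value into one readable record.
scalar_rows = conn.execute(
    """
    SELECT d.data_id, d.parameter_name, d.unit, s.value
    FROM data AS d
    JOIN scalar AS s ON s.data_id = d.data_id
    WHERE d.type = 'scalar'
    """
).fetchall()

if __name__ == "__main__":
    print(scalar_rows)
```

The same pattern applies to the "timeseries" table; only the joined table and the selected columns change.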


The other variant is called "OEDataModel-concrete". This format is intended to be more user-friendly when working
with datasets, for example in a tool like Excel. The usability goal of this data
model is to allow a user to edit a dataset in a table that contains all fields. This leads to many redundant
fields in the data-related tables, but usability is much better for this use case. Since we need to map this
approach to the "OEDataModel-normalization" data model, an adapter is required. We plan to develop
the adapter in the next iterations of the development process.
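
The adapter does not exist yet. Purely as an illustration of the mapping it would have to perform, the following Python sketch splits one row of the concrete "scalar" table into a row for the "data" table and a row for the normalized "scalar" table. The function name, the reuse of an externally supplied `data_id`, and the example values are assumptions, not part of this release.

```python
# Fields shared by scalar and timeseries rows, following the "Data" table of the ERM.
SHARED_FIELDS = [
    "scenario_id",
    "region",
    "input_energy_vector",
    "output_energy_vector",
    "parameter_name",
    "technology",
    "technology_type",
    "unit",
    "tags",
    "method",
    "source",
    "comment",
]


def split_scalar_row(row: dict, data_id: int) -> tuple:
    """Split a concrete scalar row into (data_row, scalar_row) for the normalized model."""
    data_row = {"data_id": data_id, "type": "scalar"}
    # Missing shared fields become None so every data row has the same keys.
    data_row.update({field: row.get(field) for field in SHARED_FIELDS})
    scalar_row = {"data_id": data_id, "value": row["value"]}
    return data_row, scalar_row


if __name__ == "__main__":
    # Made-up example row in the concrete format.
    concrete = {"scalar_id": 7, "scenario_id": 1, "parameter_name": "capacity",
                "unit": "MW", "value": 42.0}
    print(split_scalar_row(concrete, data_id=7))
```

A timeseries row would be handled the same way, with the time index fields and the series moving to the normalized "timeseries" table instead of `value`.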
@@ -0,0 +1,160 @@
{"name": "Oedatamodel readable - General Energy Model Datapackage",
"title": "OpenEnergyPlatform data format for scenario data in human readable format",
"id": "",
"description": "datamodel, metadata and examples provided as datapackage",
"language": ["en-GB"],
"keywords": ["datamodel", "datapackage", "general energy dataformat"],
"publicationDate": "2020-08-11",
"context":
{"homepage": "https://openenergy-platform.org/",
"documentation": "https://github.com/OpenEnergyPlatform/oedatamodel/tree/develop/oedatamodel/latest/v100/oedatamodel-readable.pdf",
"sourceCode": "https://github.com/OpenEnergyPlatform/oedatamodel/tree/develop/oedatamodel/latest/v100/oedatamodel-readable.er",
"contact": "",
"grantNo": "",
"fundingAgency": "",
"fundingAgencyLogo": "",
"publisherLogo": ""},
"spatial":
{"location": "",
"extent": "",
"resolution": ""},
"temporal":
{"referenceDate": "",
"timeseries":
{"start": "",
"end": "",
"resolution": "",
"alignment": "",
"aggregationType": ""} },
"sources": [
{
"title": "Open energy datamodel",
"description": "oedatamodel for energy model data",
"path": "https://github.com/OpenEnergyPlatform/oedatamodel/tree/develop/oedatamodel",
"licenses": [
{
"name": "CC0-1.0",
"title": "Creative Commons Zero v1.0 Universal",
"path": "https://creativecommons.org/publicdomain/zero/1.0/legalcode",
"instruction": "You are free: To Share, To Create, To Adapt",
"attribution": "© Reiner Lemoine Institut"
}
]
}],
"licenses": [
{
"name": "",
"title": "",
"path": "",
"instruction": "",
"attribution": ""
}
],
"contributors": [
{"title": "jh-RLI", "email": null, "date": "2020-08-11", "object": "datapackage", "comment": "Create template datapackage for oedatamodel"},
{"title": "", "email": "", "date": "", "object": "", "comment": ""} ],
"resources": [
{"profile": "tabular-data-resource",
"name": "oed-readable_scenario",
"path": "oedatamodel-readable_scenario.csv",
"format": "csv",
"encoding" : "UTF-8",
"schema": {
"fields": [
{"name": "scenario_id", "description": "Unique identifier", "type": "bigint", "unit": null},
{"name": "scenario", "description": "Scenario name", "type": "text", "unit": null},
{"name": "region", "description": "Country or region", "type": "json", "unit": null},
{"name": "year", "description": "Year", "type": "integer", "unit": null},
{"name": "source", "description": "Source", "type": "text", "unit": null},
{"name": "comment", "description": "Comment", "type": "text", "unit": null} ],
"primaryKey": ["scenario_id"],
"foreignKeys": [{
"fields": [null],
"reference": {
"resource": null,
"fields": [null] } } ] },
"dialect":
{"delimiter": ";",
"decimalSeparator": "."} },

{"profile": "tabular-data-resource",
"name": "oed-readable_scalar",
"path": "oedatamodel-readable_scalar.csv",
"format": "csv",
"encoding" : "UTF-8",
"schema": {
"fields": [
{"name": "scalar_id", "description": "Unique identifier", "type": "bigint", "unit": null},
{"name": "scenario_id", "description": "Scenario identifier (foreign key)", "type": "bigint", "unit": null},
{"name": "region", "description": "Country or region", "type": "json", "unit": null},
{"name": "input_energy_vector", "description": "", "type": "text", "unit": null},
{"name": "output_energy_vector", "description": "", "type": "text", "unit": null},
{"name": "parameter_name", "description": "", "type": "text", "unit": null},
{"name": "technology", "description": "", "type": "text", "unit": null},
{"name": "technology_type", "description": "", "type": "text", "unit": null},
{"name": "value", "description": "Parameter value", "type": "decimal", "unit": "kW"},
{"name": "unit", "description": "Parameter unit", "type": "text", "unit": null},
{"name": "tags", "description": "Free classification with key-value pairs", "type": "hstore", "unit": null},
{"name": "method", "description": "Method type (sum, mean, median)", "type": "json", "unit": null},
{"name": "source", "description": "Source", "type": "text", "unit": null},
{"name": "comment", "description": "Comment", "type": "text", "unit": null} ],
"primaryKey": ["scalar_id"],
"foreignKeys": [{
"fields": ["scenario_id"],
"reference": {
"resource": "oed-readable_scenario",
"fields": ["scenario_id"] } } ] },
"dialect":
{"delimiter": ";",
"decimalSeparator": "."} },
{"profile": "tabular-data-resource",
"name": "oed-readable_timeseries",
"path": "oedatamodel-readable_timeseries.csv",
"format": "csv",
"encoding" : "UTF-8",
"schema": {
"fields": [
{"name": "timeseries_id", "description": "Unique identifier", "type": "bigint", "unit": null},
{"name": "scenario_id", "description": "Scenario identifier (foreign key)", "type": "bigint", "unit": null},
{"name": "region", "description": "Country or region", "type": "json", "unit": null},
{"name": "input_energy_vector", "description": "", "type": "text", "unit": null},
{"name": "output_energy_vector", "description": "", "type": "text", "unit": null},
{"name": "parameter_name", "description": "", "type": "text", "unit": null},
{"name": "technology", "description": "", "type": "text", "unit": null},
{"name": "technology_type", "description": "", "type": "text", "unit": null},
{"name": "unit", "description": "Parameter unit", "type": "text", "unit": null},
{"name": "timeindex_start", "description": "Start timestamp", "type": "timestamp", "unit": null},
{"name": "timeindex_stop", "description": "Stop timestamp", "type": "timestamp", "unit": null},
{"name": "timeindex_resolution", "description": "Time step resolution", "type": "interval", "unit": null},
{"name": "series", "description": "Time series values", "type": "array[decimal]", "unit": null},
{"name": "tags", "description": "Free classification with key-value pairs", "type": "hstore", "unit": null},
{"name": "method", "description": "Method type (sum, mean, median)", "type": "json", "unit": null},
{"name": "source", "description": "Source", "type": "text", "unit": null},
{"name": "comment", "description": "Comment", "type": "text", "unit": null} ],
"primaryKey": ["timeseries_id"],
"foreignKeys": [{
"fields": ["scenario_id"],
"reference": {
"resource": "oed-readable_scenario",
"fields": ["scenario_id"] } } ] },
"dialect":
{"delimiter": ";",
"decimalSeparator": "."} } ],

"review": {
"path": "",
"badge": ""},
"metaMetadata":
{"metadataVersion": "OEP-1.4.0",
"metadataLicense":
{"name": "CC0-1.0",
"title": "Creative Commons Zero v1.0 Universal",
"path": "https://creativecommons.org/publicdomain/zero/1.0/"} },
"_comment":
{"metadata": "Metadata documentation and explanation (https://github.com/OpenEnergyPlatform/organisation/wiki/metadata)",
"dates": "Dates and time must follow the ISO8601 including time zone (YYYY-MM-DD or YYYY-MM-DDThh:mm:ss±hh)",
"units": "Use a space between numbers and units (100 m)",
"languages": "Languages must follow the IETF (BCP47) format (en-GB, en-US, de-DE)",
"licenses": "License name must follow the SPDX License List (https://spdx.org/licenses/)",
"review": "Following the OEP Data Review (https://github.com/OpenEnergyPlatform/data-preprocessing/wiki)",
"null": "If not applicable use (null)"} }
@@ -0,0 +1 @@
scalar_id;scenario_id;region;input_energy_vector;output_energy_vector;parameter_name;technology;technology_type;unit;tags;method;source;comment;value
@@ -0,0 +1 @@
scenario_id;scenario;region;year;source;comment
@@ -0,0 +1 @@
timeseries_id;scenario_id;region;input_energy_vector;output_energy_vector;parameter_name;technology;technology_type;unit;tags;method;source;comment;timeindex_start;timeindex_stop;timeindex_resolution;series