There is no concept to connect sources in the OEP #27

Open
christian-rli opened this issue Feb 24, 2020 · 13 comments
Assignees
Labels
other: help wanted 🙋 Extra attention is needed priority: low 🦥 Low priority status: blocked 🛑 Blocked or impeded progress type: question ❓ Further information is requested

Comments

@christian-rli
Contributor

christian-rli commented Feb 24, 2020

If a table comes from one source it's fairly simple to connect the source to the original data.

As soon as there are two sources for one table it's not possible in the metadata to selectively connect the information to the different sources.

When a table has a source column, there should be a way to link something like a bibtex key and either reference an absolute source path (e.g. a doi) or provide the source yourself. This can then be referenced in the metadata.

@Ludee
Member

Ludee commented Feb 24, 2020

There is a deprecated section for literature at the OEP.
But other projects do it better:

Ludee added a commit that referenced this issue Feb 24, 2020
@Ludee
Member

Ludee commented Feb 24, 2020

Solution A:
Each source in the metadata has a key entry = bibkey
Each source in the metadata has an additional doi
In the data, there is a column source/bibkey linking to the metadata

Ludee added a commit that referenced this issue Feb 24, 2020
@Ludee Ludee added other: help wanted 🙋 Extra attention is needed type: question ❓ Further information is requested and removed v1.5 labels Apr 16, 2020
@christian-rli
Contributor Author

christian-rli commented Jul 22, 2020

> Solution A:
> Each source in the metadata has a key entry = bibkey
> Each source in the metadata has an additional doi
> In the data, there is a column source/bibkey linking to the metadata

Repeating this to make sure I understand correctly: This will need two new keys in the metadata sources object: bibkey and doi. The value of 'bibkey' can also be found in a cell of the data in a column named 'source' or 'bibkey'. The value of 'doi' is only a doi, so it needs to be connected directly to the 'bibkey'. The structure needs to make sure that only one 'doi' belongs to one 'bibkey'. Did I understand correctly, @Ludee?

Does this also mean that with this solution only one source can be referenced per row? Would it make sense to recommend an extra column in the data that includes the doi? It would make it possible to match bibkey and doi automatically for the metadata. That function would still need to be written of course, but it might make creating the sources for the metadata quite a bit easier.
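A minimal sketch of what such a matching function could look like, assuming the data has both a `bibkey` and a `doi` column. The function name `build_sources`, the row layout and the sample values are hypothetical, and `bibkey` and `doi` are only the keys proposed in this thread, not part of a released OEMetadata version.

```python
def build_sources(rows):
    """Collect one metadata source entry per bibkey found in the data rows.

    Raises ValueError if a bibkey is paired with two different DOIs,
    which enforces the "only one doi per bibkey" rule discussed above.
    """
    doi_by_key = {}
    for row in rows:
        key, doi = row["bibkey"], row["doi"]
        # setdefault stores the first DOI seen and returns it on later rows.
        known = doi_by_key.setdefault(key, doi)
        if known != doi:
            raise ValueError(f"bibkey {key!r} maps to both {known!r} and {doi!r}")
    # Sorted by bibkey so the generated metadata has a stable order.
    return [{"bibkey": k, "doi": d} for k, d in sorted(doi_by_key.items())]


# Hypothetical rows; the DOIs are placeholders, not real identifiers.
rows = [
    {"id": 1, "value": 35, "bibkey": "SourceA2018", "doi": "10.0000/example.a"},
    {"id": 2, "value": 4, "bibkey": "SourceB2023", "doi": "10.0000/example.b"},
    {"id": 3, "value": 12, "bibkey": "SourceA2018", "doi": "10.0000/example.a"},
]
sources = build_sources(rows)
```

Note that this sketch still allows only one source per row; referencing several sources per row would need a different column layout.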

@Ludee Ludee added the v1.5 label Jan 18, 2021
@jh-RLI
Contributor

jh-RLI commented May 3, 2021

In modex we use the description field within a source (see the example below) to add table and source identifiers to the metadata. We use the bibtex key as the identifier, but this is not strict: any identifier value (e.g. a primary key id) that references a specific row in the data tables would work. This connects the source in the metadata with the bibtex file (if available). As we use the oedatamodel, we have a source column in the table. We insert the full bibtex citation and/or key there, which also connects the sources in the OEMetadata to the database table row (by id) and/or bibtex key. This is not perfect yet, as we currently need to provide the bibtex file within the datapackage, which is not compatible with the OEP. For future projects I would recommend adding the full citation text and the bibtex key (as backup) in the source column for each row.

Example:

OEMetadata 1.4.1
...
"sources": [
{
        "title": "Impact of weighted average cost of capital, capital expenditure, and other parameters on future utility-scale {PV} levelised cost of electricity",
      ->"description": "[oed-table:scalar],[Bibtexkey:Vartiainen2019] - Impact of weighted average cost of capital, capital expenditure, and other parameters on future utility-scale PV levelised cost of electricity Progress in Photovoltaics: Research and Applications",
        "path": "10.1002/pip.3189",
        "licenses": [
            ...
        ]
    }
]
...

In the source column we can insert the same bibtex key/full citation multiple times in different rows.

image
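To illustrate how the bracketed identifiers in the description field could be read back by a tool, here is a small sketch. The function name `parse_description_tags` and the lower-casing of tag names are my own assumptions; the `[name:value]` pattern is taken from the example above and is a project convention, not part of OEMetadata.

```python
import re

# Matches tags such as "[oed-table:scalar]" or "[Bibtexkey:Vartiainen2019]".
TAG_PATTERN = re.compile(r"\[([^\[\]:]+):([^\[\]]+)\]")


def parse_description_tags(description):
    """Return the bracketed "name:value" tags of a description as a dict.

    Tag names are lower-cased so "Bibtexkey" and "bibtexkey" are treated alike.
    """
    return {
        name.strip().lower(): value.strip()
        for name, value in TAG_PATTERN.findall(description)
    }


description = (
    "[oed-table:scalar],[Bibtexkey:Vartiainen2019] - Impact of weighted "
    "average cost of capital, capital expenditure, and other parameters on "
    "future utility-scale PV levelised cost of electricity"
)
tags = parse_description_tags(description)
```

The bibtex key extracted this way can then be compared with the values in the source column of the table.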

@chrwm
Member

chrwm commented Oct 11, 2021

This issue may need further discussion to find a convenient solution. Hence, it will not yet be considered in oemetadata release v1.5.
The proposed workaround seems to work well for projects that work with bib files, but introducing the key bibkey might deserve a second thought.

@chrwm chrwm removed the v1.5 label Oct 11, 2021
@chrwm chrwm added the v1.6 label Oct 14, 2021
@chrwm
Member

chrwm commented Mar 21, 2023

FYI @srhbrnds @henhuy

Problem scope:

data & metadata = datapackage

  • keep data & metadata user-friendly

Requirements on data & metadata (please add/edit):

  • identify data sources for each data point

  • identify licence and usage rights of the complete datapackage easily
    --> this means it should be easy to find the licence information of the overall dataset and how to use it

  • ensure possibility of identifying licences and rights of the sources that make up the dataset
    --> licence information of each individual source of the dataset should be traceable

example table 1:

| id | region | year | storage_capacity | charging_power | fixed_cost | variable_cost | investment_cost | operational_life_time | mileage | bandwidth_type | version | method | source | comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Germany | 2019 | 35 | 4 | 0.027 | 0.044 | 35000 | 12 | 12127 | {'market_share': 'range'} | | | {'storage_capacity':'LucadeTena2018', 'charging_power':'LucadeTena2018', 'fixed_cost':'ADAC2023', 'variable_cost':'ADAC2023', 'investment_cost':'ADAC2023', 'operational_life_time':'deTena2018', 'mileage':'motointegrator2023', 'occupancy_rate':'InstitutfuerangewandteSozialwissenschaftGmbH2018', 'market_share':'ownAssumptions', 'charging_efficiency' :'ownAssumptions', 'feed_in_efficiency':'ownAssumptions', 'energy_conversion_efficiency':'LucadeTena2018'} | |
| 2 | Germany | 2025 | 35 | 4 | 0.027 | 0.044 | 35000 | 12 | 12127 | {'market_share': 'range'} | | | {'storage_capacity':'deTena2018', 'charging_power':'deTena2018', 'fixed_cost':'ADAC2023', 'variable_cost':'ADAC2023', 'investment_cost':'ADAC2023', 'operational_life_time':'deTena2018', 'mileage':'motointegrator2023', 'occupancy_rate':'Mueller2013', 'market_share':'ownAssumptions', 'charging_efficiency' :'ownAssumptions', 'feed_in_efficiency':'ownAssumptions', 'energy_conversion_efficiency':'deTena2018'} | |
| 3 | Germany | 2030 | 35 | 5.5 | 0.03 | 0.044 | 32000 | 12 | 12127 | {'market_share': 'range'} | | | {'storage_capacity':'deTena2018', 'charging_power':'deTena2018', 'fixed_cost':'ADAC2023', 'variable_cost':'ADAC2023', 'investment_cost':'ADAC2023', 'operational_life_time':'deTena2018', 'mileage':'motointegrator2023', 'occupancy_rate':'Mueller2013', 'market_share':'ownAssumptions', 'charging_efficiency' :'ownAssumptions', 'feed_in_efficiency':'ownAssumptions', 'energy_conversion_efficiency':'deTena2018'} | |
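As a rough illustration of how the per-column source dict in example table 1 could be used to trace a single data point, here is a sketch. The function name `sources_per_column` is hypothetical, and the cell is a shortened copy of the row with id 2. Because the cell uses single quotes it is not valid JSON, so the sketch assumes it is a Python-style dict literal and reads it with `ast.literal_eval`.

```python
import ast


def sources_per_column(source_cell):
    """Parse a `source` cell that holds a dict literal of column -> bibkey."""
    mapping = ast.literal_eval(source_cell)
    if not isinstance(mapping, dict):
        raise ValueError("source cell must contain a dict literal")
    return mapping


# Shortened copy of the `source` cell of the row with id 2 in example table 1.
cell = (
    "{'storage_capacity':'deTena2018', 'fixed_cost':'ADAC2023', "
    "'mileage':'motointegrator2023', 'market_share':'ownAssumptions'}"
)
mapping = sources_per_column(cell)

# Source of one individual data point (row 2, column fixed_cost).
fixed_cost_source = mapping["fixed_cost"]

# All keys cited in this row, e.g. to check them against a bib file.
cited_keys = sorted(set(mapping.values()))
```

The list of cited keys is what a validator would compare against the entries of the linked bib file or the sources in the metadata.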

example table 2:

| id | scenario_id | region | year | input_energy_vector | output_energy_vector | technology | technology_type | parameter_name | value | unit | tags | method | source | comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1909 | 2 | ["Baltic"] | 2016 | air | electricity | wind turbine | offshore | installed capacity | 338.8 | MW | | | WirtschaftundEnergieCap | Dez 15 |
| 1910 | 2 | ["North"] | 2016 | air | electricity | wind turbine | offshore | installed capacity | 2956.1 | MW | | | WirtschaftundEnergieCap | Dez 15 |
| 1911 | 2 | ["BB","BE","BW","BY","HB","HE","HH","MV","NI","NW","RP","SH","SL","SN","ST","TH"] | 2016 | air | electricity | wind turbine | onshore | capital costs | 1288000 | €/MW | | {"value": "Interpolation 2015-2020"} | DEA2020 | p224 |
| 1912 | 2 | ["BB","BE","BW","BY","HB","HE","HH","MV","NI","NW","RP","SH","SL","SN","ST","TH"] | 2016 | air | electricity | wind turbine | onshore | fixed costs | 23280 | €/MW/a | | {"value": "Interpolation 2015-2020"} | DEA2020 | p224 |
| 1913 | 2 | ["BB","BE","BW","BY","HB","HE","HH","MV","NI","NW","RP","SH","SL","SN","ST","TH"] | 2016 | air | electricity | wind turbine | onshore | lifetime | 25.4 | years | | {"value": "Interpolation 2015-2020"} | DEA2020 | p224 |
| 1914 | 2 | ["BB","BE","BW","BY","HB","HE","HH","MV","NI","NW","RP","SH","SL","SN","ST","TH","North","Baltic"] | 2016 | air | electricity | wind turbine | offshore | capital costs | 2714000 | €/MW | | {"value": "Interpolation 2015-2020"} | DEA2020 | p245 |
| 1915 | 2 | ["BB","BE","BW","BY","HB","HE","HH","MV","NI","NW","RP","SH","SL","SN","ST","TH","North","Baltic"] | 2016 | air | electricity | wind turbine | offshore | fixed costs | 53851.8 | €/MW/a | | {"value": "Interpolation 2015-2020"} | DEA2020 | p245 |
| 1916 | 2 | ["BB","BE","BW","BY","HB","HE","HH","MV","NI","NW","RP","SH","SL","SN","ST","TH","North","Baltic"] | 2016 | air | electricity | wind turbine | offshore | lifetime | 25.4 | years | | {"value": "Interpolation 2015-2020"} | DEA2020 | p245 |
| 2133 | 2 | ["BB"] | 2016 | air | electricity | wind turbine | onshore | installed capacity | 5700.03 | MW | | | MaStR2021 | |
| 2134 | 2 | ["BE"] | 2016 | air | electricity | wind turbine | onshore | installed capacity | 11 | MW | | | MaStR2021 | |

Proposed solution for OEM-1.6 or later:

  1. Introduce a key bibSources and keep the old structure within another key, e.g. library

Pro:

  • users have all sources structured in bibfile
  • with the example tables above, individual data points are traceable to individual sources
  • users who don't work with bibfiles can still document their sources

Con:

  • licence information of individual sources would be shifted to the bibfile and not seen directly in the metadata (not a con if this is not a requirement)

"sources": {
    "bibSources": "http://url_to_bib_file_with_bib_file",
    "library": [
        {
            "title": null,
            "description": null,
            "path": null,
            "licenses": [
                {
                    "name": null,
                    "title": null,
                    "path": null,
                    "instruction": null,
                    "attribution": null
                }
            ]
        },
        {
            "title": null,
            "description": null,
            "path": null,
            "licenses": [
                {
                    "name": null,
                    "title": null,
                    "path": null,
                    "instruction": null,
                    "attribution": null
                }
            ]
        }
    ]
}

Licence information and usage rights for the entire datapackage would remain easily accessible in:

"licenses": [
        {
            "name": null,
            "title": null,
            "path": null,
            "instruction": null,
            "attribution": null
        }
    ],
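If the proposed `sources` object were adopted alongside the current list, a reader would have to accept both layouts. The following is only a sketch of that idea: `read_sources` is a hypothetical helper, and `bibSources` and `library` are the key names suggested above, not keys of any released OEMetadata version.

```python
def read_sources(metadata):
    """Return (bib_file_url, individual_sources) for either layout."""
    sources = metadata.get("sources") or []
    if isinstance(sources, dict):
        # Proposed layout: link to a bib file plus the old list under "library".
        return sources.get("bibSources"), list(sources.get("library") or [])
    # Current layout: a plain list of source objects, no bib file link.
    return None, list(sources)


# Hypothetical metadata fragments in the current and the proposed layout.
old_style = {"sources": [{"title": "Source A", "path": None}]}
new_style = {
    "sources": {
        "bibSources": "http://url_to_bib_file_with_bib_file",
        "library": [{"title": "Source A", "path": None}],
    }
}

old_link, old_items = read_sources(old_style)
new_link, new_items = read_sources(new_style)
```

Both calls return the same list of individual sources; only the proposed layout additionally yields the link to the bib file.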

@chrwm
Member

chrwm commented Apr 6, 2023

There is no designated bibfile field for licences.
The bibfile field note is shown by default in the bibliography and could be used for licence information.

@jh-RLI
Contributor

jh-RLI commented Apr 6, 2023

I like the idea suggested by @chrwm. This would completely separate the recommended way of source management from the metadata. Of course, the current solution would still be available.

I will add concerns that might be relevant. Perhaps we decide to include your solution first and try to resolve the issues later.

One objection for me would be that we would require all users to use the Bibtex format if they want to cite sources and link them efficiently in the data. (Maybe this is not even bad)

Another point is that if we keep the current format and use the Bibtex format, we would have to handle two formats if we want to display the sources in the OEP, for example. We would have to save the Bibtex file and read from it to display the source information on the website. (This is feasible but extra work)

@chrwm
Member

chrwm commented Apr 6, 2023

> One objection for me would be that we would require all users to use the Bibtex format if they want to cite sources and link them efficiently in the data. (Maybe this is not even bad)

In my solution, the current way of handling sources and the new key bibSources would exist in parallel, so people could use both.

I agree with the other concern.

@jh-RLI
Contributor

jh-RLI commented Apr 6, 2023

Okay, I thought the use case you presented would add more features, but it is aimed at usability - then there is no concern :)

@chrwm
Member

chrwm commented Apr 11, 2023

From today's meeting:
A link to bibSources seems to be accepted as an extra OEM-key.
The question of whether one should be able to see the licences of the individual sources in OEMetadata in addition to the total licence of the resource (without having to look into the source) is still open.

@chrwm
Member

chrwm commented Aug 7, 2023

To pick things up again - I propose to implement the following solution:

  1. add a link field that links to a file containing the sources, e.g. a bibfile.
  2. move current resource-sources into another field, e.g. individual

"sources": {
    "link": "http://url_to_bib_file_with_bib_file",
    "individual": [
        {
            "title": null,
            "description": null,
            "path": null,
            "licenses": [
                {
                    "name": null,
                    "title": null,
                    "path": null,
                    "instruction": null,
                    "attribution": null
                }
            ]
        },
        {
            "title": null,
            "description": null,
            "path": null,
            "licenses": [
                {
                    "name": null,
                    "title": null,
                    "path": null,
                    "instruction": null,
                    "attribution": null
                }
            ]
        }
    ]
}

  3. if link points to a bibfile, encourage users to save licence information in the note field, which is displayed by default, e.g.:
@misc{huelk2022,
    author = {Hülk, Ludwig and Pleßmann, Guido and Muschner, Christoph and Kotthoff, Florian and Tepe, Deniz},
    title = {open-MaStR - Marktstammdatenregister},
    DOI = {10.5281/zenodo.6807426},
    publisher = {Zenodo},
    year = {2022},
    month = {Jul},
    note = {License information: Marktstammdatenregister - © Bundesnetzagentur für Elektrizität, Gas, Telekommunikation, Post und Eisenbahnen | DL-DE-BY-2.0}
}

image
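To show how licence information stored in the note field could be read back from a bib file, here is a rough sketch. It uses a simple regular expression instead of a real BibTeX parser, so it only handles fields written as `name = {value}` without nested braces. The function name `licence_from_bibtex` is hypothetical, and the "License information:" prefix is just the wording of the example entry above, not an agreed convention.

```python
import re

# Matches simple fields such as: note = {some text without nested braces}
FIELD_PATTERN = re.compile(r"(\w+)\s*=\s*\{([^{}]*)\}")
LICENCE_PREFIX = "License information:"


def licence_from_bibtex(entry):
    """Return the licence text stored in the `note` field, or None."""
    fields = {
        name.lower(): value.strip()
        for name, value in FIELD_PATTERN.findall(entry)
    }
    note = fields.get("note", "")
    if note.startswith(LICENCE_PREFIX):
        return note[len(LICENCE_PREFIX):].strip()
    return None


# Shortened copy of the example entry above.
entry = """@misc{huelk2022,
    title = {open-MaStR - Marktstammdatenregister},
    DOI = {10.5281/zenodo.6807426},
    year = {2022},
    note = {License information: Marktstammdatenregister - © Bundesnetzagentur für Elektrizität, Gas, Telekommunikation, Post und Eisenbahnen | DL-DE-BY-2.0}
}"""
licence = licence_from_bibtex(entry)
```

For real bib files with nested braces or quoted values, a dedicated BibTeX parser would be the safer choice.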

@jh-RLI jh-RLI moved this to To do in OEMetadata v2.0 May 30, 2024
@Ludee Ludee removed the v1.6 label Oct 8, 2024
@Ludee
Member

Ludee commented Oct 16, 2024

We just discussed the current state and possible solutions and came to the conclusion we will not implement an update in the upcoming version 2.0.

@Ludee Ludee removed this from OEMetadata v2.0 Oct 16, 2024
@Ludee Ludee assigned jh-RLI and unassigned christian-rli Oct 16, 2024
@Ludee Ludee added type: feature 🛠 New feature or request status: blocked 🛑 Blocked or impeded progress and removed type: enhancement ⚙️ Improvement of an existing feature labels Nov 13, 2024
@Ludee Ludee added priority: low 🦥 Low priority and removed type: feature 🛠 New feature or request labels Jan 13, 2025
4 participants