This issue was moved to a discussion.

Units and scales (and currency) in Table Schema #216

Closed
rufuspollock opened this issue Sep 24, 2015 · 31 comments

Comments

@rufuspollock
Contributor

rufuspollock commented Sep 24, 2015

STATUS:


Excellent discussion with @dr-shorthair today led me to consider the importance of units and scales (and currency) in JSON Table Schema.

Suggest we could specify at MAY level:

unit: simple-string-descriptor e.g. m/s
unitSemantic: pointer-to-a-url-describing that unit - could be RDF uri
currency:      # could be part of units but think probably better separate
factor: a scaling factor (e.g. 1000 would mean to scale by 1000)
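
To make the proposal concrete, here is a minimal sketch of a field descriptor carrying these properties and of how a consuming tool might apply `factor`. The field name, the placeholder URI, and the helper `scaled_values` are illustrative assumptions, as is the reading of "scale by 1000" as multiplication; none of this is part of a published spec.

```python
# Hypothetical field descriptor using the proposed (MAY-level) properties.
field = {
    "name": "speed",  # illustrative column name
    "type": "number",
    "unit": "m/s",  # simple string descriptor
    "unitSemantic": "http://example.org/units/metre-per-second",  # placeholder URI
    "factor": 1000,  # assumed meaning: multiply stored values by 1000
}


def scaled_values(field, raw_values):
    """Multiply raw column values by the field's factor (default 1)."""
    factor = field.get("factor", 1)
    return [value * factor for value in raw_values]


print(scaled_values(field, [1.5, 2, 0.25]))
```

A field without `factor` is returned unchanged, which keeps the property optional in the spirit of a MAY-level addition.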

References

@s-celles

Pint is a great Python library for units. http://pint.readthedocs.org/

@danfowler
Contributor

Thinking about this through the lens of a Fiscal Data Package profile, a mapping object has been used to give semantic meaning to raw numbers in a budget dataset. As an example, currency is a type currently applied on a field by mapping a source JTS column onto a new field in a mapping object. I'm wondering: should there be a general principle for applying semantic meaning to columns in a CSV, or should we consider the FDP a special case?

Related:

@pwalsh
Member

pwalsh commented Dec 1, 2015

@danfowler JTS already supports currency as a format on number:

@rufuspollock
Contributor Author

@pwalsh I know - though I'm wondering if that was a good idea vs proper units. Note also we did not support "factor" ;-)

@rufuspollock
Contributor Author

OK, I think we should introduce units and factor. Re units the question I would have is to understand any difference between QUDT and dataprotocols units spec.

@danfowler could you take a quick look at QUDT and the units spec and see if you can identify any differences?

@danfowler
Contributor

@rgrp I can take a look.

@dr-shorthair

I would suggest handling currency separate from units of measure, but in the same overall framework along with controlled vocabularies and coordinate reference systems. These are all 'reference systems'.

The special thing about currency is that conversion factors are time-dependent, and the changes are large. This does not apply to typical uom.

There is also some time-dependency in both spatial and temporal coordinate systems due to (a) a moving spatial datum due to plate tectonics (yes, this does matter in applications like precision agriculture) and (b) leap seconds, though in both cases most users would not notice.
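
The difference between a unit of measure and a currency can be sketched as follows: a unit conversion needs one constant factor, while a currency conversion needs a rate that depends on the date. The rate table and both function names are hypothetical, and the exchange rates are made-up placeholders, not real data.

```python
from datetime import date

# Unit conversion: one constant factor, valid for any date.
KM_PER_MILE = 1.609344

# Currency conversion: the factor depends on the date.
# These rates are made-up placeholders, not real exchange rates.
HYPOTHETICAL_EUR_PER_USD = {
    date(2015, 1, 1): 0.80,
    date(2016, 1, 1): 0.90,
}


def miles_to_km(miles):
    """Convert miles to kilometres with a time-independent factor."""
    return miles * KM_PER_MILE


def usd_to_eur(amount, on):
    """Convert using the most recent placeholder rate at or before `on`."""
    applicable = [d for d in HYPOTHETICAL_EUR_PER_USD if d <= on]
    if not applicable:
        raise ValueError(f"no rate known on or before {on}")
    return amount * HYPOTHETICAL_EUR_PER_USD[max(applicable)]
```

The extra `on` argument is the point: a currency column cannot be converted from its descriptor alone, whereas a unit column can.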

@pwalsh
Member

pwalsh commented Mar 7, 2016

@rgrp @danfowler any progress here?

@dr-shorthair great points. I'm wondering, though, if the conversion aspects you highlight are relevant for the spec itself (rather than relevant for potential applications of the spec).

@patcon

patcon commented May 26, 2016

Great discussion! Just wanted to chime in that I think this would be helpful for CSV columns as well :)

@pwalsh
Member

pwalsh commented Jul 12, 2016

@rgrp do you want to move forward on this?

@rgieseke
Contributor

Would that look something like the following?

"schema": {
  "fields": [
    {
      "name": "Year",
      "description": "Year",
      "type": "date"
    },
    {
      "name": "Total",
      "description": "Total carbon emissions from fossil fuel consumption and cement production (million metric tons of C)",
      "type": "number",
      "unit": "Mt",
      "unitSystem": "SI"
    }
  ]

[…]

@roll roll added the backlog label Aug 8, 2016
@rufuspollock
Contributor Author

@rgieseke yes - that is correct. Your unitSystem is an addition by you I assume? And is unit a reference to the dataprotocols units spec or a different one?

@rufuspollock
Contributor Author

rufuspollock commented Aug 9, 2016

@pwalsh next steps here would be:

  • Deciding what exactly to add. e.g. factor and unit
  • Deciding where to add it - I'm thinking this is more of a pattern or extension rather than core ...

@rgieseke
Contributor

rgieseke commented Aug 9, 2016

@rgrp Yes, sorry I mis-remembered units and unitsSemantic. Why would it be units as plural though?

@rufuspollock
Contributor Author

@rgieseke units was a typo which I have corrected - should be unit.

@rnuske

rnuske commented May 4, 2017

We are planning to use Table Schema to describe the inner structure of our resources, but we definitely need to store the unit of measurement. We would therefore very much welcome support for it in the Table Schema spec, so that we don't have to work with custom add-ons.

@danfowler
Contributor

@muehlenpfordt et al at Open Power System Data seem to have produced Data Packages with a unit: attribute at the field level with a string value (e.g. "MW"). I'd be curious to learn what use case that supports in that project.

https://github.com/Open-Power-System-Data/renewable_power_plants/blob/master/validation_and_output.ipynb

@rgieseke
Contributor

rgieseke commented Jun 8, 2017

I also went with unit for each table column

https://github.com/openclimatedata/global-carbon-budget/blob/master/datapackage.json#L59

I think the main use case is to easily read in a data set and apply a unit transformation, e.g. for comparison with another dataset.
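
A rough sketch of that use case, assuming a field descriptor with a plain-string `unit` and a small hand-written factor table. The table `TO_WATTS`, the function `convert_column`, and the example field are hypothetical; a real tool would more likely delegate to a units library.

```python
# Factors to a common base unit (watts). Only SI-prefixed power units here.
TO_WATTS = {"W": 1.0, "kW": 1e3, "MW": 1e6, "GW": 1e9}


def convert_column(values, field, target_unit):
    """Convert values from the field's declared unit to target_unit."""
    source_unit = field.get("unit")
    if source_unit not in TO_WATTS or target_unit not in TO_WATTS:
        raise ValueError(f"cannot convert {source_unit!r} to {target_unit!r}")
    ratio = TO_WATTS[source_unit] / TO_WATTS[target_unit]
    return [value * ratio for value in values]


# Illustrative field, in the style of a column with a declared unit.
capacity_field = {"name": "capacity", "type": "number", "unit": "MW"}
print(convert_column([500, 1250], capacity_field, "GW"))
```

Raising on an unknown or missing unit, rather than guessing, keeps a comparison between two datasets from silently mixing scales.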

@simleo

simleo commented Jun 29, 2017

We have a use case for this in biotracks, see CellMigStandOrg/biotracks#9

@Kenji-K

Kenji-K commented Oct 25, 2017

I have a question. Will there be any specified way of converting measurements from one unit to another? Say celsius to kelvin or fahrenheit. Or is this outside the scope of the spec?

@rufuspollock
Contributor Author

@Kenji-K this would be outside of the spec - it would be something a tool would implement (but the spec could form the basis for that tool's API)
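
As an illustration of what such a tool might do, here is a small sketch of temperature conversion. It also shows why a single `factor` is not enough in general: Celsius and Fahrenheit need an offset as well as a scale. The unit labels (`K`, `degC`, `degF`) and the function name are assumptions for this example, not spec vocabulary.

```python
# Each unit maps to (scale, offset) such that kelvin = value * scale + offset.
TO_KELVIN = {
    "K": (1.0, 0.0),
    "degC": (1.0, 273.15),
    "degF": (5.0 / 9.0, 459.67 * 5.0 / 9.0),
}


def convert_temperature(value, source_unit, target_unit):
    """Convert a temperature by going through kelvin as the common base."""
    scale, offset = TO_KELVIN[source_unit]
    kelvin = value * scale + offset
    target_scale, target_offset = TO_KELVIN[target_unit]
    return (kelvin - target_offset) / target_scale


print(convert_temperature(100, "degC", "degF"))
```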

@rufuspollock
Contributor Author

I also think we may want to move the units draft spec http://specs.okfnlabs.org/units/ back to FD specs /cc @danfowler @pwalsh - now #537

@pwalsh @roll

@rufuspollock rufuspollock changed the title Units and scales (and currency) in JSON Table Schema Units and scales (and currency) in Table Schema Apr 10, 2020
@yohanboniface

Hey, is there still interest in this feature?
We (for the French administration) would use it (basically, tools consuming Table Schema would infer some behavior according to the unit, when defined).
Is there any way we could help it land in the spec?

@peterdesmet
Member

Yes, also interested, to use it for Camtrap DP. Although one can of course expand the Frictionless Table Schema as they want (e.g. adding a unit property for each field) I’d also rather have this as part of the core Table Schema itself.

@rufuspollock
Contributor Author

@yohanboniface yes, a lot of interest. A first step would be a detailed pattern. Note @Stephen-Gates had a go at that in #607 - we are really open to getting a pattern and then turning that into part of the spec.

@DunklesArchipel

Working with scientific data, we are very interested in having units implemented in the schema.
We started implementing a frictionless schema to describe tabular data containing measured quantities and added a unit key to the fields. In principle, the field would be a QuantitiyField. Other properties of the field would be the dimension. A suggested schema can be found here.

For the string notation of the units, we use that from astropy. This allows simple conversion of tabular quantity data into other units. I hope these aspects provide some useful information to improve the specs or even for the validation of scientific data in general.

@roll roll added Table Schema and removed Recipes labels Apr 12, 2024
@frictionlessdata frictionlessdata locked and limited conversation to collaborators Oct 21, 2024
@roll roll converted this issue into discussion #992 Oct 21, 2024
