Make labels available in reduce, apply_dimension etc. #245

m-mohr · 2019-11-21T16:45:16Z

Currently we pass only the data to the callbacks in these processes: aggregate_polygon, aggregate_temporal, apply_dimension, merge_cubes, reduce, resample_cube_temporal. It would be useful to also have the dimension labels available, e.g. for the client band math "magic" or more advanced time series analysis. We should make the label available for each value. This could be achieved either with an additional parameter or with something like a labeled array data type.

m-mohr · 2019-11-26T09:45:33Z

Seems to be useful and needs to be explored:

Whether back-ends can actually provide the data (rasdaman may not be able to do it)
How to pass the data to the reducer

m-mohr · 2019-11-26T11:26:47Z

Telco: It seems useful, let's explore it.

m-mohr · 2019-12-17T13:17:58Z

Idea

Define a data type "assoc-array" (an ordered associative array based on the JSON data type array, i.e. an OrderedDict in Python, an associative array in PHP, a Map in JS, not sure about Java). Keys (strings or numbers) are dimension labels, values are pixel values. There's no JSON equivalent for this, but I don't think this is an issue. You could have { array: [{a:1}, {b: 2}] } or { array: [ ["a", "b"], [1, 2] ] } or {array: { labels: ["a", "b"], values: [1, 2] } } ...
Example (PHP): $data = ["a" => 123, "b" => 567]
Allow easy access to it by extending the from_argument object with an index. This avoids heavy use of array_element or a similar process.
Example: {from_argument: "data", index: "a"} to access a in data.
Additionally, either allow array_element to be used on this data type (and objects?) or define separate processes.
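To illustrate that the three candidate JSON encodings listed above carry the same information, here is a minimal Python sketch that decodes each of them into an OrderedDict. The function names are hypothetical and exist only for this illustration; none of these encodings was adopted in this thread.

```python
from collections import OrderedDict


def from_single_key_objects(encoded):
    """Decode [{"a": 1}, {"b": 2}] (a list of one-entry objects)."""
    result = OrderedDict()
    for item in encoded:
        for label, value in item.items():
            result[label] = value
    return result


def from_parallel_arrays(encoded):
    """Decode [["a", "b"], [1, 2]] (labels first, values second)."""
    labels, values = encoded
    return OrderedDict(zip(labels, values))


def from_labels_values_object(encoded):
    """Decode {"labels": ["a", "b"], "values": [1, 2]}."""
    return OrderedDict(zip(encoded["labels"], encoded["values"]))


# All three candidate encodings of the same labeled data.
a = from_single_key_objects([{"a": 1}, {"b": 2}])
b = from_parallel_arrays([["a", "b"], [1, 2]])
c = from_labels_values_object({"labels": ["a", "b"], "values": [1, 2]})
```

All three results compare equal and keep the label order, which is what makes the choice of encoding mostly a matter of convenience.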

This would be backward compatible, I think. Could be supported by from_node, too.

By default index would be set to false so that an array without keys is returned (as it is now, for backward compatibility). Setting index to true returns the full dict. Setting index to a string or number returns the requested element of the array.
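The three index modes described above could be resolved roughly as follows. This is a sketch under assumptions: resolve_from_argument is a hypothetical helper, not part of any openEO implementation, and the callback arguments are assumed to be held in a plain Python dict.

```python
from collections import OrderedDict


def resolve_from_argument(ref, arguments):
    """Resolve a {"from_argument": ..., "index": ...} reference (hypothetical helper)."""
    value = arguments[ref["from_argument"]]
    index = ref.get("index", False)
    # Use identity checks: bool is a subclass of int in Python, so
    # `index == 1` would wrongly match `True` and vice versa.
    if index is False:
        # Default: return a plain array without keys, as it is today.
        return list(value.values()) if isinstance(value, dict) else value
    if index is True:
        # Return the full labeled structure.
        return value
    # A string or number selects a single element by its label.
    return value[index]


data = OrderedDict([("a", 123), ("b", 567)])
args = {"data": data}

plain = resolve_from_argument({"from_argument": "data"}, args)
full = resolve_from_argument({"from_argument": "data", "index": True}, args)
single = resolve_from_argument({"from_argument": "data", "index": "a"}, args)
```

Because a missing index behaves like false, existing process graphs that only use from_argument would keep receiving a plain array.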

cc @jdries

Example process graph

Changes: https://gist.github.com/m-mohr/ec69ca2fc27a003aa3bd78a8e4b512da/revisions

Before

{
  "dc": {
    "process_id": "load_collection",
    "description": "Loading the data; The order of the specified bands is important for the following reduce operation.",
    "arguments": {
      "id": "Sentinel-2",
      "spatial_extent": {
        "west": 16.1,
        "east": 16.6,
        "north": 48.6,
        "south": 47.2
      },
      "temporal_extent": ["2018-01-01", "2018-02-01"],
      "bands": ["B08", "B04", "B02"]
    }
  },
  "evi": {
    "process_id": "reduce",
    "description": "Compute the EVI. Formula: 2.5 * (NIR - RED) / (1 + NIR + 6*RED + -7.5*BLUE)",
    "arguments": {
      "data": {"from_node": "dc"},
      "dimension": "spectral",
      "reducer": {
        "callback": {
          "nir": {
            "process_id": "array_element",
            "arguments": {
              "data": {"from_argument": "data"},
              "index": 0
            }
          },
          "red": {
            "process_id": "array_element",
            "arguments": {
              "data": {"from_argument": "data"},
              "index": 1
            }
          },
          "blue": {
            "process_id": "array_element",
            "arguments": {
              "data": {"from_argument": "data"},
              "index": 2
            }
          },
          "sub": {
            "process_id": "subtract",
            "arguments": {
              "data": [{"from_node": "nir"}, {"from_node": "red"}]
            }
          },
          "p1": {
            "process_id": "product",
            "arguments": {
              "data": [6, {"from_node": "red"}]
            }
          },
          "p2": {
            "process_id": "product",
            "arguments": {
              "data": [-7.5, {"from_node": "blue"}]
            }
          },
          "sum": {
            "process_id": "sum",
            "arguments": {
              "data": [1, {"from_node": "nir"}, {"from_node": "p1"}, {"from_node": "p2"}]
            }
          },
          "div": {
            "process_id": "divide",
            "arguments": {
              "data": [{"from_node": "sub"}, {"from_node": "sum"}]
            }
          },
          "p3": {
            "process_id": "product",
            "arguments": {
              "data": [2.5, {"from_node": "div"}]
            },
            "result": true
          }
        }
      }
    }
  },
  "mintime": {
    "process_id": "reduce",
    "description": "Compute a minimum time composite by reducing the temporal dimension",
    "arguments": {
      "data": {"from_node": "evi"},
      "dimension": "temporal",
      "reducer": {
        "callback": {
          "min": {
            "process_id": "min",
            "arguments": {
              "data": {"from_argument": "data"}
            },
            "result": true
          }
        }
      }
    }
  },
  "save": {
    "process_id": "save_result",
    "arguments": {
      "data": {"from_node": "mintime"},
      "format": "GTiff"
    },
    "result": true
  }
}

After

{
  "dc": {
    "process_id": "load_collection",
    "description": "Loading the data; The order of the specified bands is important for the following reduce operation.",
    "arguments": {
      "id": "Sentinel-2",
      "spatial_extent": {
        "west": 16.1,
        "east": 16.6,
        "north": 48.6,
        "south": 47.2
      },
      "temporal_extent": ["2018-01-01", "2018-02-01"],
      "bands": ["B08", "B04", "B02"]
    }
  },
  "evi": {
    "process_id": "reduce",
    "description": "Compute the EVI. Formula: 2.5 * (NIR - RED) / (1 + NIR + 6*RED + -7.5*BLUE)",
    "arguments": {
      "data": {"from_node": "dc"},
      "dimension": "spectral",
      "reducer": {
        "callback": {
          "sub": {
            "process_id": "subtract",
            "arguments": {
              "data": [{"from_argument": "data", "index": "B08"}, {"from_argument": "data", "index": "B04"}]
            }
          },
          "p1": {
            "process_id": "product",
            "arguments": {
              "data": [6, {"from_argument": "data", "index": "B04"}]
            }
          },
          "p2": {
            "process_id": "product",
            "arguments": {
              "data": [-7.5, {"from_argument": "data", "index": "B02"}]
            }
          },
          "sum": {
            "process_id": "sum",
            "arguments": {
              "data": [1, {"from_argument": "data", "index": "B08"}, {"from_node": "p1"}, {"from_node": "p2"}]
            }
          },
          "div": {
            "process_id": "divide",
            "arguments": {
              "data": [{"from_node": "sub"}, {"from_node": "sum"}]
            }
          },
          "p3": {
            "process_id": "product",
            "arguments": {
              "data": [2.5, {"from_node": "div"}]
            },
            "result": true
          }
        }
      }
    }
  },
  "mintime": {
    "process_id": "reduce",
    "description": "Compute a minimum time composite by reducing the temporal dimension",
    "arguments": {
      "data": {"from_node": "evi"},
      "dimension": "temporal",
      "reducer": {
        "callback": {
          "min": {
            "process_id": "min",
            "arguments": {
              "data": {"from_argument": "data"}
            },
            "result": true
          }
        }
      }
    }
  },
  "save": {
    "process_id": "save_result",
    "arguments": {
      "data": {"from_node": "mintime"},
      "format": "GTiff"
    },
    "result": true
  }
}
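The benefit of label-based access in the reducer can be sketched in Python. The function below mirrors the nodes of the EVI callback (sub, p1, p2, sum, div, p3) and assumes the callback receives a mapping keyed by the band names given to load_collection; the function name and the pixel values are made up for illustration.

```python
def evi_callback(data):
    """Mirror the EVI callback nodes, reading bands by label instead of position."""
    sub = data["B08"] - data["B04"]        # NIR - RED
    p1 = 6 * data["B04"]                   # 6 * RED
    p2 = -7.5 * data["B02"]                # -7.5 * BLUE
    total = 1 + data["B08"] + p1 + p2      # 1 + NIR + 6*RED + -7.5*BLUE
    div = sub / total
    return 2.5 * div                       # final node (p3)


# Hypothetical reflectance values for a single pixel.
pixel = {"B08": 0.5, "B04": 0.1, "B02": 0.05}
# The same pixel with the bands in a different order.
reordered = {"B02": 0.05, "B08": 0.5, "B04": 0.1}

evi_value = evi_callback(pixel)
evi_reordered = evi_callback(reordered)
```

With positional access, reordering the bands would silently change the result; with labels the two calls return the same value.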

m-mohr · 2020-01-14T13:30:46Z

This can also be useful for the object-based schema in rename_labels' parameter labels.

m-mohr · 2020-01-21T12:13:14Z

The subtype labeled-array is now available, which is an array that has labels stored instead of indices. Labeled arrays can still be used as normal arrays, so you can still pass a labeled array to mean(), for example, without any change to the process graph. The labels can be accessed with the array_* functions, e.g. array_element, array_find and array_labels. Labels take precedence over indices.
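One way to picture this behavior is the Python sketch below. It is only an assumption about how a back-end might model a labeled array, not the actual openEO definition: LabeledArray is a hypothetical class, and array_element and array_labels are simplified stand-ins for the processes of the same name.

```python
from statistics import mean


class LabeledArray(list):
    """A list that additionally carries one label per element (hypothetical)."""

    def __init__(self, values, labels):
        super().__init__(values)
        if len(labels) != len(self):
            raise ValueError("need exactly one label per value")
        self.labels = list(labels)


def array_element(data, key):
    """Look up by label first; fall back to the positional index."""
    labels = getattr(data, "labels", None)
    if labels is not None and key in labels:
        return data[labels.index(key)]
    return data[key]


def array_labels(data):
    """Return the labels, or the plain indices for an unlabeled array."""
    return list(getattr(data, "labels", range(len(data))))


bands = LabeledArray([10, 20, 30], ["B08", "B04", "B02"])
by_label = array_element(bands, "B04")   # resolved via the label
by_index = array_element(bands, 2)       # no such label, so index 2 is used
average = mean(bands)                    # still behaves like a normal array

# Numeric labels shadow indices: the label 0 belongs to the last element here.
numeric = LabeledArray([10, 20, 30], [2, 1, 0])
shadowed = array_element(numeric, 0)
```

The last lookup shows what "labels take precedence" means in practice: with numeric labels, a key is interpreted as a label whenever such a label exists.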

We don't need a JSON encoding yet. With the changes in #254 to rename_labels, we have no place yet where we need a JSON encoding for labeled arrays in process graphs. So I didn't invent one yet.

The shortcut to access data without array_element, e.g. {from_argument: "data", index: "a"} is not included yet. I guess I'll combine these changes with #161?!

m-mohr transferred this issue from Open-EO/openeo-processes Dec 19, 2019

m-mohr added this to the v1.0-rc1 milestone Dec 19, 2019

m-mohr added the "process graphs" and "processes" (Process definitions and descriptions) labels Dec 19, 2019

This was referenced Jan 13, 2020

sort, rearrange, order don't work with apply_dimension Open-EO/openeo-processes#117

Closed

labels: Get dimension labels Open-EO/openeo-processes#120

Merged

m-mohr added a commit that referenced this issue Jan 21, 2020

Adds labeled arrays for #245

98b4940

This was referenced Jan 21, 2020

Labeled arrays (for callbacks) #255

Merged

Added support for labeled arrays Open-EO/openeo-processes#133

Merged

m-mohr added a commit that referenced this issue Jan 21, 2020

Adds labeled arrays for #245

d90cb83

m-mohr added the has PR label Jan 21, 2020

m-mohr closed this as completed Jan 27, 2020
