Can we make the RAPI request identical to a Processes Execute Request? #17

Closed
jerstlouis opened this issue Mar 9, 2021 · 19 comments · Fixed by #27
Labels
API OGC API - Routes

Comments

@jerstlouis
Member

jerstlouis commented Mar 9, 2021

Given that several steps have been taken in OGC API - Processes to simplify and make more concise execution requests, could the routing API go the extra step to harmonize the routing requests, whether implemented as a stand-alone Routes API, or integrated within an OGC API - Processes service?

This would make it possible for generic OGC API - Processes clients to access OGC API - Routes services implementing only Part 1, provided they could be used with a separate generic routing process description.

The only remaining required changes would be:

  • embedding the routing request parameters inside an inputs object property, and
  • wrapping values inside an object with a value property

This would mean going from:

{
  "waypoints":
  {
    "type": "MultiPoint",
    "coordinates": [
       [ 36.1234515, 32.6453783 ],
       [ 36.1214698, 32.655952  ],
       [ 36.1247213, 32.7106286 ]
   ]
  },
  "preference" : "fastest",
  "height" : 4.5
}

to:

{
  "inputs": {
    "waypoints":
    {
      "value": {
        "type": "MultiPoint",
        "coordinates": [
          [ 36.1234515, 32.6453783 ],
          [ 36.1214698, 32.655952  ],
          [ 36.1247213, 32.7106286 ]
        ]
      }
    },
    "preference" : { "value": "fastest" },
    "height" : { "value": 4.5 }
  }
}

My own opinion is that this is now a small cost to pay in terms of simplicity and elegance, in exchange for the interoperability benefits and for avoiding a situation with two different competing routing APIs.
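The two changes listed above (an inputs member plus a value wrapper per parameter) are mechanical. As a minimal sketch, assuming the request shapes shown in this comment, a client could derive the execute request from the flat route definition like this. The function name `to_execute_request` is hypothetical and not part of either specification.

```python
import json


def to_execute_request(route_request: dict) -> dict:
    """Wrap a flat route definition as {"inputs": {name: {"value": ...}}}."""
    return {
        "inputs": {name: {"value": value} for name, value in route_request.items()}
    }


# Flat Routes-style request, taken from the example in this comment.
route_request = {
    "waypoints": {
        "type": "MultiPoint",
        "coordinates": [
            [36.1234515, 32.6453783],
            [36.1214698, 32.655952],
            [36.1247213, 32.7106286],
        ],
    },
    "preference": "fastest",
    "height": 4.5,
}

execute_request = to_execute_request(route_request)
print(json.dumps(execute_request, indent=2))
```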

@skyNacho

@jerstlouis please correct me if I am wrong, but wouldn't this approach force the output, that is, the Route Exchange Model, to be embedded inside an outputs object property, with all its attributes containing an id and value property? Yesterday at the developer track I was playing with the OGC API - Processes implementation in pygeoapi, and it was my understanding that the output format would be quite different from our REM spec.

@jerstlouis
Member Author

jerstlouis commented Mar 25, 2021

@skyNacho If you look at the latest draft 7.9.3 Response, you will see:

If the "response" attribute of the execute request was set to "raw", the content of the response SHALL only include the one output selected by the execute request body.

I am missing the "response" : "raw" and "mode" : "sync" attributes, as well as the "output" section, in my execute request above. That could simply be an extra requirement when talking to a generic Processes service, which a RAPI-only service could safely ignore (at least the main part would be identical). The output section in the execute request is something I hope can become optional when there is a single output and a default format (that is already explicitly specified in Workflows).

In document response or async mode, you would either have an embedded JSON object for value following the REM spec, or you would have an href to a JSON file, still following the REM spec exactly.

@cportele
Member

The extra inputs wrapper object might be ok, but IMHO all the value wrappers for each property would be too much "noise". If Processes really needs them, the input model is obviously quite different. Wouldn't it be possible for Processes to allow that the value of a property is either just the value (as in the Routes case) or some wrapper object that adds additional metadata in addition to the value?

But even if not, it seems straightforward for agents that really need to transform between both representations to do so, so I do not see a problem with having different representations, if there is a clear mapping between the two.
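To illustrate how small such a mapping could be, here is a sketch of the reverse direction, assuming the wrapped form from the first comment. The name `to_route_request` is hypothetical. Note that it treats any object whose only member is "value" as a wrapper, which is exactly the kind of ambiguity for object-valued inputs that a real mapping would need to settle.

```python
def to_route_request(execute_request: dict) -> dict:
    """Flatten {"inputs": {...}} back into a plain route definition.

    Inputs given as {"value": x} are unwrapped; bare values pass through.
    """
    flat = {}
    for name, item in execute_request["inputs"].items():
        if isinstance(item, dict) and set(item) == {"value"}:
            flat[name] = item["value"]  # wrapper object: keep only the payload
        else:
            flat[name] = item  # already a bare value
    return flat


wrapped = {
    "inputs": {
        "waypoints": {"value": {"type": "MultiPoint", "coordinates": [[36.12, 32.64]]}},
        "preference": {"value": "fastest"},
        "height": 4.5,
    }
}
print(to_route_request(wrapped))
```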

@jerstlouis
Member Author

@cportele Well, potentially it could be feasible with a oneOf for String, Number, Bool and Arrays. If we could convince the group, I would definitely welcome the lighter way to specify simple fixed parameters input.

However, for Objects (as in the case of waypoints), depending on how the parser works, it might be more difficult, and there's more potential for conflict with the object type Processes expects.
Processes expects an object allowing to use either href or value, and specify the format for the data being passed in.
href might actually be useful for Routes, e.g. to reference a list of waypoints and/or obstacles accessible from a URL somewhere.
In Workflows, there are additional types of objects / properties allowing for inputs to be generated by other processes, or available from an OGC API collection. e.g. we use this for the dataset property:

   "dataset" : { "collection" : "https://maps.ecere.com/ogcapi/collections/osm:dc:roads" }

The waypoints and/or dataset and/or obstacles could also potentially be generated by another process, or available from an OGC API collection.
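A rough sketch of what a server-side parser would face if bare scalars and arrays were allowed next to the wrapper objects described above. The function `classify_input` and its return shape are illustrative assumptions, not anything defined by Processes or Workflows. The last call shows the conflict: a bare GeoJSON object carries none of the wrapper keys, so this simple rule cannot accept it.

```python
def classify_input(item):
    """Return (kind, payload) for one input of an execute request."""
    # Bare scalars and arrays are unambiguous: they can only be inline values.
    if isinstance(item, (str, int, float, bool, list)):
        return ("value", item)
    # Objects are recognised only through one of the known wrapper keys.
    if isinstance(item, dict):
        for key in ("value", "href", "collection"):
            if key in item:
                return (key, item[key])
    raise ValueError("cannot tell whether this object is a wrapper or a value")


print(classify_input("fastest"))
print(classify_input({"collection": "https://maps.ecere.com/ogcapi/collections/osm:dc:roads"}))
try:
    classify_input({"type": "MultiPoint", "coordinates": []})
except ValueError as err:
    print("ambiguous:", err)
```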

@jerstlouis
Member Author

@cportele Processes has just accepted the change to remove the need for { "value" : <value> } (with a question mark on object types potentially conflicting).

Inching ever closer to a fully harmonized request :)

@cportele
Member

Excellent 👍

@jerstlouis
Member Author

The nice thing about "value" for objects is that we could optionally reference the GeoJSON schema to make the input request self-describing, like so:

{
  "inputs": {
    "waypoints":
    {
      "schema" : { "$ref" : "https://geojson.org/schema/MultiPoint.json" },
      "value": {
        "type": "MultiPoint",
        "coordinates": [
          [ 36.1234515, 32.6453783 ],
          [ 36.1214698, 32.655952  ],
          [ 36.1247213, 32.7106286 ]
        ]
      }
    },
    "preference" : "fastest",
    "height" : 4.5
  }
}

@cportele
Member

In our last call I took the action to review latest Processes draft with respect to this issue.

It seems the change with the values has not yet been included in the current draft (link), but let's assume this will change.

Looking at the other members beside value, I would say that

  • href is unnecessary, because for references the value could simply be replaced with a { '$ref': '...' } object that points to the remote value;
  • uom seems to be a hack and it would be better to use an object for measure values.

That leaves format. This is only relevant for string input, AFAICT. This information should also be part of the schema and not a separate member. The Vocabulary for the Contents of String-Encoded Data can be used, which also includes, for example, contentMediaType .

It looks like the examples in Processes also use title and description, but these should be part of the schema, so this is probably just an error in the examples.

Conclusion: I don't think value is needed for objects either and Processes could be updated accordingly. The only thing that we would need to add in the "compute a new route" request is the inputs member, which would be ok for me.

With respect to JSON Schema references, the recommended practice of JSON Schema should be followed and describedby links should be used.

@jerstlouis
Member Author

jerstlouis commented Apr 12, 2021

About href, that's a good point and I thought of that as well, but some parsers or configurations automatically replace the $ref with what it points to, which is generally not what one would want here.

Regarding uom as an object, the nice thing is that it allows the client to optionally specify a unit, while a service could also interpret an unqualified value as being in the default unit. But maybe if the server accepts a purely numeric value in one default unit, then uom should never be specified, and that default unit would instead be expressed in the input description.
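The first option (an optional unit with a documented default) might look like the sketch below on the server side. The helper `read_measure`, the {"value", "uom"} object shape, and the default unit "m" are assumptions made for illustration only.

```python
DEFAULT_UOM = "m"  # hypothetical default unit, as stated in the input description


def read_measure(item, default_uom=DEFAULT_UOM):
    """Accept either a bare number or an object with "value" and optional "uom"."""
    if isinstance(item, bool):
        raise TypeError("a boolean is not a measure")
    if isinstance(item, (int, float)):
        return float(item), default_uom  # unqualified value: default unit applies
    if isinstance(item, dict) and "value" in item:
        return float(item["value"]), item.get("uom", default_uom)
    raise TypeError("expected a number or a measure object")


print(read_measure(4.5))
print(read_measure({"value": 15, "uom": "ft"}))
```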

@cportele
Member

cportele commented Apr 13, 2021

For the cases where the $ref semantics is not what is desired, the better approach would be to use a Link object as the value (and a canonical URI for the schema of the Link object) to distinguish the semantics. The Link object can also express the media type in the type property. For example:

Value

"imageInput": {
   "href": "https://www.example.com/daraa_dted.geotiff",
   "type": "image/tiff",
   "rel": "enclosure",
   "length": 96354
}

Schema

(Note: the schema was corrected due to an error pointed out in the comment below.)

"imageInput": {
   "title": "Image Value Input",
   "description": "This is an example of an image input.",
   "allOf": [
      {
         "$ref": "http://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/schemas/link.yaml"
      },
      {
         "type": "object",
         "properties": {
            "type": {
               "type": "string",
               "enum": [
                  "image/tiff",
                  "image/jp2"
               ]
            }
         }
      }
   ],
   "example": {
      "href": "https://www.example.com/daraa_dted.geotiff",
      "type": "image/tiff",
      "rel": "enclosure",
      "length": 96354
   }
}

As an aside, some of the examples are probably misleading in the Processes draft. The imageInput parameter states that it is about an "inline image", which would be consistent with the $ref semantics, but it seems as if this is not intended.
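As a hand-rolled illustration of what the schema above constrains (not a substitute for a real JSON Schema validator), the check below requires a string href, as the Link object does, and restricts type to the two listed media types when it is present. The function name `is_valid_image_link` is hypothetical.

```python
ALLOWED_MEDIA_TYPES = {"image/tiff", "image/jp2"}  # the enum from the schema above


def is_valid_image_link(value) -> bool:
    """Check a candidate imageInput value against the Link + media type rules."""
    if not isinstance(value, dict):
        return False
    if not isinstance(value.get("href"), str):
        return False  # a Link object must carry an href
    # "type" is optional in a Link, but if given it must be one of the enum values.
    return "type" not in value or value["type"] in ALLOWED_MEDIA_TYPES


example = {
    "href": "https://www.example.com/daraa_dted.geotiff",
    "type": "image/tiff",
    "rel": "enclosure",
    "length": 96354,
}
print(is_valid_image_link(example))
```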

@gfenoy

gfenoy commented Apr 14, 2021

I like the idea of using the link in the schema definition a lot, and glad to see it used in Processes.

From my understanding of the current state of the OGC API - Processes, the first "type" parameter should be renamed to "schema".

Also, following this rewrite would lead to the following. I am not sure it is correct, but I have added the part of the schema needed to support an execute body containing "Image": "BINARY_FILE", which is probably not very commonly used but, as I understand it, possible in the current state of Processes.

"imageInput": {
    "title": "Image Value Input",
    "description": "This is an example of an image input.",
    "schema": {
        "oneOf": [
	    {
		"allOf": [
		    {
			"$ref": "http://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/schemas/link.yaml"
		    },
		    {
			"type": "object",
			"properties": {
			    "type": {
				"type": "string",
				"enum": [
				    "image/tiff",
				    "image/jp2"
				]
			    }
			}
		    }
		]
	    },
            {
		"type": "string",
		"contentEncoding": "binary",
		"contentMediaType": "image/tiff"
	    },
            {
		"type": "string",
		"contentEncoding": "binary",
		"contentMediaType": "image/jp2"
	    }
        ]
    },
    "example":{
	"href": "https://www.example.com/daraa_dted.geotiff",
	"type": "image/tiff",
	"rel": "enclosure",
	"length": 96354
    }
}

If the previous schema is correct in allowing "imageInput": "BINARY_FILE" in the request body, then it leads me to a question: how can a server implementation know how the string was encoded, its mediaType and its potential schema?

Actually, this format is probably not the best one to illustrate my point, as it does not really apply to images. So let us suppose we support "data": "CONTENT_FILE" in the request body and that the data input can accept geometry data provided as GeoJSON, GML and KML.

Distinguishing between GeoJSON and the other formats will be easy, as GeoJSON would be a JSON object embedded in place of the "CONTENT_FILE". That works in this case because we are discussing a featureCollection, but now suppose you support different kinds of JSON object for an input: how will services be informed of the kind of input provided?

This is the reason why I would have liked to see "encoding", "type" (mediaType) and "schema" within the object for ComplexData inputs. Consequently, we would get something like this for the initial imageInput input.

"imageInput": {
    "title": "Image Value Input",
    "description": "This is an example of an image input.",
    "schema": {
        "oneOf": [
	    {
		"allOf": [
		    {
			"$ref": "http://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/schemas/link.yaml"
		    },
		    {
			"type": "object",
			"properties": {
			    "type": {
				"type": "string",
				"enum": [
				    "image/tiff",
				    "image/jp2"
				]
			    }
			}
		    }
		]
	    },
            {
		"type": "object",
		"properties": {
		    "value": {
			"oneOf": [
			    {
				"type": "string",
				"contentEncoding": "base64",
				"contentMediaType": "image/tiff"
			    },
			    {
				"type": "string",
				"contentEncoding": "binary",
				"contentMediaType": "image/tiff"
			    },
			    {
				"type": "string",
				"contentEncoding": "base64",
				"contentMediaType": "image/jp2"
			    },
			    {
				"type": "string",
				"contentEncoding": "binary",
				"contentMediaType": "image/jp2"
			    }
			]
		    },
		    "type": {
			"type": "string",
			"enum": [
			    "image/tiff",
			    "image/jp2"
			]
		    },
		    "encoding": {
			"type": "string",
			"enum": [
			    "base64",
			    "binary"
			]	
		    }
		}
	    },
            {
		"type": "object",
		"properties": {
		    "value": {
			"type": "string",
			"contentEncoding": "binary",
			"contentMediaType": "image/jp2"
		    }
		}
	    }
        ]
    },
    "example":{
	"href": "https://www.example.com/daraa_dted.geotiff",
	"type": "image/tiff",
	"rel": "enclosure",
	"length": 96354
    }
}

I would like to know what you think of proposing a way to let the Processes client state the encoding, mediaType and potential schema associated with a value passed as a string (or as an object for application/json content, with the schema URL being required when several options are available).

@cportele
Member

From my understanding of the current state of the OGC API - Processes, the first "type" parameter should be renamed to "schema".

@gfenoy - Duh, thanks for pointing this out! There was an error in the schema, there is no "type" member. The "allOf" defines the type as the combination of the general link type and the specific media type constraint. I have corrected my comment above.

Adding additional inline options with "oneOf" would work, too. However, my feeling is that the resulting structures become too complex and also do not seem to be really useful, because the "contentEncoding" and "contentMediaType" properties are there to specify the string content in the schema. If that cannot be done, because the string may be one of 3 media types and the encoding may be binary or base64, these schema capabilities do not add any values, because you have to declare the media type and encoding in the content anyhow. Why not create a general type for inline content with "value"/"type"/"encoding" properties and then use another "allOf" construct to constrain the values for the specific case - similar to the approach with the Link object?
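A sketch of how a server might consume such a general inline-content object, assuming members named "value", "type" and "encoding" as suggested above. The function `decode_inline`, the default encoding, and the set of supported encodings are assumptions; how a "binary" encoding would be carried inside a JSON string is left open here.

```python
import base64


def decode_inline(content: dict, allowed_types: set):
    """Return (media_type, raw_bytes) for an inline {"value","type","encoding"} object."""
    media_type = content["type"]
    if media_type not in allowed_types:
        # Plays the role of the extra "allOf" constraint for one specific input.
        raise ValueError(f"unsupported media type: {media_type}")
    encoding = content.get("encoding", "base64")  # assumed default
    if encoding == "base64":
        return media_type, base64.b64decode(content["value"])
    if encoding == "utf-8":
        return media_type, content["value"].encode("utf-8")
    raise ValueError(f"unsupported encoding: {encoding}")


inline = {"value": "YWJj", "type": "image/tiff", "encoding": "base64"}
print(decode_inline(inline, {"image/tiff", "image/jp2"}))
```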

@jerstlouis
Member Author

jerstlouis commented Apr 14, 2021

@cportele regarding dropping value completely even for objects, with the Workflows extension, in addition to inline and linked values, we can also specify:

  • An OGC API collection, e.g. { "collection" : "http://maps.ecere.com/ogcapi/collections/blueMarble" }
  • A nested process, which is a recursed process execution object like the top-level object of the document, e.g. { "process" : "http://maps.ecere.com/processes/ElevationContours", "inputs" : { ... } }

so I'm not sure how that could work.

@cportele
Member

@jerstlouis

The collection case looks a lot like a link to me (probably with a specific link relation type). And if not, what is the problem with an object with a "collection" member?

Similarly, what is the problem with the process object?

@jerstlouis
Member Author

jerstlouis commented Apr 14, 2021

@cportele The distinction is that with Core, the href / link is directly the input (e.g. the GeoTIFF or GeoJSON), with a single / hardcoded response.
With Workflows, it's a link to an OGC API collection document, which the requester would first request and analyze and would issue different requests for different area or resolution, and can still use content negotiation e.g. to retrieve GeoTIFF vs. netCDF encoding, or even decide whether to use Features or Coverages if both are available for the same collection (e.g. sensor data, point cloud).

The problem with the "collection" and "process" members, as an example, is that when using an inline value it might be a GeoJSON FeatureCollection object that is expected. With Workflows, in addition to directly embedding the GeoJSON FeatureCollection object, you also have the option to:

  • Point to an OGC API - Features collection (/collections/{collectionId}), and the client will issue a request adding /items and an appropriate bbox based on the area of interest, to generate a variable URL to this GeoJSON FeatureCollection
  • Point to an OGC API - Process which the client will request from to generate the FeatureCollection object as an output

And clients always have the option of using any of those things for the same input:

  • An embedded object
  • A data link
  • An OGC API Collection link
  • An OGC API Processes link (+ inputs)

A Link with different relation types might be used for the latter 3. But how to enable the distinction between the in-line object value vs. the Link object?

Also Workflows' "process" being a link breaks the recursiveness of the whole thing, where "process" and "inputs" are the properties of the top-level object. We were also thinking of additional members for collections like being able to specify a filter for example.
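To make the difference from a plain data link concrete, here is a sketch of the "variable URL" step described above for a Features collection: the requester appends /items and a bbox for its area of interest. The helper name `items_url` and the example.com URL are hypothetical.

```python
from urllib.parse import urlencode


def items_url(collection_url: str, bbox) -> str:
    """Build an /items request for the given area of interest.

    bbox is (min_lon, min_lat, max_lon, max_lat).
    """
    query = urlencode({"bbox": ",".join(str(c) for c in bbox)}, safe=",")
    return collection_url.rstrip("/") + "/items?" + query


print(items_url("https://example.com/ogcapi/collections/roads", (36.1, 32.6, 36.2, 32.8)))
```

With a data link, by contrast, the href is fetched as-is and always yields the same document.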

@cportele
Member

@jerstlouis - I think the discussion about the design of the Processes workflow extension should take place in the Processes SWG; at least, I cannot contribute, as I have not followed the discussions and do not have a good understanding of it.

My conclusion from a Routes perspective is that it should be possible for Processes to allow the use of custom objects (e.g. a GeoJSON MultiPoint) as a value of a property of the route definition without being constrained to a fixed template for such objects. Processes can define additional rules how to support specific cases (e.g., by pre-defining certain types of objects or object templates with associated semantics), but this should not constrain other OGC API resources (that someone may want to implement using resources from Processes).

@jerstlouis
Member Author

jerstlouis commented Apr 15, 2021

@cportele Sure we could open a new issue in Processes to continue the discussion, but this is about the change proposed in this issue removing the "value" vs. "href" (and in Workflows vs. "collection" vs. "process") wrapper object.

Leaving Workflows aside for the sake of simplicity (though the problem is the same), how do we distinguish between a "Link" and a "FeatureCollection" object (e.g. both have a member named "type", with different meanings) when both are allowed for a particular input?

That is, in Processes the client has the option of either embedding a FeatureCollection or linking to one in its process execution request.
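One conceivable heuristic, shown purely to make the problem concrete: since a Link object must have an href and a GeoJSON FeatureCollection must have "type": "FeatureCollection" and a features array, a parser could key on those members instead of on "type" alone. The function `classify` is hypothetical, and the heuristic does not generalise to arbitrary inline objects, which is the concern raised here.

```python
def classify(obj: dict) -> str:
    """Guess whether an input object is a Link or an inline FeatureCollection."""
    is_link = isinstance(obj.get("href"), str)
    is_fc = obj.get("type") == "FeatureCollection" and isinstance(obj.get("features"), list)
    if is_link and not is_fc:
        return "link"
    if is_fc and not is_link:
        return "feature-collection"
    return "ambiguous"


print(classify({"href": "https://www.example.com/fc.json", "type": "application/geo+json"}))
print(classify({"type": "FeatureCollection", "features": []}))
print(classify({"type": "something-else"}))
```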

@gfenoy

gfenoy commented Apr 15, 2021

Adding additional inline options with "oneOf" would work, too. However, my feeling is that the resulting structures become too complex and also do not seem to be really useful, because the "contentEncoding" and "contentMediaType" properties are there to specify the string content in the schema. If that cannot be done, because the string may be one of 3 media types and the encoding may be binary or base64, these schema capabilities do not add any values, because you have to declare the media type and encoding in the content anyhow. Why not create a general type for inline content with "value"/"type"/"encoding" properties and then use another "allOf" construct to constrain the values for the specific case - similar to the approach with the Link object?

@cportele very glad to hear that we have exactly the same feeling about the resulting structure being very verbose but not very useful; that was precisely the reason I shared it here.

I totally agree with you that when a distinction is possible between the available type / encoding (and potential schema) options, the client request body may carry them in an object with the value added to it.

Just to clarify the previous example about "application/json", which should be a featureCollection (or a MultiPoint, as is the case in Routes): if there is no option available (meaning that the JSON object can only be a MultiPoint and not, for instance, a Polygon or anything else; I know that makes no sense for routes, but it is an example), then with my proposed solution for Processes we would be able to use the JSON object directly (so the MultiPoint object itself, "param": MY_MULTIPOINT_OBJ). The only thing is that the ProcessDescription would state that the input schema is a "oneOf" of the ["allOf" link object you provided] and [the type: object schema $ref multipointSchema]. Maybe this is not illustrated well with Routes, as it does not make sense to provide the MultiPoint by reference, but suppose you need to access this input remotely for some reason.

This is the reason why I said that we may easily distinguish between GeoJSON and other types: the data will be provided as a JSON object and not a string. The issue was about KML and GML, which would be provided as a string; this is where type, encoding and schema may apply.

I would like to mention that I don't much like the "schema" name I use here, because it was defined this way in WPS 1.0.0, where it corresponds to an XML schema, so it may be named differently; contentSchema may be an option. I don't really know what to use, but in that case it would be the XML schema for KML and GML (and its supported version).

I hope this clarifies my point. To be clear, I am definitely +1 for this part: "Why not create a general type for inline content with "value"/"type"/"encoding" properties and then use another "allOf" construct to constrain the values for the specific case - similar to the approach with the Link object?"

I hope it is clear that, for me, we should have this "value" / "type" / "encoding" only when choices are possible; when only one kind of input can be provided, there is no issue with directly using "string" / "contentEncoding" / "contentMediaType" / "contentSchema" (if any).

Maybe the best way to be clear would be to go back to the example of the KML / GML / GeoJSON kind of "featureCollection" (what I like about this example is that featureCollection applies to both GML and GeoJSON). As I don't have the inline element definition yet, please forgive me for giving the example using the previous syntax. Here is the definition of an input "A" from the ProcessDescription:

"A": {
    "title": "Geometry collection",
    "description": "This is an example of a generic geometry collection input.",
    "schema": {
        "oneOf": [
	    {
		"allOf": [
		    {
			"$ref": "http://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/schemas/link.yaml"
		    },
		    {
			"type": "object",
			"properties": {
			    "type": {
				"type": "string",
				"enum": [
				    "text/xml",
				    "application/json"
				]
			    }
			}
		    }
		]
	    },
            {
		"type": "object",
		"properties": {
		    "value": {
			"oneOf": [
			    {
				"type": "string",
				"contentEncoding": "utf-8",
				"contentMediaType": "text/xml",
				"contentSchema": "http://schemas.opengis.net/gml/3.2.1/gml.xsd"
			    },
			    {
				"type": "string",
				"contentEncoding": "utf-8",
				"contentMediaType": "text/xml",
				"contentSchema": "http://schemas.opengis.net/kml/2.3/ogckml23.xsd"
			    }
			]
		    },
		    "type": {
			"type": "string",
			"enum": [
			    "text/xml"
			]
		    },
		    "encoding": {
			"type": "string",
			"enum": [
			    "utf-8"
			]
		    },
		    "XschemaX": {
			"type": "string",
			"enum": [
			    "http://schemas.opengis.net/kml/2.3/ogckml23.xsd",
			    "http://schemas.opengis.net/gml/3.2.1/gml.xsd"
			]	
		    }
		}
	    },
            {
		"type": "object",
		"$ref": "https://geojson.org/schema/FeatureCollection.json"
	    }
        ]
    },
    "example":{
	"href": "https://www.example.com/server/collection.json",
	"type": "application/json",
	"rel": "enclosure",
	"length": 1542
    }
}

I used XschemaX to highlight that I don't really know how to name this parameter. Also, in that case the requestBody can, but is not required to, contain the type and encoding parameters when providing a string value, as the server is sure this string is an XML file; there is "no choice", as I said earlier.

@gfenoy

gfenoy commented Apr 15, 2021

Sorry, but actually, reading this last definition, there is still something missing: a way to specify the XschemaX to use when one wants to send a reference that is to be fetched.


5 participants