How do you cancel or query the state of an action request? #302

Closed
benfrancis opened this issue Nov 23, 2018 · 60 comments
Labels
ActionAffordance Topic related to the Actions Affordance Defer to next TD spec version This topic is not covered in this charter, maybe included for the next TD version.

Comments

@benfrancis
Member

In Mozilla's Web Thing API, an action can be requested using an HTTP POST request on an Action resource to create an ActionRequest resource. The Action resource is essentially an action queue, consisting of multiple ActionRequest resources.

The response to the POST provides a unique URL for the ActionRequest resource, which can then have its status queried with a GET or be cancelled with a DELETE. A list of all current requests can be retrieved by a GET on the Action resource.

How would this API be described in a Thing Description following the current draft specification? Or is there another intended way to achieve these use cases?

@benfrancis
Member Author

To provide some additional context for people who don't want to read the Web Thing API specification...

To request an action, the Web Thing API uses a POST request, e.g.

POST https://mythingserver.com/things/lamp/actions/fade
Accept: application/json

{
  "fade": {
    "input": {
      "level": 50,
      "duration": 2000
    }
  }
}

Response:

201 Created

{
  "fade": {
    "input": {
      "level": 50,
      "duration": 2000
    },
    "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
    "status": "pending"
  }
}

You can get a list of current action requests.

Request:

GET /things/lamp/actions/fade
Accept: application/json

Response:

200 OK
[
  {
    "fade": {
      "input": {
        "level": 50,
        "duration": 2000
      },
      "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
      "timeRequested": "2017-01-25T15:01:35+00:00",
      "status": "pending"
    }
  },
  {
    "fade": {
      "input": {
        "level": 100,
        "duration": 2000
      },
      "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
      "timeRequested": "2017-01-24T11:02:45+00:00",
      "timeCompleted": "2017-01-24T11:02:46+00:00",
      "status": "completed"
    }
  }
]

You can get the status of an action request.

Request:

GET /things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655
Accept: application/json

Response:

200 OK
{
  "fade": {
    "input": {
      "level": 50,
      "duration": 2000
    },
    "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
    "timeRequested": "2017-01-25T15:01:35+00:00",
    "status": "pending"
  }
}

You can cancel an action request.

Request:
DELETE /things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655

Response:
204 No Content

You can also get a list of action requests of all types with a GET request to an Actions resource (whose URL is provided by the top level links member).

Request:

GET /things/lamp/actions
Accept: application/json

Response:

200 OK
[
  {
    "fade": {
      "input": {
        "level": 50,
        "duration": 2000
      },
      "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
      "timeRequested": "2017-01-25T15:01:35+00:00",
      "status": "pending"
    }
  },
  {
    "reboot": {
      "href": "/things/lamp/actions/reboot/124e4568-f89b-22d3-a356-427656",
      "timeRequested": "2017-01-24T13:13:33+00:00",
      "timeCompleted": "2017-01-24T13:15:01+00:00",
      "status": "completed"
    }
  }
]

And for completeness you can also request an action on the top level Actions resource if you want to.

Request:
POST https://mythingserver.com/things/lamp/actions/
Accept: application/json

{
  "fade": {
    "input": {
      "level": 50,
      "duration": 2000
    }
  }
}

Response:

201 Created

{
  "fade": {
    "input": {
      "level": 50,
      "duration": 2000
    },
    "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
    "status": "pending"
  }
}

How can all of this be expressed in a Thing Description, including the payload formats and possible error responses?
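
As a rough illustration of the client side of the exchange above, the sketch below parses the body of a 201 Created response and resolves the relative href against the URL the request was sent to. The names parse_action_request and ActionRequest are hypothetical and not part of the Web Thing API; the sample payload is the one shown above.

```python
import json
from typing import NamedTuple
from urllib.parse import urljoin


class ActionRequest(NamedTuple):
    """Hypothetical client-side view of one ActionRequest resource."""
    name: str
    url: str
    status: str


def parse_action_request(base_url: str, body: str) -> ActionRequest:
    """Parse a response body of the form {"<action name>": {...}}."""
    payload = json.loads(body)
    # Assumption: the payload holds exactly one key, the action name.
    (name, details), = payload.items()
    # "href" is relative in the examples, so resolve it against the request URL.
    return ActionRequest(name, urljoin(base_url, details["href"]), details["status"])


created = parse_action_request(
    "https://mythingserver.com/things/lamp/actions/fade",
    '{"fade": {"input": {"level": 50, "duration": 2000},'
    ' "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",'
    ' "status": "pending"}}',
)
print(created.url)
```

The resulting URL is the one a client would then GET to query the status or DELETE to cancel the request.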

@draggett
Member

draggett commented Nov 23, 2018

By contrast, in Arena, HTTPS POST takes the action input as the body of the request, and returns the action output as the body of the response. This follows the basic semantics for POST. If a developer wants a cancellable process that is initiated by an action, that can be layered on top of the core patterns of actions and events. You could return a process ID for the action that initiates the process, and provide another action to cancel an active process using the process ID. You can likewise define a progress event that passes the process ID together with status information. My take is thus that the core semantics for the Web of Things should be really simple and more complex models can be layered on top. This principle also applies to APIs for event logs and for querying a history of property updates in the case of telemetry streams.
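
A minimal sketch of that layering, assuming an in-memory thing and a callback standing in for the event channel. All names here (ProcessThing, start_process, cancel_process, the shape of the progress event) are hypothetical illustrations, not Arena APIs.

```python
import itertools
from typing import Callable, Dict


class ProcessThing:
    """Hypothetical thing exposing two actions and one progress event."""

    def __init__(self, emit: Callable[[dict], None]) -> None:
        self._emit = emit                      # stands in for an event channel
        self._ids = itertools.count(1)
        self._status: Dict[int, str] = {}

    def start_process(self, action_input: dict) -> dict:
        """Action: start a long-running process; the output is its process ID."""
        process_id = next(self._ids)
        self._status[process_id] = "running"
        self._emit({"processID": process_id, "status": "running"})
        return {"processID": process_id}

    def cancel_process(self, process_id: int) -> dict:
        """Action: cancel an active process identified by its process ID."""
        if self._status.get(process_id) != "running":
            return {"cancelled": False}
        self._status[process_id] = "cancelled"
        self._emit({"processID": process_id, "status": "cancelled"})
        return {"cancelled": True}


events = []
thing = ProcessThing(events.append)
pid = thing.start_process({"level": 50, "duration": 2000})["processID"]
result = thing.cancel_process(pid)
```

Here each action still returns its output in the response body, and the cancellable process is modelled purely through the ID and the progress events.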

@benfrancis
Member Author

benfrancis commented Nov 23, 2018

If the argument for forms vs. links is for backwards compatibility with existing IoT APIs using declarative protocol bindings, then it should be possible to describe the API described above using a Thing Description alone. Describing Mozilla's Web Thing API should be a particularly easy example as it was developed in parallel with the Thing Description specification and its data model is already aligned.

The solution you describe requires changing the API itself, and doesn't explain how the client would know from the Thing Description that the ID returned by the action request can be used to cancel that request by using a different action. Can you provide an example Thing Description which provides the client with all of that information in a declarative protocol binding?

@benfrancis
Member Author

benfrancis commented Nov 23, 2018

By contrast, in Arena, HTTPS POST takes the action input as the body of the request, and returns the action output

What happens if the action is a long-running process which doesn't complete by the time the HTTP request times out? (This was the reason we created the action queue API, and is a difference between requesting an action and simply setting a property.)

@draggett
Member

Developers are responsible for documenting the purpose of properties, actions and events. Picking appropriate names would help, e.g. an action called cancelProcess with an input named processID.

@draggett
Member

The timeout for HTTP requests is often client dependent. We could standardise how to express an indication of the maximum expected duration of a long lived process as part of the metadata for an action. Alternatively, developers could use the processID design pattern described above.

@benfrancis
Member Author

benfrancis commented Nov 23, 2018

Developers are responsible for documenting the purpose of properties, actions and events. Picking appropriate names would help, e.g. an action called cancelProcess with an input named processID.

This assumes the involvement of a human to interpret those names and write custom code for that specific web thing, which doesn't allow for ad-hoc interoperability.

The timeout for HTTP requests is often client dependent. We could standardise how to express an indication of the maximum expected duration of a long lived process as part of the metadata for an action. Alternatively, developers could use the processID design pattern described above.

The duration of an action may be user defined e.g. an action to fade a light from 0 to 100% brightness over the course of 1 hour.

Again, your proposed solution requires changing the API. But how would you get the current status of an action in this model? e.g. to find out if the action succeeded or failed.

@draggett
Member

I agree that when it comes to describing existing services using a standardised machine interpretable format, this gets increasingly complicated, and I question whether this complexity is justified. In the longer time frame we can and should encourage convergence in protocols and their usage across the Internet and this would render declarative protocol bindings a historical legacy that is no longer needed.

An alternative is to provide a platform ID that web of things clients can use to identify what protocols and usage patterns apply to a given thing. This avoids the need for complex representations in every TD.

@benfrancis
Member Author

The point I'm making here is that it's not just difficult to be backwards compatible with existing APIs, it may actually be impossible without significantly more complexity than the current specification allows for.

Is anyone willing to make a stab at describing Mozilla's Web Thing REST API in a Thing Description alone? The issue described here is just one of several problems with trying to do that.

(I think we've already agreed that the Web Thing WebSocket API can not be described in a Thing Description and would require a separate WebSocket subprotocol specification.)

@draggett
Member

The duration of an action may be user defined e.g. an action to fade a light from 0 to 100% brightness over the course of 1 hour.

Yes, but the developer should have some understanding of what the maximum is likely to be before the process can be considered to have failed. That expectation could be given as a metadata property.

Again, your proposed solution requires changing the API.

Yes, but see my previous post that questions the long term commercial need for a complex declarative protocol binding standard.

But how would you get the current status of an action in this model? e.g. to find out if the action succeeded or failed.

You would listen to the status events. In addition, I would design the server and the application to be robust against a loss of network connectivity, the reboot of the client or server, etc.

@mkovatsc
Contributor

We have been discussing this for a long time -- since IG-only work -- and also came to a good conclusion about this. The ideal way to do this is using hypermedia, where a running Action is represented as a Web resource that is dynamically created upon invocation. This running Action can itself have Properties and Actions again. Maybe you remember the discussions we had around a potential application/wot+json media type for this.

We do have all required extension points in place for this: The output of an Action can be such an application/wot+json representation or potentially other hypermedia formats such as CoRAL.

To date, support for this is very limited in existing systems; hypermedia concepts are almost nil. Thus, we decided to focus on the description of deployed systems for now and tackle this issue in the next charter period or in the IG first. The closest we have are custom "ticket responses" that each platform does in a different style. This must be solved by semantically describing the response content and leaving it to the application. @draggett described this approach in his comment further up.

@mkovatsc mkovatsc added the Defer to next TD spec version This topic is not covered in this charter, maybe included for the next TD version. label Nov 23, 2018
@draggett
Member

draggett commented Nov 23, 2018

We have been discussing this for a long time -- since IG-only work -- and also came to a good conclusion about this. The ideal way to do this is using hypermedia, where a running Action is represented as a Web resource that is dynamically created upon invocation. This running Action can itself have Properties and Actions again. Maybe you remember the discussions we had around a potential application/wot+json media type for this.

My very old proposal was to support things as first class types. However, I agree that this is something we can leave to future extensions given the subtleties involved.

@mkovatsc
Contributor

support things as first class types

Simply set the content type to application/td+json, done...

@draggett
Member

That works for limited cases, but isn't a general solution. Object oriented programming languages support objects as first class types, so having things as first class types is something that will be expected for the web of things. At the protocol level we can pass things using the URI for their thing description, or as you suggest, by passing the JSON-LD for the thing description, both are a form of reference to a thing. Interestingly, by passing the JSON-LD explicitly, this corresponds to giving the thing a blank node for its RDF identifier.

If the TD for a thing includes declarations of initial values, the platform should carry out the initialisation. If this involves a thing, the platform needs to retrieve the thing's TD, if not supplied in place, and initialise that thing. This can get a little complicated when you need to deal with forward references, and when the dependencies between things form cycles. I showed how to handle that over two years ago, proving that it is a tractable problem, just as it is for object oriented programming languages.

@sebastiankb sebastiankb added the ActionAffordance Topic related to the Actions Affordance label Jan 23, 2020
@vcharpenay
Contributor

Here is a proposal: in a future version of the TD model, we could at least standardize new operation types to cancel, query (and update?) an invoked action: something like cancelaction, queryaction, updateaction. Each operation type would not necessarily be used in a TD directly but it could be used as part of a Link header or some hypermedia-aware response payload to drive WoT consumers.

@vcharpenay
Contributor

Having said that, it is already possible with the current TD spec to specify operations on dynamically created resources. For that, you can define a generic action manage that declares forms on these resources (what you call ActionRequests in Mozilla's WebThings, @benfrancis). Here is a try:

{
    "@context": [
        "https://www.w3.org/2019/wot/td/v1",
        {
            "ActionRequest": "http://example.org/ActionRequest",
            "cancelaction": "http://example.org/cancelActionOperationType",
            "queryaction": "http://example.org/queryActionOperationType"
        }
    ],
    "id": "urn:example:mylamp",
    "actions": {
        "fade": {
            "input": {
                "type": "object",
                "properties": {
                    "level": { "type": "number" },
                    "duration": { "type": "duration" }
                }
            },
            "output": {
                "type": "object",
                "properties": {
                    "href": {
                        "@type": "ActionRequest",
                        "type": "string"
                    },
                    "status": {
                        "enum": [ "pending", "completed" ]
                    }
                }
            },
            "forms": [
                {
                    "href": "https://mythingserver.com/things/lamp/actions/fade",
                    "op": "invokeaction"
                }
            ]
        },
        "manage": {
            "uriVariables": {
                "actionRequest": {
                    "@type": "ActionRequest",
                    "type": "string"
                }
            },
            "forms": [
                {
                    "href": "{actionRequest}",
                    "htv:methodName": "DELETE",
                    "op": "cancelaction"
                },
                {
                    "href": "{actionRequest}",
                    "htv:methodName": "GET",
                    "op": "queryaction"
                }
            ]
        }
    }
}

In this example, you can see that I use cancelaction and queryaction, but since they do not exist yet in the TD model, I declared them in the JSON-LD context. The same goes for the class ActionRequest, which indicates that the output of the action invocation is the same as the actionRequest URI variable.
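
To make the intended consumer behaviour concrete, here is a rough sketch of how a client might turn the href returned by fade into a request on the manage forms. The helper build_request is hypothetical, and substituting the value verbatim is an assumption: strict RFC 6570 simple expansion of {actionRequest} would percent-encode the slashes in the path.

```python
from typing import Optional, Tuple

# Trimmed copy of the "manage" action from the Thing Description above.
MANAGE = {
    "forms": [
        {"href": "{actionRequest}", "htv:methodName": "DELETE", "op": "cancelaction"},
        {"href": "{actionRequest}", "htv:methodName": "GET", "op": "queryaction"},
    ]
}


def build_request(action: dict, op: str, action_request: str) -> Optional[Tuple[str, str]]:
    """Return (method, url) for the first form whose op matches, or None."""
    for form in action["forms"]:
        if form["op"] == op:
            # Assumption: the value is inserted as-is, without percent-encoding.
            url = form["href"].replace("{actionRequest}", action_request)
            return form["htv:methodName"], url
    return None


href = "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655"
print(build_request(MANAGE, "cancelaction", href))
```

The only thing linking the output of fade to the input of manage is the shared ActionRequest annotation, so a consumer would have to match on that type to know which value to pass.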

@vcharpenay
Contributor

However, I would also expect (or wish) that future WoT Things have a more hypermedia-driven interface to consumers. In that case, the cancelaction and queryaction operations could be added to each array item in the response of GET /things/lamp/actions/fade.

As @mkovatsc and @draggett said, the same structure as in the TD model could be used for the JSON response. I would suggest one conceptual variant, though: to me, reserving the class Thing for physical objects is important. So, it means that if a new "TD" is returned after invoking an action, it should be interpreted as an extension of the original TD, as if new actions on the same Thing were made available. It makes a big difference when dealing with the semantics of TD documents but the JSON structure would not be significantly impacted.

@egekorkan
Contributor

I support the idea in the previous comment by @vcharpenay.
In order to bring some more examples to the discussion, we have a PanTilt module (think of it as the non-camera part of a CCTV camera) where a stopMovement action can stop any ongoing movements. The source code and TD can be found here.
In addition to actions that take a long time, there can be actions that are started via a request but the physical action never stops. From the previous example, it would be the moveContinuously and panContinuously actions, where the invokeaction request starts the movement and the movement doesn't stop until it hits a limit or a stopMovement action is invoked. A more familiar example would be a conveyor belt that is started with an action and stopped with another action.
A hypermedia based approach was the first one that came to mind but I was not sure how one would describe it, since execution of a form with an op invokeaction would need to return some information that is used by a form with another op value. I think the comment above goes in the right direction by taking this into account. Just that I think it would be better to not introduce another action and maybe pack the manage action into the fade action.

@mlagally
Contributor

mlagally commented Feb 7, 2020

As discussed in the TD call on 7.2. we are looking at different examples.

Oracle's IoT Cloud service has a hypermedia-based action model that supports synchronous and asynchronous operations.

The response payload contains a key "complete" that indicates whether the operation has already finished; if it has not, the "url" member contains a link for querying the status asynchronously.

{
"complete":false,
"id":"72a4239f1644-ccf",
"endpointId":"6248475d6e28-3013",
"url":"https://iotserver/iot/api/version/resource/path",
"method":"Request method",
"status":"Request status. One of [RECEIVED, DISPATCHED, COMPLETED, EXPIRED, FAILED, UNKNOWN].",
"requestTime":"2016-07-22T10:44:57.746Z",
"responseTime":"Time when the response is received by server",
"responseEventTime":"2016-07-22T10:44:57.746Z",
"responseStatusCode":"Request status code from the response message (One of [HTTP 200: OK, HTTP 201: Created, HTTP 202: Accepted, HTTP 203: Non Authoritative Information, HTTP 204: No Content, HTTP 400: Bad Request, HTTP 401: Unauthorized, HTTP 402: Payment Required, HTTP 403: Forbidden, HTTP 404: Not Found, HTTP 405: Method Not Allowed, HTTP 406: Not Acceptable, HTTP 408: Request Timeout, HTTP 409: Conflict, HTTP 500: Internal Server Error, HTTP 502: Bad Gateway, HTTP 503: Service Unavailable].)",
"response":"Original response message payload JSON document"
}

Here's the full API documentation for Invoke action:
https://docs.oracle.com/en/cloud/paas/iot-cloud/iotrq/op-iot-api-v2-apps-app-id-deviceapps-devapp-id-devicemodels-devicemodel-id-actions-action-name-post.html

Starting point for the API documentation:
https://docs.oracle.com/en/cloud/paas/iot-cloud/iotrq/toc.htm
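
As a small sketch of how a client might branch on a payload shaped like the one above: the function next_step is hypothetical, and it relies only on the "complete", "url" and "response" members as described in this comment, not on any further detail of the Oracle API.

```python
from typing import Any, Tuple


def next_step(payload: dict) -> Tuple[str, Any]:
    """Decide what to do with a response shaped like the payload above."""
    if payload.get("complete"):
        # Finished: the original response message is carried in "response".
        return ("done", payload.get("response"))
    # Not finished: "url" is where the status can be queried later.
    return ("poll", payload["url"])


pending = {
    "complete": False,
    "id": "72a4239f1644-ccf",
    "url": "https://iotserver/iot/api/version/resource/path",
}
print(next_step(pending))
```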

@takuki
Contributor

takuki commented Feb 13, 2020

I would like to point out that the Thing-Consumer protocol should always look forward, not backward.

This means it had better not depend on a transaction model where you can "cancel" a request while it is in action.

A Consumer should be able to make an independent "cancel" request to a Thing, and the Thing makes a best effort to fulfill the request. The fulfillment may be to just stop the action, or the Thing may wait for the action to finish (if it cannot be stopped immediately) and revert to the original state if possible.

Here, note that a Thing may be able to process the cancel request even after the original request was complete. This is why I said the Thing-Consumer protocol should always look forward. Things can decide how best to process the "cancel" request because it is just one of the subsequent action requests.

@mlagally
Contributor

I like the proposal. There's one aspect to consider:
Is the cancel operation synchronous or asynchronous?
If it is asynchronous, would it be possible to abort a long-lasting cancel operation that does not complete?

@zolkis

zolkis commented Feb 13, 2020

For an async operation the cancellation should also be async.
Usually cancelling cannot be guaranteed, so it is always best effort.
Therefore Things need to be designed

  • either in a way that their state captures this detail
  • or supporting transactions and rollbacks.
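
A minimal sketch of the first option, where the Thing's state captures how far a best-effort cancel has got. The five states and the BestEffortThing class are hypothetical and only illustrate the idea; rollback is not modelled.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RUNNING = "running"
    CANCELLING = "cancelling"
    COMPLETED = "completed"
    CANCELLED = "cancelled"


class BestEffortThing:
    """Hypothetical thing whose state records how far a cancel request got."""

    def __init__(self, interruptible: bool) -> None:
        self.interruptible = interruptible
        self.state = State.IDLE

    def invoke(self) -> None:
        self.state = State.RUNNING

    def cancel(self) -> State:
        """Independent request: accepted in any state, fulfilled if possible."""
        if self.state is State.RUNNING:
            # Stop at once if possible, otherwise record that a cancel is pending.
            self.state = State.CANCELLED if self.interruptible else State.CANCELLING
        # In any other state the request is accepted but changes nothing here.
        return self.state

    def finish(self) -> None:
        """Called when the underlying operation ends on its own."""
        if self.state is State.CANCELLING:
            self.state = State.CANCELLED
        elif self.state is State.RUNNING:
            self.state = State.COMPLETED


slow = BestEffortThing(interruptible=False)
slow.invoke()
first = slow.cancel()   # cannot stop immediately, so the cancel stays pending
slow.finish()
```

A Consumer reading the state sees "cancelling" until the operation actually ends, which is what makes the asynchronous cancel observable.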

@sebastiankb
Contributor

sebastiankb commented Feb 21, 2020

Does it make sense to introduce a hypermedia-specific navigation term that gives an indication of where the resource is defined in the payload message that can be used to query the status or cancel an action? E.g., in the case of Oracle it would be

{

            "forms": [
                {
                    "href": "...",
                    "op": "invokeaction",
                    "hypermedia" : "url" //--> points to the JSON term of the response payload message
                }
            ]
        } 

for Mozilla it would look like

{

            "forms": [
                {
                    "href": "...",
                    "op": "invokeaction",
                    "hypermedia" : "href" //--> points to the JSON term of the response payload message
                }
            ]
        } 

Btw: I have checked the MDSP API, and there seems to be no use of the hypermedia approach yet.
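
To sketch how a Consumer could use such a term: the code below looks up the member named by "hypermedia" in the response and resolves it against the request URL. The proposal does not say how nested members are addressed (in the Mozilla payload, href sits inside the object keyed by the action name), so the depth-first search is an assumption, as are the helper names and the sample payloads.

```python
from typing import Any, Optional
from urllib.parse import urljoin


def find_member(node: Any, key: str) -> Optional[Any]:
    """Depth-first search for the first JSON member called `key`."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_member(child, key)
        if found is not None:
            return found
    return None


def status_url(form: dict, response: dict, request_url: str) -> Optional[str]:
    """Resolve the link named by the proposed "hypermedia" form member."""
    term = form.get("hypermedia")
    link = find_member(response, term) if term else None
    if not isinstance(link, str):
        return None
    # Relative links are resolved against the URL the action was invoked on.
    return urljoin(request_url, link)


mozilla = status_url(
    {"op": "invokeaction", "hypermedia": "href"},
    {"fade": {"href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
              "status": "pending"}},
    "https://mythingserver.com/things/lamp/actions/fade",
)
print(mozilla)
```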

@benfrancis
Member Author

On the Thing Description call today we discussed proposed invokeanyaction/queryallactions operations.

In that issue I noted that there are three potential use cases for "querying" an action:

  1. Getting an individual ActionStatus resource regarding an individual action request (e.g. GET /actions/fade/1935-5939-ngu3)
  2. Getting a list of pending action requests for a given action (e.g. GET /actions/fade)
  3. Getting a list of pending action requests for all actions (e.g. GET /actions)

Do we need two operations for Action affordances which distinguish between the first two? E.g. queryaction vs. queryactionrequest?
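
For illustration only, the three cases can be told apart by path depth in the example URLs above. The function classify_query and its labels are hypothetical, and it assumes paths that start at /actions as in the list.

```python
def classify_query(path: str) -> str:
    """Map a GET path to one of the three query use cases listed above."""
    segments = [s for s in path.split("/") if s]
    if not segments or segments[0] != "actions" or len(segments) > 3:
        raise ValueError(f"not an action query path: {path!r}")
    return {
        1: "pending requests for all actions",
        2: "pending requests for one action",
        3: "one action request",
    }[len(segments)]


for example in ("/actions/fade/1935-5939-ngu3", "/actions/fade", "/actions"):
    print(example, "->", classify_query(example))
```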

@sebastiankb
Contributor

Do we need two operations for Action affordances which distinguish between the first two? E.g. queryaction vs. queryactionrequest?

I think, this is a similar analogy to readproperty and readallproperties. In this context it would make sense to have two. Maybe we should use the term queryallactions instead. Option 2 and 3 can be supported by a Thing implementation and be announced at the top level forms:

  "forms": [
    {
      "op": ["queryallactions"],
      "href": "./actions/{ACTION_NAME}"
    },
    {
      "op": ["queryallactions"],
      "href": "./actions"
    }
  ]

@benfrancis
Member Author

@sebastiankb wrote:

I think, this is a similar analogy to readproperty and readallproperties.

I agree in that no. 2 is like readproperty and no. 3 is like readallproperties, but if we were following that example then no. 2 should be in the Action affordance, not a top level form. There is no equivalent of no. 1 for properties because a Property only has one value, whereas an Action may have multiple instances.

  "forms": [
    {
      "op": ["queryallactions"],
      "href": "./actions/{ACTION_NAME}"
    },
    {
      "op": ["queryallactions"],
      "href": "./actions"
    }
  ]

I agree the same name makes sense for both operations, but it may be tricky to define how a Consumer distinguishes between the two if they share the same name.

I wish there was a word in the English language for an instance of an action, but I can't think of one. Some other ideas...

    I.
    1. queryaction - in a Form in the Action affordance
    2. listactions - in a Form in the Action affordance
    3. listallactions - in a top level Form

    II.
    1. queryactionstatus - in a Form in the Action affordance
    2. queryaction - in a Form in the Action affordance
    3. queryallactions - in a top level Form

    III.
    1. queryaction - in a Form in the Action affordance
    2. readactionqueue - in a Form in the Action affordance
    3. readallactionqueues - in a top level Form

@sebastiankb
Contributor

sebastiankb commented Aug 4, 2021

I agree in that no. 2 is like readproperty and no. 3 is like readallproperties, but if we were following that example then no. 2 should be in the Action affordance, not a top level form.

Yes, that makes sense. One idea is to design the top-level form to inform the client that it can query actions with a filter by specifying the name of the actions in the URL, which will return only the status of all active actions with the corresponding action name.

I agree the same name makes sense for both operations, but it may be tricky to define how a Consumer distinguishes between the two if they share the same name.

If we introduce the convention then the client can distinguish based on the URL, right?

I wish there was a word in the English language for an instance of an action, but I can't think of one. Some other ideas...

I would prefer no. II.

@sebastiankb
Contributor

Regarding @benfrancis's comment, I will put this on the agenda of today's TD call.

@egekorkan
Contributor

Just one argument regarding having readaction based verbs for the op: What if in the future we see that there is also a use case for observing an action where the Consumer gets the changes to the state of the action? It might be good to make it aligned with properties.

An opposing argument based on the same "worry" I have: We should make sure that op keywords are different enough that a newcomer does not confuse actions with properties.

Yet another comment:
Reading a property and querying an action are semantically very close. One can say that invoking an action creates a property affordance that is simply temporary, thus having readaction makes sense.

@benfrancis
Member Author

benfrancis commented Aug 4, 2021

Note that the example Thing Description in #302 (comment) is now out of date. Following a review of the proposed action protocol binding for the Core Profile, both the synchronous and asynchronous responses follow the same data schema, which has been expanded to include a href member. I've tried to provide an updated example Thing Description below which covers both cases, but it's not easy.

{
  "@context": "https://www.w3.org/2019/wot/td/v1",
  "id": "urn:ex:thing",
  "actions": {
    "fade": {
      "input": {
        "type": "object",
        "properties": {
          "level": {
            "type": "integer",
            "minimum": 0,
            "maximum": 100
          },
          "duration": {
            "type": "integer",
            "minimum": 0,
            "unit": "milliseconds"
          }
        }
      },
      "output": {},
      "schemaDefinitions": {
        "actionStatus": {
          "status": {
            "type": "string",
            "enum": [ "pending", "running", "completed", "failed" ],
            "required": true
          },
          "output": {
            "required": false
          },
          "error": {
            "type": "object",
            "required": false
          },
          "href": {
            "type": "string",
            "const": "/fade/{id}",
            "required": false
          }
        }
      },
      "forms": [
        {
          "href": "/fade",
          "op": "invokeaction",
          "htv:methodName": "POST",
          "contentType":"application/json",
          "response": {
            "contentType": "application/json",
            "schema": "actionStatus"
          },
          "additionalResponses": {
            "success": "yes",
            "contentType": "application/json",
            "schema": "actionStatus",
            "htv:headers": [ 
              {
                "htv:fieldName": "Location",
                "htv:fieldValue": "/fade/{id}"
              }
            ]
          }
        },
        {
          "href": "/fade/{id}",
          "op": "queryaction",
          "contentType":"application/json",
          "htv:methodName": "GET",
          "response": {
            "contentType":"application/json",
            "schema": "actionStatus"
          }
        },
        {
          "href": "/fade/{id}",
          "op": "cancelaction",
          "htv:methodName": "DELETE"
        }
      ],
      "uriVariables": {
        "id": {
          "type": "string",
          "description": "identifier of action request"
        }
      }
    }
  }
}

The notes from above still apply:

Notes:

  • The response to the invokeaction operation does not follow the output data schema because an asynchronous response to an action invocation does not include the output of the action. Rather, the output schema is used as part of the actionStatus data schema in the follow-up queryaction operation. Separating the output data schema from the response data schema is one of the topics discussed in Consider adding DataSchema to response and additionalResponses #1053
  • I've included an empty output schema as placeholder since in this particular example the action has no output. But where an action does have an output, I'm not sure of the most appropriate way to link the output schema from the actionStatus schema. Is a JSON pointer appropriate here?
  • The schema member currently only seems to be allowed in an AdditionalResponse (added in 100f0de), not an ExpectedResponse. That would need changing.
  • Is it sufficiently obvious to consumers that the {id} in the Location header of the response to the invokeaction request corresponds to the {id} used in the href of other operations? I can't think of a way semantic annotations would help in this case.
  • Is it OK to use URL templates in the Location header?
  • This TD doesn't currently describe error conditions. Is there a way to specify the status code of an ExpectedResponse and an AdditionalResponse? OpenAPI does this by keying responses by status code, but I think the decision was not to do that for additionalResponses since it would be too protocol specific. I can't find vocabulary in the Protocol Binding Templates specification to describe an HTTP status code.

In addition to these notes:

  • There are now some fields in the actionStatus data schema which are required for some Forms, but not others. I assume the only way to describe this is by splitting it into three different schemas?
  • I've never been sure whether the contentType in a Form refers to the request or the response?
  • Can a URI variable be used in a const in a data schema?

Overall my impression is that it would be very difficult for a Consumer which didn't explicitly implement the Core Profile Protocol Binding to interpret this Thing Description, but this is the closest I can get to providing a declarative equivalent of the concrete protocol binding described in the specification. Note that my intention is that a Web Thing using the Core Profile would expose a much simpler Thing Description than this. This is just a canonical(ish) example of what it might look like once all the defaults defined in the Core Profile Protocol Binding have been applied, and of how the full protocol binding would have to be described for a Consumer which doesn't implement the Core Profile.

I think the important action item here is to decide whether to add the queryaction and cancelaction operation names to the Thing Description specification, and what their meta-interaction equivalents in top level forms might be called.

@benfrancis
Member Author

benfrancis commented Aug 4, 2021

Note that my intention is that a Web Thing using the Core Profile would expose a much simpler Thing Description than this

E.g.

{
    "@context": "https://www.w3.org/2019/wot/td/v1",
    "id": "urn:ex:thing",
    "actions": {
      "fade": {
        "input": {
            "type": "object",
            "properties": {
               "level": {
                  "type": "integer",
                  "minimum": 0,
                  "maximum": 100
                },
                "duration": {
                  "type": "integer",
                  "minimum": 0,
                  "unit": "milliseconds"
                }
            }
        },
        "output": {},
        "forms": [
            {
              "href": "/fade",
              "op": "invokeaction"
            },
            {
              "href": "/fade/{id}",
              "op": "queryaction"
            },
            {
              "href": "/fade/{id}",
              "op": "cancelaction"
            }
        ],
        "uriVariables": {
            "id": {
              "type": "string",
              "description": "identifier of action request"
            }
        }
      }
    }
}
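
A minimal sketch of how a Consumer might pick a form by operation name from the simplified Thing Description above and fill in the {id} URI variable. The helper names (find_form, expand_href) and the sample id are illustrative assumptions, not part of any WoT API, and only simple {name} variables are handled rather than full URI Templates.

```python
from urllib.parse import quote

# Trimmed copy of the "fade" action forms from the Thing Description above.
FADE_FORMS = [
    {"href": "/fade", "op": "invokeaction"},
    {"href": "/fade/{id}", "op": "queryaction"},
    {"href": "/fade/{id}", "op": "cancelaction"},
]


def find_form(forms, op):
    """Return the first form whose "op" matches (op may be a string or a list)."""
    for form in forms:
        ops = form["op"] if isinstance(form["op"], list) else [form["op"]]
        if op in ops:
            return form
    raise LookupError(f"no form for operation {op!r}")


def expand_href(href, variables):
    """Substitute simple {name} URI variables, percent-encoding each value."""
    for name, value in variables.items():
        href = href.replace("{" + name + "}", quote(str(value), safe=""))
    return href


# The id would normally come from the response to invokeaction (assumed here).
query_url = expand_href(
    find_form(FADE_FORMS, "queryaction")["href"], {"id": "19g3-631g-61gj"}
)
print(query_url)  # /fade/19g3-631g-61gj
```

Nothing in the Thing Description itself tells the Consumer where the id value comes from, which is the gap discussed in the rest of this thread.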

@benfrancis
Member Author

benfrancis commented Aug 4, 2021

@egekorkan wrote:

Reading a property and querying an action are semantically very close. One can say that invoking an action creates a property affordance that is simply temporary, thus having readaction make sense.

If an operation using an HTTP request like GET /actions/fade/19g3-631g-61gj was called readaction, then what would an operation like GET /actions/fade or GET /actions be called?

I think the key difference between properties and actions is that a property only has one value at any one time, whereas an action may have multiple running instances (in serial or in parallel). So whilst a property is likely to be bound to a single resource (hence the singular terms readproperty/writeproperty) an action may be bound to a collection of resources (i.e. an action queue).

I think we basically need to decide whether the term "action" in operation names refers to:

A) the collection, e.g.

  • invokeaction - POST /actions/fade/
  • cancelactioninstance - DELETE /actions/fade/19g3-631g-61gj
  • queryactioninstance - GET /actions/fade/19g3-631g-61gj
  • queryaction - GET /actions/fade
  • queryallactions - GET /actions
  • observeactioninstance - GET /actions/fade/19g3-631g-61gj Accept: text/event-stream
  • observeaction - GET /actions/fade Accept: text/event-stream
  • observeallactions - GET /actions Accept: text/event-stream

B) an individual instance of the interaction, e.g.

  • invokeaction - POST /actions/fade/
  • cancelaction - DELETE /actions/fade/19g3-631g-61gj
  • queryaction - GET /actions/fade/19g3-631g-61gj
  • queryactionlist - GET /actions/fade
  • queryallactionlists - GET /actions
  • observeaction - GET /actions/fade/19g3-631g-61gj Accept: text/event-stream
  • observeactionlist - GET /actions/fade Accept: text/event-stream
  • observeallactionlists - GET /actions Accept: text/event-stream

Which works best?

@egekorkan
Contributor

Not sure if I should comment here or at #1208, but I think that there are some problems for Consumer applications in cases where the href has dynamic ids. Please also have a look at https://github.com/w3c/wot-thing-description/tree/main/proposals/hypermedia-control-2#observations-1 .

An important thing to highlight here is that for many devices there would be no real need for dynamic ids if we do not want to queue multiple actions. If I am fading a lamp, rotating a robot, or sprinkling water on a farm, my Thing can reject subsequent invoke actions if one is already being processed. Dynamic hrefs are more difficult to implement in a Thing and in Consumers, so I would not want to promote their use in the TD specification. They should of course be possible to describe, and they are needed for the WebThings API as well. Ideally, we should use static hrefs in most examples and then have a separate section about how to manage dynamic hrefs in TDs.
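
A rough sketch of the "reject subsequent invocations" behaviour described above, with no dynamic ids involved. The class name, the status codes (201, 409, 404, 204) and the choice to forget a cancelled invocation are all assumptions made for illustration; none of them come from the TD specification.

```python
class SingleInstanceAction:
    """Hypothetical Thing-side handler: at most one invocation at a time."""

    ACTIVE = ("pending", "running")

    def __init__(self):
        self.status = None  # None means there is no invocation to report on
        self.params = None

    def invoke(self, params):
        if self.status in self.ACTIVE:
            # Assumed behaviour: refuse rather than queue a second request.
            return 409, {"error": "action already in progress"}
        self.params = params
        self.status = "running"
        return 201, {"status": self.status}

    def query(self):
        if self.status is None:
            return 404, {}
        return 200, {"status": self.status}

    def cancel(self):
        if self.status not in self.ACTIVE:
            return 404, {}
        self.status = None  # the cancelled invocation is simply forgotten
        return 204, {}

    def finish(self, ok=True):
        """Called by the device logic when the physical action ends."""
        self.status = "completed" if ok else "failed"


fade = SingleInstanceAction()
first = fade.invoke({"level": 50, "duration": 2000})
second = fade.invoke({"level": 10, "duration": 500})
print(first[0], second[0])  # 201 409
```

Because there is only ever one invocation, query and cancel can live at fixed URLs and the Consumer never has to track an id.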

@sebastiankb
Contributor

sebastiankb commented Aug 5, 2021

This is a reply to the example above.

  • schemaDefinitions is a term which can be used at the top level only
  • schemaDefinitions itself should carry a JSON Schema definition

Here is the version which would cover this:

{
    "@context": "https://www.w3.org/2019/wot/td/v1",
    "id": "urn:ex:thing",
    "actions": {
        "fade": {
            "input": {
                "type": "object",
                "properties": {
                    "level": {
                        "type": "integer",
                        "minimum": 0,
                        "maximum": 100
                    },
                    "duration": {
                        "type": "integer",
                        "minimum": 0,
                        "unit": "milliseconds"
                    }
                }
            },
            "output": {},
            "forms": [
                {
                    "href": "/fade",
                    "op": "invokeaction",
                    "htv:methodName": "POST",
                    "contentType": "application/json",
                    "response": {
                        "htv:headers": [
                            {
                                "htv:fieldName": "Location",
                                "htv:fieldValue": "/fade/{id}"
                            }
                        ]
                    }
                },
                {
                    "href": "/fade/{id}",
                    "op": "queryaction",
                    "htv:methodName": "GET",
                    "response": {
                        "contentType": "application/json",
                        "schema": "actionStatus"
                    }
                },
                {
                    "href": "/fade/{id}",
                    "op": "cancelaction",
                    "htv:methodName": "DELETE",
                    "contentType": "application/json"
                }
            ],
            "uriVariables": {
                "id": {
                    "type": "string",
                    "description": "identifier of action request"
                }
            }
        }
    },
    "schemaDefinitions": {
        "actionStatus": {
            "type": "object",
            "properties": {
                "status": {
                    "type": "string",
                    "enum": [
                        "pending",
                        "running",
                        "completed",
                        "failed"
                    ]
                },
                "error": {
                    "type": "object"
                }
            },
            "required": [
                "status"
            ]
        }
    }
}

I think this looks OK for the dynamic URL approach with {id}. The challenging part is how a Consumer handles this and understands that there is a relation between "htv:fieldValue": "/fade/{id}" and "href": "/fade/{id}", "op": "queryaction".
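
One way a Consumer could relate the two templates, sketched under the assumption that the same variable name in the Location header template and in a form href always denotes the same value (the TD does not currently state this). The function names and the sample Location value are hypothetical.

```python
import re


def template_to_regex(template):
    """Compile a simple template such as "/fade/{id}" into a regex with named groups."""
    parts = re.split(r"\{(\w+)\}", template)
    # Even indexes hold literal text, odd indexes hold variable names.
    pattern = "".join(
        re.escape(part) if index % 2 == 0 else f"(?P<{part}>[^/]+)"
        for index, part in enumerate(parts)
    )
    return re.compile("^" + pattern + "$")


def extract_variables(template, actual):
    """Return the variable bindings if `actual` matches `template`, else None."""
    match = template_to_regex(template).match(actual)
    return match.groupdict() if match else None


def fill(template, variables):
    """Insert previously extracted variable values into another template."""
    return re.sub(r"\{(\w+)\}", lambda m: variables[m.group(1)], template)


# Location header returned by a (hypothetical) invokeaction response.
location = "/fade/19g3-631g-61gj"
bindings = extract_variables("/fade/{id}", location)  # template from htv:fieldValue
status_url = fill("/fade/{id}", bindings)             # href of the queryaction form
print(bindings, status_url)
```

This only works if Consumers agree on the matching rule out of band, which is exactly the relationship the TD cannot express today.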

As @egekorkan also noted above, the question is whether this dynamic approach will always be associated with a physically based action such as "move robot arm", where there will be only one physical instance shared by different potential consumers. Perhaps there should also be a static approach that simplifies such handling and makes it clear that the consumer can directly query a single resource to check the action status of a single physical instance. Here is an example based on the hypermedia control example:

{
    "@context": "https://www.w3.org/2019/wot/td/v1",
    "id": "urn:ex:thing",
    "actions": {
        "fade": {
            "input": {
                "type": "object",
                "properties": {
                    "level": {
                        "type": "integer",
                        "minimum": 0,
                        "maximum": 100
                    },
                    "duration": {
                        "type": "integer",
                        "minimum": 0,
                        "unit": "milliseconds"
                    }
                }
            },
            "output": {},
            "forms": [
                {
                    "href": "/fade",
                    "op": "invokeaction",
                    "htv:methodName": "POST",
                    "contentType": "application/json"
                },
                {
                    "href": "/fade/status",
                    "op": "queryaction",
                    "htv:methodName": "GET",
                    "response": {
                        "contentType": "application/json",
                        "schema": "actionStatus"
                    }
                },
                {
                    "href": "/fade/cancel",
                    "op": "cancelaction",
                    "htv:methodName": "DELETE",
                    "contentType": "application/json"
                }
            ]
        }
    },
    "schemaDefinitions": {
        "actionStatus": {
            "type": "object",
            "properties": {
                "status": {
                    "type": "string",
                    "enum": [
                        "pending",
                        "running",
                        "completed",
                        "failed"
                    ]
                },
                "error": {
                    "type": "object"
                }
            },
            "required": [
                "status"
            ]
        }
    }
}
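
With static hrefs like these, a Consumer can poll a single known resource. A minimal sketch, assuming the status resource returns the actionStatus object defined above; the fetch_status callable stands in for an HTTP GET on /fade/status, and the interval and timeout values are arbitrary.

```python
import time


def wait_for_action(fetch_status, interval=0.5, timeout=10.0):
    """Poll until the action reports a terminal status or the timeout expires.

    fetch_status is any callable returning a dict shaped like actionStatus.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()["status"]
        if status in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("action did not finish in time")
        time.sleep(interval)


# Stand-in for repeated GET /fade/status responses (made-up sequence).
responses = iter(
    [{"status": "pending"}, {"status": "running"}, {"status": "completed"}]
)
final = wait_for_action(lambda: next(responses), interval=0)
print(final)  # completed
```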

@benfrancis
Member Author

@sebastiankb Thank you for the clarifications. Your modified actionStatus schema doesn't include the href member so it doesn't deal with the issue of how the dynamic URL in that member would work, but I get the general idea.

@egekorkan Note that currently #1208 doesn't say anything about dynamic resources, it just introduces the terms queryaction and cancelaction as operation names. As @sebastiankb's second example demonstrates, a simple Web Thing which only allows one instance of an action to be executed at a time could use those operations without the use of dynamic resources.

Perhaps we can land #1208 with that simple use case in mind, before dealing with more complex cases? I just wanted to highlight that there could be naming issues with meta-interactions further down the line.


With regards to more complex use cases:

  1. As @egekorkan points out, the Web Thing REST API does use dynamic resources to represent a queue of actions, as do other existing implementations
  2. The proposed protocol binding for actions in the Core Profile (Define protocol binding for actions - closes #81 wot-profile#89) also optionally uses dynamic resources to represent a queue of actions

I agree with the observation in the README @egekorkan linked to that "everything gets quite complicated" when dealing with dynamic resources, and with @sebastiankb's point that it's not currently obvious how a Consumer is meant to infer relationships between dynamic URLs used across multiple Forms in a Thing Description.

In actual fact the original reason for me filing this issue in 2018 was to demonstrate the limitations of the Forms approach in Thing Descriptions, and argue for an alternative approach. I think it says a lot that three years down the line this still isn't possible.

I can see two paths forward for dealing with these more complex use cases:

  1. Continue to try to define a vocabulary for describing dynamic resources, and the relationships between them, in Thing Descriptions. Perhaps using semantic annotations and new operation types to deal with collections of resources, like queryactionlist.
  2. Accept that there are limitations to the complexity of interactions that can automatically be inferred by Consumers from Forms in a Thing Description. Have Web Things provide enough metadata in a Thing Description for a Consumer to carry out basic operations (like invokeaction), and rely on out-of-band information (e.g. in the prose of a profile specification) to define more complex interactions like action queues. That could mean for example that any WoT Consumer could invoke an action (and perhaps query and cancel a single instance), but only Consumers which implement the profile specification are able to deal with dynamic resources in an action queue.

It could be possible for us to try both:

  1. First define an HTTP API for action queues in the Core Profile protocol binding, which works by applying a set of defaults/assumptions to a single endpoint provided for an ActionAffordance in a Thing Description
  2. Then try to define vocabulary to describe that (and other) APIs declaratively in a Thing Description, such that any WoT Consumer could theoretically use the full set of operations without implementing the Core Profile
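
To make the two steps concrete, here is a sketch of what "applying a set of defaults to a single endpoint" could look like on the Consumer side. The defaults chosen here (POST to invoke, GET and DELETE on an {id} sub-resource) mirror the examples earlier in this thread but are assumptions, not text from the Core Profile.

```python
def expand_action_forms(action_href):
    """Expand one action endpoint into the forms a profile-aware Consumer would assume."""
    instance_href = action_href.rstrip("/") + "/{id}"
    return [
        {
            "href": action_href,
            "op": "invokeaction",
            "htv:methodName": "POST",
            "contentType": "application/json",
        },
        {"href": instance_href, "op": "queryaction", "htv:methodName": "GET"},
        {"href": instance_href, "op": "cancelaction", "htv:methodName": "DELETE"},
    ]


forms = expand_action_forms("/actions/fade")
for form in forms:
    print(form["op"], form["htv:methodName"], form["href"])
```

Step 2 would then amount to being able to write the output of this function directly in a Thing Description in a way that any Consumer can interpret.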

An alternative is that we give up on action queues and have the Core Profile only support a single instance of an action at a time, but that would mean there's still no way to describe the set of features already provided by WebThings and we'd have to drop features in order to be W3C compliant.

@relu91
Member

relu91 commented Aug 9, 2021

TL;DR

I also agree that we need to put down something first and maybe revise and refine (or even remove it) down the road. The simple use case seems well described with the current proposal and I would support moving forward.


Having said that, to have a complete solution we must also solve the dynamic problem. This would not only be useful for modeling action queues; I think it will also help us scratch some itches for the TDD Thing Model. I don't know if I am the only one, but I see a strong similarity between TD collections and action queues. In fact, we had a hard time figuring out how to model the CRUD operations on the TDD Thing Description collection. Ben knows that well. So how can we deal with collections?

@benfrancis proposed two valid options above, but I want to add another one to the discussion. It is loosely based on @vcharpenay's approach for hypermedia control and on some ideas that were already popping up during our calls. To put it simply:

  1. Reuse the current vocabulary (plus these new ops for actions), but describe the interactions with the collection in a dynamically created TD. The dynamically created TD will be described as a TM, so we don't lose any static descriptive power.

Of course, this point deserves a better explanation (including a list of pros and cons) and, if accepted, would require some modifications to @benfrancis's core profile protocol. However, I won't clutter this issue with it because I really want to see some progress here.

My proposal is to review and merge #1208 with just the simplest use case in mind.

@egekorkan
Contributor

Regarding dynamic TDs, this introduces a whole new can of worms for me. Then, other protocols can start describing ways to change TDs to implement some functionality. This is somewhat my personal opinion but I think that TDs should be always static in order to make it possible to proxy/bridge devices, generate user interfaces etc. Having dynamic TDs also introduces the problem of notifying the Consumer when that happens, more problematic if the Thing cannot host its own TD.

@relu91
Member

relu91 commented Aug 10, 2021

Regarding dynamic TDs, this introduces a whole new can of worms for me. Then, other protocols can start describing ways to change TDs to implement some functionality. This is somewhat my personal opinion but I think that TDs should be always static in order to make it possible to proxy/bridge devices, generate user interfaces etc. Having dynamic TDs also introduces the problem of notifying the Consumer when that happens, more problematic if the Thing cannot host its own TD.

For the record, the option that I am proposing diverges from @vcharpenay's on this point. I agree that dynamically changing TDs make everything even more tangled. In the short proposal with dynamically created TDs, I stress the dynamic nature of the creation: the TD stays static once retrieved. It is basically a discovery method, i.e. getting a TD from an affordance invocation.

@benfrancis
Member Author

I think we basically need to decide whether the term "action" in operation names refers to:

A) the collection, e.g.

  • invokeaction - POST /actions/fade/
  • cancelactioninstance - DELETE /actions/fade/19g3-631g-61gj
  • queryactioninstance - GET /actions/fade/19g3-631g-61gj
  • queryaction - GET /actions/fade
  • queryallactions - GET /actions
  • observeactioninstance - GET /actions/fade/19g3-631g-61gj Accept: text/event-stream
  • observeaction - GET /actions/fade Accept: text/event-stream
  • observeallactions - GET /actions Accept: text/event-stream

B) an individual instance of the interaction, e.g.

  • invokeaction - POST /actions/fade/
  • cancelaction - DELETE /actions/fade/19g3-631g-61gj
  • queryaction - GET /actions/fade/19g3-631g-61gj
  • queryactionlist - GET /actions/fade
  • queryallactionlists - GET /actions
  • observeaction - GET /actions/fade/19g3-631g-61gj Accept: text/event-stream
  • observeactionlist - GET /actions/fade Accept: text/event-stream
  • observeallactionlists - GET /actions Accept: text/event-stream

Which works best?

Reflecting on this some more, I think that of these two options, A makes the most sense because it's consistent with properties and events. Subscribing to an event subscribes you to all instances of an event, in the same way that querying an action queries all instances of an action (subscribeevent, queryaction), and the names of the meta-interactions are consistent too (readallproperties, subscribeallevents, queryallactions). There is no equivalent of querying an instance of an action for properties and events because they don't have dynamically created resources.

I therefore propose the following operation names:

  • invokeaction - e.g. POST /actions/fade/
  • cancelactioninstance - e.g. DELETE /actions/fade/19g3-631g-61gj
  • queryactioninstance - e.g. GET /actions/fade/19g3-631g-61gj
  • queryaction - e.g. GET /actions/fade
  • queryallactions - e.g. GET /actions

To demonstrate the difference between queryactioninstance, queryaction and queryallactions here are some example payloads that could be used in the Core Profile:

queryactioninstance

{
  "status": "pending"
}

queryaction

[
  {
    "status": "completed",
    "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655"
  },
  {
    "status": "pending",
    "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-431553"
  }
]

queryallactions

{
  "fade": [
    {
      "status": "completed",
      "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655"
    },
    {
      "status": "pending",
      "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-431553"
    },
    {
      "status": "pending",
      "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-51ff16"
    }
  ],
  "reboot": [
    {
      "status": "pending",
      "href": "/things/lamp/actions/reboot/123e4567-e89b-12d3-a456-f3dea"
    }
  ]
}
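
The relationship between the three payloads can be sketched as follows: queryallactions is a map, keyed by action name, of what queryaction would return for each action. The in-memory ACTION_QUEUES store and the function names are assumptions for illustration; the status values and hrefs are taken from the example payloads above.

```python
# Hypothetical in-memory store of action instances, keyed by action name.
ACTION_QUEUES = {
    "fade": [
        {"status": "completed",
         "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655"},
        {"status": "pending",
         "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-431553"},
    ],
    "reboot": [
        {"status": "pending",
         "href": "/things/lamp/actions/reboot/123e4567-e89b-12d3-a456-f3dea"},
    ],
}


def query_action(queues, name):
    """queryaction: the list of instances of one action."""
    return list(queues[name])


def query_all_actions(queues):
    """queryallactions: queryaction responses for every action, keyed by name."""
    return {name: query_action(queues, name) for name in queues}


def pending_hrefs(all_actions):
    """Collect the hrefs of every instance that is still pending."""
    return [
        instance["href"]
        for instances in all_actions.values()
        for instance in instances
        if instance["status"] == "pending"
    ]


everything = query_all_actions(ACTION_QUEUES)
print(len(pending_hrefs(everything)))  # 2
```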

@sebastiankb
Copy link
Contributor

from today's TD call:

  • we decided on the following naming convention for new operation values: queryactioninstance, queryaction, queryallactions, and cancelactioninstance

@benfrancis
Member Author

benfrancis commented Sep 9, 2021

Below is an example Thing Description of a Thing with a single Action called "fade", which demonstrates usage of each of these operations with an API which expands on the proposed action protocol binding for the Core Profile. See notes below.

{
  "@context": "https://www.w3.org/2019/wot/td/v1",
  "id": "urn:ex:thing",
  "title": "Web Lamp",
  "securityDefinitions": {
    "basic_sc": {
      "scheme": "basic",
      "in": "header"
    }
  },
  "actions": {
    "fade": {
      "input": {
        "type": "object",
        "properties": {
          "level": {
            "type": "integer",
            "minimum": 0,
            "maximum": 100
          },
          "duration": {
            "type": "integer",
            "minimum": 0,
            "unit": "milliseconds"
          }
        }
      },
      "output": {},
      "forms": [
        {
          "href": "/actions/fade",
          "op": "invokeaction",
          "htv:methodName": "POST",
          "contentType": "application/json",
          "response": {
            "htv:headers": [
              {
                "htv:fieldName": "Location",
                "htv:fieldValue": "/actions/fade/{id}"
              }
            ],
            "contentType": "application/json",
            "schema": "actionStatus"
          }
        },
        {
          "href": "/actions/fade",
          "op": "queryaction",
          "htv:methodName": "GET",
          "response": {
            "contentType": "application/json",
            "schema": "actionStatusList"
          }
        },
        {
          "href": "/actions/fade/{id}",
          "op": "queryactioninstance",
          "htv:methodName": "GET",
          "response": {
            "contentType": "application/json",
            "schema": "actionStatus"
          }
        },
        {
          "href": "/actions/fade/{id}",
          "op": "cancelactioninstance",
          "htv:methodName": "DELETE",
          "contentType": "application/json"
        }
      ],
      "uriVariables": {
        "id": {
          "type": "string",
          "description": "identifier of action request"
        }
      }
    }
  },
  "forms": [
    {
      "href": "/actions",
      "op": "queryallactions",
      "htv:methodName": "GET"
    }
  ],
  "schemaDefinitions": {
    "actionStatus": {
      "type": "object",
      "properties": {
        "status": {
          "type": "string",
          "enum": [
            "pending",
            "running",
            "completed",
            "failed"
          ]
        },
        "output": {},
        "error": {
          "type": "object"
        },
        "href": {
          "type": "string",
          "format": "uri",
          "const": "/actions/fade/{id}"
        }
      },
      "required": [
        "status"
      ]
    },
    "actionStatusList": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "status": {
            "type": "string",
            "enum": [
              "pending",
              "running",
              "completed",
              "failed"
            ]
          },
          "output": {},
          "error": {
            "type": "object"
          },
          "href": {
            "type": "string",
            "format": "uri",
            "const": "/actions/fade/{id}"
          }
        },
        "required": [
          "status"
        ]
      }
    }
  }
}

Open questions:

  1. I'm conscious that with this declarative approach, even for a very simple Thing like this, the Thing Description becomes very verbose. @relu91's action model proposal in Yet another Action Model #1223 could alleviate this slightly by moving some of the API description out into a separate Action Description resource, but it wouldn't solve the problem of how to describe action queues (see Yet another Action Model #1223 (comment)). I still have some open questions though:
    1. Note that I had to create two separate data schema definitions 1) actionStatus for the queryactioninstance operation and 2) actionStatusList for the queryaction operation. The latter is just multiple instances of the former wrapped in an array but is there a more concise way to express this?
    2. I haven't defined a separate data schema for the response to the queryallactions operation because I'm assuming that the specification could define this to be a map of the queryaction responses for all actions, keyed by action name. This is what we do for properties. However, it's a bit more complicated in this case because it's expanding upon the data schema from a form response rather than the top level data schema of the interaction affordance. How could this work?
    3. There's no type for the output member of the actionStatus schema because this action has no output, but if it did have an output I assume the data schema would have to be duplicated from the output member of the action affordance, since there's no way to reference one data schema from another?
    4. The invokeaction operation includes an {id} to identify an action instance in both an HTTP header and the body of the response. I'm still not sure how a Consumer would know that the {id} variable in the header and href member of the actionStatus schema correspond to the {id} variable in the href of the queryactioninstance and cancelactioninstance forms?
    5. Is defining the format of the href member of the actionStatus schema as "const": "/actions/fade/{id}" a valid use of const?
  2. I'm still not entirely happy with the naming of the operations, but so far nobody has come up with a better sounding alternative.
    1. In renaming queryaction and cancelaction to queryactioninstance and cancelactioninstance, it has made me think again about a simpler use case where an action is queryable and cancelable, but only one instance can be requested at a time and therefore there's no need for the concept of an "action instance". In catering for the more complex case of action queues, are we over-complicating things for that simpler use case?
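
On question 1.1, one workaround is to generate the list schema when the Thing Description is authored rather than writing it twice by hand. This is only a tooling sketch: it does not make the served TD any shorter, and the list_of helper is a hypothetical name.

```python
import copy
import json

# The actionStatus schema from the Thing Description above.
ACTION_STATUS = {
    "type": "object",
    "properties": {
        "status": {
            "type": "string",
            "enum": ["pending", "running", "completed", "failed"],
        },
        "output": {},
        "error": {"type": "object"},
        "href": {"type": "string", "format": "uri", "const": "/actions/fade/{id}"},
    },
    "required": ["status"],
}


def list_of(item_schema):
    """Wrap a data schema in an array schema, copying it to avoid shared state."""
    return {"type": "array", "items": copy.deepcopy(item_schema)}


schema_definitions = {
    "actionStatus": ACTION_STATUS,
    "actionStatusList": list_of(ACTION_STATUS),
}
print(json.dumps(schema_definitions["actionStatusList"], indent=2))
```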

I can go ahead and modify #1208 to rename queryaction to queryactioninstance and cancelaction to cancelactioninstance, but I think there are still open questions to answer, not least about queryallactions, which we should discuss in #1200.

@benfrancis
Member Author

benfrancis commented Sep 14, 2021

I have submitted an alternative PR (#1226) which uses the operation names queryactioninstance and cancelactioninstance instead of queryaction and cancelaction. It would be great to discuss my remaining open questions (above) in the next TD meeting.

If we can't come up with answers to those open questions then I think we may have to consider splitting this issue into two, to be solved separately:

  1. How to query and cancel an ongoing action
  2. How to deal with multiple instances of the same action (executed in serial or in parallel)

We could start by defining queryaction and cancelaction operations which solve the former issue, and re-consider whether it's really feasible to solve the latter issue in a static Thing Description.

I think we're bumping up against two limitations of Thing Descriptions which we've experienced before:

  1. They aren't good at describing collections of resources (which we also experienced with directories, see Refactoring TDD Thing Description  wot-discovery#133)
  2. They aren't good for describing historical data (which we also experienced with events, see HTTP Binding for providing historical events. #892)

If we conclude that action queues are in fact too complex to describe in a static Thing Description, we could either:

  1. Continue to explore alternative solutions like modelling an action queue as a single resource rather than multiple resources where possible, or by using hypermedia approaches like the one proposed in Yet another Action Model #1223
  2. Decide that an action queue is too complex to describe in a TD at all and must either be described out of band (e.g. in a profile) or that a Consumer should be responsible for managing the queue instead of a Web Thing (and therefore the Web Thing would be limited to handling one instance of an action at a time)

@sebastiankb
Contributor

@benfrancis many thanks for your examples and contributions to this topic.

I am not very happy about the very verbose TD either. I'm thinking out loud about whether it makes sense to define a subprotocol which is assumed as the default in TD definitions. Maybe this can also be reused for the WoT Profile.

The invokeaction operation includes an {id} to identify an action instance in both an HTTP header and the body of the response. I'm still not sure how a Consumer would know that the {id} variable in the header and href member of the actionStatus schema correspond to the {id} variable in the href of the queryactioninstance and cancelactioninstance forms?

This problem may be solved if we have a behavioral assertion in the proposed subprotocol definition.

Is defining the format of the href member of the actionStatus schema as "const": "/actions/fade/{id}" a valid use of const?

Good question. Maybe @handrews can help us here.
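
For context while waiting for an answer: in JSON Schema, const requires the instance to be equal to the given value, so a validator would compare against the literal string including the braces. The sketch below imitates that comparison in plain Python and contrasts it with a regular expression derived from the template, which could be expressed with the pattern keyword instead. The helper names and the sample href are assumptions.

```python
import re

TEMPLATE = "/actions/fade/{id}"


def matches_const(instance, const_value):
    """Imitate JSON Schema "const": the instance must equal the value exactly."""
    return instance == const_value


def template_to_pattern(template):
    """Build an anchored regex in which each {variable} matches one path segment."""
    literal_parts = re.split(r"\{\w+\}", template)
    return "^" + "[^/]+".join(re.escape(part) for part in literal_parts) + "$"


href = "/actions/fade/123e4567-e89b-12d3-a456-426655"
const_ok = matches_const(href, TEMPLATE)
pattern_ok = re.search(template_to_pattern(TEMPLATE), href) is not None
print(const_ok, pattern_ok)  # False True
```

A pattern would let real hrefs validate, but it still would not tell a Consumer that the matched segment is the same {id} used in the form hrefs.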

In renaming queryaction and cancelaction to queryactioninstance and cancelactioninstance, it has made me think again about a simpler use case where an action is queryable and cancelable, but only one instance can be requested at a time and therefore there's no need for the concept of an "action instance". In catering for the more complex case of action queues, are we over-complicating things for that simpler use case?

Let's discuss this again in today's TD call. Another idea would be to replace "instance" with "byid", e.g. queryactionbyid.
