Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define protocol binding for actions - closes #81 #89

Merged
merged 7 commits into from
Sep 9, 2021

Conversation

benfrancis
Copy link
Member

@benfrancis benfrancis commented Jul 19, 2021

A first draft of the protocol binding for actions as discussed in #81 including:

  • invokeaction
    • Synchronous response
    • Asynchronous response
  • queryaction
  • cancelaction

Notes:

  • Currently does not define an updateaction operation, and therefore sidesteps the issue of the input data for now
  • Note the difference in error handling between synchronous and asynchronous responses to the invokeaction operation
    • In the synchronous response an error produces an error status code, as the response relates directly to the invokeaction request, with further error information included in the body
    • In the asynchronous response a follow-up queryaction operation produces a success response (to the query operation), with error information about the original invokeaction operation included in the body
  • The URL of an ActionStatus resource is currently provided only in the Location header. We can continue to discuss whether it should be provided in the body instead of or as well as the header.
  • There's currently still a status member in a synchronous response because the action request may still be pending or running when the HTTP response arrives, it's not as simple as just success or failure
  • There's currently no example of output data because the example I started with doesn't provide an obvious output, but we could change this if necessary

Preview | Diff

@benfrancis benfrancis mentioned this pull request Jul 19, 2021
@mlagally
Copy link
Contributor

This is an elegant contribution and a big step forward, thanks @benfrancis.

A couple of questions / comments:
Section 5.2.2.1 - why do you require a canonical TD?

Error member: please consider if this should be mandatory. If it is optional, people won't use it ;-)

Error response codes: Should we be more explicit and list all permitted codes or do we just allow all status codes (including the teapot?)

We could add JSON-Schema snippets from @relu91 in another iteration to the spec.

@mlagally mlagally mentioned this pull request Jul 20, 2021
@benfrancis
Copy link
Member Author

benfrancis commented Jul 20, 2021

@mlagally wrote:

This is an elegant contribution and a big step forward

Thank you, I'm pleased we're moving closer to a solution to this long-standing issue.

Section 5.2.2.1 - why do you require a canonical TD?

Every time the protocol binding describes locating a Form based on the value of its op member I use the term "canonical TD". It's basically shorthand for saying that a Consumer must first apply defaults to the TD and parse the resulting canonical TD in order to locate the Form. Without carrying out this step first a Consumer may fail to find a matching Form because the op member may be omitted by a Web Thing which relies on default values.

I'm open to suggestions on how to make this clearer, but as commented elsewhere I think what we should probably do is provide an example Thing Description at the start of the Protocol Binding chapter, and then provide an informative canonical TD with all of the protocol details from the profile applied, a bit like the TD provided for the Directory Service API in the WoT Discovery specification but in canonical form. I have just filed #90 for this.

Error member: please consider if this should be mandatory. If it is optional, people won't use it ;-)

Yes that may be true, I'd be interested to hear any other opinions on this? There may be cases where a Web Thing encounters an exception with an unknown error and therefore can't provide a useful error message, but I suppose the Web Thing could define an unknown error type for that purpose.

Error response codes: Should we be more explicit and list all permitted codes or do we just allow all status codes

I think it's a bit hard to imagine in advance every type of error that a Web Thing may encounter and therefore which error codes are appropriate. My instinct is to leave it open and allow Consumer implementations to decide how best to respond to different error codes. Most HTTP implementations have a shorthand method for determining whether an HTTP request was successful or not (e.g. Response.ok in JavaScript), so encountering unknown errors is not usually a huge problem. Even if a Consumer doesn't have a code path for a specific error code, it knows that the response was an error and can provide a generic error report.

However, if people think we can and should define a finite set of error codes up-front then I'd be open to specifying that.

(including the teapot?)

Well yes of course, in case someone tries to invoke a brewCoffee action on a web-connected teapot. I would argue that never before has there been a more appropriate use case for RFC 2324.

We could add JSON-Schema snippets from @relu91 in another iteration to the spec.

I agree, that could help provide a concise summary of the prose of the specification as in the WoT Discovery specification. We should perhaps discuss this further in #90.

@benfrancis benfrancis marked this pull request as ready for review July 21, 2021 11:55
@benfrancis benfrancis changed the title Define protocol binding for actions Define protocol binding for actions - closes #81 Jul 21, 2021
@benfrancis
Copy link
Member Author

Note that I have added a proposal for how the asynchronous actions protocol binding proposed in this PR could be described in a Thing Description in w3c/wot-thing-description#302 (comment)

@egekorkan
Copy link
Contributor

@benfrancis thank you for the contribution! Regarding

The URL of an ActionStatus resource is currently provided only in the Location header. We can continue to discuss whether it should be provided in the body instead of or as well as the header.

I think it would be better to put it in the body to allow the same mechanism to be used by other protocols. Putting in the body would even allow an MQTT implementation of this action mechanism

@benfrancis
Copy link
Member Author

benfrancis commented Jul 26, 2021

@egekorkan wrote:

I think it would be better to put it in the body to allow the same mechanism to be used by other protocols.

I'm not opposed to putting it in the body (actually I did that in an earlier draft), but only if there's a good reason. Being protocol agnostic is not a good reason in my opinion.

The point of a protocol binding is to bind a set of operations to a particular protocol, in whatever way makes sense for that protocol. A protocol binding for MQTT would likely be completely different since it uses a pub/sub approach rather than request/response. We shouldn't unnecessarily complicate the Core Profile by trying to guess what other protocols might want to do in the future. The HTTP protocol binding should do whatever makes sense in HTTP and the HTTP specification says that a 201 Created response identifies the created resource in a Location header.

However, it also says that "the 201 response payload typically describes and links to the resource(s) created", so I think it would be reasonable to include a URL in the body in addition. The downside of doing that is that it means defining another JSON data schema in addition to ActionStatus, which describes a link to an ActionStatus resource. Since that resource (let's call it ActionStatusLink) is likely to just look like { "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655"} it doesn't actually add any information over what's already provided in the Location header.

I'd therefore like to understand whether there's a reason that Consumers might not be able to access the header and therefore need the data to be duplicated in the body as well. If a URL is included in both the header and the body then it would be necessary to specify which is mandatory and which is optional, e.g. both are mandatory for a Producer, but a Consumer may use one or the other.

@mlagally
Copy link
Contributor

@benfrancis
On the actionStatus URL I prefer the approach that was suggested by @egekorkan, i.e. to put it into the payload.

One of the benefits is that you can simply store the JSON object in the consumer and have a handle to query the status without further protocol processing. I would even go so far to not include it in the header. This makes the protocol binding implementation easier to implement and is logically consistent.

I don't think we need a second JSON data schema for this link.
The actionStatus can just additionally contain this url, so all you have to extend is to add this URL to the response payload.
It is the same response payload both for the invocation and the status query.
It is common practice to have "self" references in response payloads.

@benfrancis
Copy link
Member Author

@mlagally wrote:

One of the benefits is that you can simply store the JSON object in the consumer and have a handle to query the status without further protocol processing. I would even go so far to not include it in the header. This makes the protocol binding implementation easier to implement

I am not convinced by this argument. In JavaScript for example...

In the header:
const actionStatus = response.headers.get('Location');

In the body:
response.json().then((data) => { const actionStatus = data.href; });

I don't think we need a second JSON data schema for this link.
The actionStatus can just additionally contain this url, so all you have to extend is to add this URL to the response payload.
It is the same response payload both for the invocation and the status query.
It is common practice to have "self" references in response payloads.

Yes that could work if the status member of ActionStatus was made optional rather than mandatory. We would have to define that if no status member is present then the Consumer should follow the link in the href member to get the status, but if a status member is present then the Consumer should interpret the href member as a self link. And that for a response to a queryaction operation the status member is mandatory. This also still involves defining how to serialise the link vs. using something that already exists.

As I said I am not opposed to putting the URL in the body, but I still haven't seen any compelling arguments for not following the HTTP specification and putting it in the Location header.

@mmccool
Copy link
Contributor

mmccool commented Jul 29, 2021

So as I mentioned in the last meeting I have been trying to look up best practices here. I cited the following resources:

These are related, in that the book is an expanded form of the AIP patterns. I was specifically looking at the "long-running operations" pattern:

Unfortunately I am slightly disappointed in this pattern, even in the book, because (for example) it only considers long-polling (e.g. keeping the connection open and only responding when the waited-for status change occurs) for signaling completion, and not other forms of completion notification, such as SSE. However, this is ok given our current KISS focus. However it does provide useful justifications for various other things under discussion, such as a need for query (if a consumer crashes, it needs a way to recover the ids of any pending actions...) and when doing a query (called a List operation generally by Google), only providing responses that a user is authorized to see (in general, it is ok for query actions to do this, and in fact they MUST to preserve privacy). There are also a few things we have not considered, such as what to do about completed/expired actions (when does the action resource get deleted? What happens if you try to access a deleted action resource?). I found the book's description to be more useful than the AIP one, since the latter is more focussed specifically on Google's APIs. Elsewhere the book also argues that an update action is useful if there is a possibility the object being updated can be expanded in the future: an update operation is more robust to older devices not knowing about new fields being added to a structure; they only need to update the parts they know about.

Anyway, the above is not a specific review of this proposal, just some notes on a useful resource.

Copy link
Contributor

@mmccool mmccool left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am generally happy with the proposal and think it provides a good baseline for further discussion, and so should be merged, modulo typo fixes. The comments I have provided (support for long-polling, error response for unsupported parallel operation) can be turned into issues and we can discuss each separately.

index.html Outdated
</p>
<ul>
<li>An error response status code (e.g. <code>400</code> for an
invalid input or <code>500</code> for a failed actuation)</li>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So one possible error here is caused by devices that cannot support parallel actions of the same type, when a action of that type is currently in progress. This situation perhaps should have its own error code, so that the consumer knows it can try again later (that the error is temporary). This situation should be quite common in IoT devices, unlike web services. It might also be useful to have a flag on actions (e.g. "parallel" : true) but that is something we should propose at the TD level.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that all operations can return an error response from those recommended in section 5.2.4 Error Responses. If you think there's a particular error code that should be recommended here, could you please file an issue for that?

index.html Outdated Show resolved Hide resolved
</pre>
</section>

<section class="ednote">Another operation under consideration is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we may want to consider also "queryallactions" to list all actions (that the consumer has authorization to see). This is what I meant by the need to support query/List in my earlier comments. It's useful when a consumer loses the ActionResource link somehow (e.g. reboot, crash, etc).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. See #302 and #1200 for discussions on naming.

</li>
</ul>
<pre class="example">
GET /things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So my understanding of the current proposal is that it supports only polling to determine if an action is completed. This is OK, and a good place to start. We might want to consider a slight extension to also support long-polling, i.e. the ability to "observe" a dynamic ActionStatus resource. Then we just have to figure out how to specify whether an immediate response or a long-polling response is requested when doing a GET (could be a query parameter on the URL, i.e. ?observe) that would indicate a response is requested only when the status changes; to avoid race conditions, perhaps the request could also provide the current known status, i.e. ?observe,status=running and then the system would only (generally) respond once the status changes from the provided status. It might also need to time out, in which case it might respond even if there is no change to the status, and the consumer would have to reinvoke it in that case. Anyway... even if we don't add this now, I can see how it would be done as a relatively simple extension of the current proposal, so I think accepting the current proposal provides a good baseline, even if it only supports regular polling for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So my understanding of the current proposal is that it supports only polling to determine if an action is completed. This is OK, and a good place to start. We might want to consider a slight extension to also support long-polling

That's correct, to keep things simple. Another option we could consider is to use Server-Sent Events to be consistent with the proposed Events protocol binding in #92.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we just have to figure out how to specify whether an immediate response or a long-polling response is requested when doing a GET (could be a query parameter on the URL, i.e. ?observe)

For Server-Sent events this could be done with HTTP content negotiation, by requesting either a content type of application/json or text/event-stream in the Accept header.

@benfrancis
Copy link
Member Author

Action points from WoT Architecture meeting:

  1. Always respond to invoke action operations with an ActionStatus object, add a link to that object in the case of ongoing async actions, make the link mandatory in both the header and the body

Follow-up issues for later:

  1. Consider expanding examples to include actions with an output
  2. Recommend an error code for when an action can not be carried out because one is already running (and the Thing doesn't support queues)
  3. Consider making action status observable (e.g. using long-poll or SSE), but maybe in a future version of the WoT Profile specification

@benfrancis
Copy link
Member Author

benfrancis commented Aug 2, 2021

I think I have now addressed all the open issues on this PR:

  • Removed dependency on canonical TDs by describing applying defaults to forms instead
  • Both synchronous and asynchronous responses to an invokeaction request now include an ActionStatus object, which in the case of an Asynchronous Action Response must now contain a href member in the body in addition to the Location header
  • Simplified error handling so that a standard Error Response may be provided to an invokeaction request in both sync or async cases, meaning the error metadata in the ActionStatus object is now only used for queryaction operations

As far as I'm concerned this is now ready to land. Who can approve this?

</pre>
</section>

<section id="cancelaction">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the cancelaction operation is underspecified in this proposal:

The current text does not describe the error behavior and status response mechanism, if cancellation is not successful.
It does not consider that cancelling an action could take longer than the request timeout, i.e. the cancel operation would be de facto another asynchronous action with a status URL.

It would be straightforward to impement cancel operations by just using the available mechanism and perform a regular invocation operation (note that it could be sync or async) depending on the use case.

Therefore I suggest to eliminate it completely and recommending a naming convention such as ending the action name with "Cancel" for a corresponding cancel operation, e.g.
on a class Teapot:

  • brewTea(duration)
  • brewTeaCancel()

Copy link
Member Author

@benfrancis benfrancis Aug 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mlagally, thank you for the feedback.

The current text does not describe the error behavior and status response mechanism, if cancellation is not successful.

Error behaviour is common to all operations and is defined in section 5.2.4 as I was asked to move it out into a separate section. This does mean that there are no recommendations of specific error codes for specific scenarios, but I'd be happy to add those in a follow-up if you have suggestions.

It does not consider that cancelling an action could take longer than the request timeout

Do you have a particular use case in mind where cancelling an action might take more than 30 seconds? If this is a problem then one possibility would be to define a special case for this scenario by having the Web Thing respond to a cancelaction operation with 202 Accepted instead of 204 No Content to indicate that the request has been accepted but not yet acted upon.

It would be straightforward to impement cancel operations by just using the available mechanism and perform a regular invocation operation (note that it could be sync or async) depending on the use case.

Therefore I suggest to eliminate it completely and recommending a naming convention such as ending the action name with "Cancel" for a corresponding cancel operation, e.g.
on a class Teapot:

  • brewTea(duration)
  • brewTeaCancel()

I remember that you suggested this approach before. There are a couple of reasons I don't think we should take that approach in the general case. The first is that it means that every cancellable Action affordance becomes two Action affordances, unnecessarily cluttering the interaction model. But the biggest problem in my view is that it doesn't deal well with multiple ongoing instances of the same action as it's very hard to keep track of which cancel operations correspond with which invoke operations. The point of the action queue pattern is to spawn a dynamic resource for each ongoing action such that individual instances can be identified, queried and cancelled. The approach you have proposed has the effect of creating two separate queues (an invoke queue and a cancel queue) which must then somehow be kept in sync with each other.

I would instead suggest that for the vast majority of cases where cancelling an action doesn't take longer than 30 seconds, the current approach provides a simpler and more effective model. For edge cases where a cancel operation might take longer, developers always have the freedom to make the initial action non-cancellable and define a second action affordance which can reverse the other. For example, a 3D printer could have a "print" action and a separate "reset" action which resets its state back to the starting position and has the effect of cancelling any ongoing "print" actions. There's no need to enforce a naming convention for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this in the arch call on Sept. 9th:
On a timeout of a cancel operation it should be checked whether the status object is still available, in this case the cancel operation was not successful, since otherwise the action status object would no longer exist.

@mlagally mlagally merged commit d4cb192 into w3c:main Sep 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants