Usage Scenarios #117

Merged
merged 4 commits into from
Mar 29, 2018
166 changes: 166 additions & 0 deletions spec.md
@@ -48,6 +48,172 @@ The following will not be part of the specification:
* Language-specific runtime APIs
* Selecting a single identity/access control system

## Usage Scenarios

Contributor

Addition based on March 23 call: These scenarios are not normative; anyone is free to create a system that mixes these scenarios. These cases establish a common vocabulary of event Producer, Consumer, Middleware, and Framework.

The list below enumerates key usage scenarios and developer perspectives
that have been considered for the development of this specification.


based on March 23 WG call, I propose adding something like:

"These usage scenarios are by no means exhaustive, and the specification does not aim to be prescriptive about usage."

These usage scenarios are by no means exhaustive, and the specification
does not aim to be prescriptive about usage.

These scenarios are not normative; anyone is free to create a system that
mixes these scenarios. These cases establish a common vocabulary of event
producer, consumer, middleware, and framework.

In these scenarios, we keep the roles of event producer and event consumer
distinct. A single application context can always take on multiple roles
concurrently, including being both a producer and a consumer of events.

1) Applications produce events for consumption by other parties, for instance
for providing consumers with insights about end-user activities, state
changes or environment observations, or for allowing the application's
capabilities to be complemented with event-driven extensions.

Events are typically produced related to a context or a producer-chosen
classification. For example, a temperature sensor in a room might be
context-qualified by mount position, room, floor, and building. A sports
result might be classified by league and team.

The producer application could run anywhere, such as on a server or a device.

The produced events might be rendered and emitted directly by the producer
or by an intermediary; as an example of the latter, consider event data
transmitted by a device over payload-size-constrained networks such as
LoRaWAN or ModBus, where events compliant with this specification are
rendered by a network gateway on behalf of the producer.
Contributor

Note: the rendering of the event by an intermediary does not change the semantic meaning of the event, specifically that the producer remains the same, even if the event were not originally produced in the CloudEvent format.


Sara: If we decide that the producer will still be the original entity that emits the event, then we need to spell this out clearly. Let's take the following two examples to make sure we are on the same page.

  1. In an IoT example, a motion detection event is originally emitted by a motion detector and then sent to the cloud platform via an HTTP API GW. The producer will be the motion detector, not the API GW. Could you confirm this is what you propose?
  2. In the same IoT example, if the motion detection event triggers a video/image to be saved into a storage which in turn triggers a motion video/image storage event to the cloud platform, does this mean the semantic meaning of the event has been changed? Is the producer the storage entity or is it still the motion detector? I think we need to clearly define what constitutes "change of the semantic meaning of the event"

Contributor

@cathyhongzhang agree with your interpretation. If we added the following text, do you think that would be clear?

For example, an IoT device captures an image and provides that as the payload of its motion detection event:

  1. The device transmits a stream of events using a proprietary protocol, and an API Gateway transforms these into CloudEvents. The IoT device is the producer, even though the API Gateway created the data in the CloudEvent format, because the semantic meaning of the event remains unchanged.
  2. The device transmits a stream of events using a proprietary protocol, and an intermediary saves the image on an FTP server, which then emits an object.created event. The FTP server is the producer of the object.created event (not the motion detection event).

Contributor

Agree strongly with the spirit of this thread. I think I agree vehemently with everything said in the PR so far, though the vocab is dense (e.g. a reader may not understand what "rendering" means by the time they reach it in this use case) and it would be great to have some grounding examples (e.g. the IoT stories above).

Contributor

@inlined I like the idea of more examples, too. There is still #53 open. A starter doc could contain exactly the desired examples. I wrote a comment there a few days ago regarding patterns. This gateway that renders events as CloudEvents could be one of them. Mapping the context attributes to a topic hierarchy to propagate events using a message broker could be another one.

Collaborator

I agree that the text in this PR is very helpful, and on one of our calls this week I asked whether this should live in the spec or in a companion doc because it's non-normative text. For now, and to keep us moving forward, I think putting this into the spec is fine and we can do follow-on PRs to extract it if we think it would be better to live in another doc (one in which we could really elaborate on things with more examples, and not bloat the spec).

To me the biggest goal of this PR is to get alignment on our thinking first, which I think we're really close on. We can leave word-for-word wordsmithing and document organization for later.


For example, a weather station transmits a 12-byte, proprietary event
payload indicating weather conditions once every 5 minutes over LoRaWAN. A
LoRaWAN gateway is then used to publish the event to an Internet destination
in the CloudEvents format. The LoRaWAN gateway is the event producer,
publishing on behalf of the weather station, and will set event metadata
appropriately to reflect the source of the event.
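
As a rough illustration of the weather station example, the sketch below shows a gateway unpacking a 12-byte payload and rendering it as an event. The payload layout (three little-endian 32-bit floats), the attribute names (`event_id`, `source`, `event_type`, `event_time`), and the `render_event` helper are all hypothetical; neither the spec text nor this thread defines them.

```python
import struct
import uuid
from datetime import datetime, timezone

# Hypothetical layout: three little-endian 32-bit floats = 12 bytes.
PAYLOAD_FORMAT = "<fff"


def render_event(raw: bytes, station_id: str) -> dict:
    """Render a proprietary 12-byte reading as an event-shaped dict.

    The attribute names are illustrative assumptions, not spec-defined.
    """
    if len(raw) != struct.calcsize(PAYLOAD_FORMAT):
        raise ValueError("expected a 12-byte payload")
    temperature_c, humidity_pct, pressure_hpa = struct.unpack(PAYLOAD_FORMAT, raw)
    return {
        # The gateway assigns the id but records the station as the source.
        "event_id": str(uuid.uuid4()),
        "source": f"/weather-stations/{station_id}",
        "event_type": "weather.reading",
        "event_time": datetime.now(timezone.utc).isoformat(),
        "data": {
            "temperature_c": temperature_c,
            "humidity_pct": humidity_pct,
            "pressure_hpa": pressure_hpa,
        },
    }


# Simulate what the station would send over the constrained network.
raw = struct.pack(PAYLOAD_FORMAT, 21.5, 40.0, 1013.25)
event = render_event(raw, "station-7")
```

The gateway adds metadata around the reading but leaves the measured values untouched, which is the sense in which it publishes on behalf of the station.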

2) Applications consume events for purposes such as display, archival,
analytics, workflow processing, monitoring the condition and/or providing
transparency into the operation of a business solution and its foundational
building blocks.

The consumer application could run anywhere, such as on a server or a
device.

A consuming application will typically be interested in:
- distinguishing events such that the exact same event is not
processed twice.
- identifying and selecting the origin context or the
producer-assigned classification.
- identifying the temporal order of the events relative to the
originating context and/or relative to a wall-clock.
Contributor

relative to the originating context (which could be clock-time or other sequence number)


The consuming application is also interested in

  • correlating event instances from multiple event producers and sending them to the same workflow instance.

Contributor

we could change the point two bullets above ("identifying and selecting the origin context or the producer-assigned classification") to say:

"identifying and selecting the origin context or the producer-assigned classification, potentially correlating multiple events"

Collaborator

Agreed to add this. Cathy is not asking for a new attribute at this time; she just wants to make sure the spec doesn't preclude the use case.


Yes, I would like to make sure "correlating event instances from multiple event producers" is a key interest point of the consuming application and that the event data (metadata+payload) carries "correlation-token" information. As to how to identify this correlation-token in the event data, there could be multiple ways of doing it depending on where/how the event producer puts this info. For example, this correlation-token could be part of the origin context, a unique string in the event payload, or an explicitly defined metadata field of the event.

- understanding the context-related detail information carried
in the event.
- correlating event instances from multiple event producers and sending
them to the same consumer context.
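
The first and third consumer interests in this list (not processing the same event twice, and ordering events relative to their originating context) can be sketched as follows. The field names `source`, `event_id`, and `event_time` are assumptions made for illustration, and the ordering relies on timestamps being ISO 8601 strings in the same format and time zone so that they sort lexicographically.

```python
from typing import Iterable, List


def deduplicate(events: Iterable[dict]) -> List[dict]:
    """Drop events whose (source, event_id) pair was already seen."""
    seen = set()
    unique = []
    for event in events:
        key = (event["source"], event["event_id"])
        if key in seen:
            continue
        seen.add(key)
        unique.append(event)
    return unique


def in_source_order(events: Iterable[dict]) -> List[dict]:
    """Order events per originating context, then by producer timestamp."""
    return sorted(events, key=lambda e: (e["source"], e["event_time"]))


events = [
    {"source": "/rooms/101", "event_id": "b", "event_time": "2018-03-23T10:05:00Z"},
    {"source": "/rooms/101", "event_id": "a", "event_time": "2018-03-23T10:00:00Z"},
    # Redelivery of event "b": same source and id as the first entry.
    {"source": "/rooms/101", "event_id": "b", "event_time": "2018-03-23T10:05:00Z"},
]

ordered_ids = [e["event_id"] for e in in_source_order(deduplicate(events))]
```

Keying the duplicate check on the pair rather than the id alone is a design choice of this sketch; it avoids assuming that ids are unique across producers.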

In some cases, the consuming application might be interested in:
- obtaining further details about the event's subject from the
originating context, like obtaining detail information about a
changed object that requires privileged access authorization.
For example, an HR solution might only publish very limited
information in events for privacy reasons, and any event consumer
needing more data will have to obtain details related to the event
from the HR system under their own authorization context.
- interacting with the event's subject at the originating context,
for instance reading a storage blob after having been informed
that this blob has just been created.

Consumer interests motivate requirements for which information
producers ought to include in an event.

3) Middleware routes events from producers to consumers, or onwards
to other middleware. Applications producing events might delegate
Contributor

Suggested change: Middleware routes events from producers to consumers, or on to other middleware, without changing the event.

Contributor

without changing the event.

Does that include the context attributes or just the payload?


I think when middleware routes the event, it should include both the original context attributes and payload since the consumer has interest in the context attributes too. For example, some context attribute is needed to identify/extract a field in the payload.

Contributor

I was referring to without changing. Do we exclude that there might be context attributes that can be changed, removed or added by the middleware?

Contributor Author

"Without changing" would be contradicted by the details of the section, because transcoding and transforming is doubtlessly a change. The transform description has a clause on it that semantic changes are not allowed, and that includes omissions.

On @deissnerk's point: I'd like to rename the "extensions" section we currently have in the spec to "annotations" and clarify that this section MAY be edited by intermediaries but MUST be forwarded. That would align with the notion that the AMQP TC arrived at, where the "bare message" which holds the content and core properties is immutable, but can be annotated in transit.

Contributor

"Without changing" would be contradicted by the details of the section, because transcoding and transforming is doubtlessly a change. The transform description has a clause on it that semantic changes are not allowed, and that includes omissions.

+1

On @deissnerk's point: I'd like to rename the "extensions" section we currently have in the spec to "annotations" and clarify that this section MAY be edited by intermediaries but MUST be forwarded. That would align with the notion that the AMQP TC arrived at, where the "bare message" which holds the content and core properties is immutable, but can be annotated in transit.

@clemensv Some brief context: We chose extensions because it is the term OpenAPI uses to describe custom attributes to its specification and we felt having some similarities between CloudEvents and OpenAPI would be helpful. That said, I like your suggestion of having that section be editable. It means that section will take on a new purpose, in addition to merely being a place for custom attributes. I believe having a section containing editable attributes may solve a handful of problems.


Is it true that middleware should never change the content of an event? A few examples I could think of:

  • Scrubbing PII from an event before forwarding to downstream consumers;
  • Enriching an event, such as hydrating an object on the event. Imagine a producer includes a productId in the event payload, then the middleware hydrates the product object based on that id to include product name, price, inventory, etc.

In both of these situations, I would still consider the original producer the source of the event rather than the middleware, as suggested by this thread.

Contributor

@alexdebrie middleware may modify (transcode/transform) the content/payload of the event as mentioned in the paragraphs below. The question is whether middleware can modify the common event metadata attributes?


@lfourie Yea, I'm wondering if it's ever appropriate for middleware to modify the event payload before passing along. Example:

Removing personally identifiable information (PII) from payload

Perhaps my Users service emits a user.created event which includes the social security number of the newly created user. I want to configure my middleware to scrub that attribute before sending to downstream consumers.

From:

{
  "firstName": "Alex",
  "lastName": "DeBrie",
  "dateOfBirth": "01-01-1980",
  "ssn": "555-55-5555"
}

to:

{
  "firstName": "Alex",
  "lastName": "DeBrie",
  "dateOfBirth": "01-01-1980"
}

In my mind, this is still a user.created event that's been emitted from my Users service rather than an entirely new event emitted by the middleware as suggested by @ultrasaurus's object.created example.

Contributor Author

@alexdebrie IMO, if you're removing (or adding) anything, including PII, you're making a semantic change. A scrubbed event is no longer the original but rather originates from whoever does the scrubbing (or enriching); it is a derivative. A very basic test might be whether you can compute the same integrity hash over the payload values (not metadata or bits on wire) before and after transcoding or transform.

ESPECIALLY if you have PII involved, you'll want to have a clear break between the event data before and after the scrub, because you will want to have absolutely no doubt in any audit trails about what kind of data you're dealing with when the GDPR police shows up.

All that does not at all mean that such an intermediary isn't allowed to publish the derived event as relating to the same context. Your event doesn't originate from the "users" service, but it still relates to the exact same user context.

The event will have a different id (to delineate it from the unscrubbed original) and if you want to include provenance details in the event detail data, you might have a trail there that gives the id of the original event and a hash of that original event.

This is actually a brilliant example of why "source" isn't the right concept in the current spec, because your scenario calls for an original and a derivative event to be posted related to the exact same context by different actors for different segments of a system.
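
The integrity-hash test described in this comment can be sketched roughly as below. The canonicalization (sorted keys, compact separators), the use of SHA-256, and the `provenance` field names are assumptions made for illustration; the thread does not prescribe any of them. Re-encoding the JSON with different whitespace stands in for transcoding here.

```python
import hashlib
import json


def payload_hash(payload: dict) -> str:
    """Hash the payload values, independent of the bytes on the wire."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


original = {
    "firstName": "Alex",
    "lastName": "DeBrie",
    "dateOfBirth": "01-01-1980",
    "ssn": "555-55-5555",
}

# Re-encoding: different bytes on the wire, identical values.
reencoded = json.loads(json.dumps(original, indent=4))

# Scrubbing: a value is removed, so the payload itself changes.
scrubbed = {k: v for k, v in original.items() if k != "ssn"}

same = payload_hash(original) == payload_hash(reencoded)
changed = payload_hash(original) != payload_hash(scrubbed)

# A derived event gets its own id and may carry a provenance trail.
# All ids and field names here are hypothetical.
derived_event = {
    "event_id": "scrubbed-0001",
    "data": scrubbed,
    "provenance": {
        "original_event_id": "users-0001",
        "original_hash": payload_hash(original),
    },
}
```

Under this test the re-encoded payload counts as the same event, while the scrubbed payload counts as a derivative that needs its own identity.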

certain tasks arising from their consumers' requirements to
middleware:
Contributor

@lfourie, Mar 20, 2018

Is the event emitted by the middleware identical to the received event apart from possible event payload transformation? If not, there is a need to be able to correlate events received by middleware with the events emitted by middleware.

Contributor Author

It's assumed that the middleware doesn't make semantic changes and, as I explain here, I would be in favor of declaring the event metadata immutable except for an annotation section.


- Management of many concurrent interested consumers for one of
multiple classes or originating contexts of events
- Processing of filter conditions over a class or originating context
of events on behalf of consumers.
Contributor

From today's conversation, I think adding this clarification to the above point could address the question raised by @cathyhongzhang: Filtering does not change the content of the event; rather, it selects particular events to forward to a consumer.

Contributor Author

None of the activities change the event semantically, and the one activity that touches the event structure is specifically stating that semantics may not be changed. A clarification for filtering specifically seems superfluous to me.

- Transcoding, like encoding in MsgPack after decoding from JSON
- Transformation that changes the event's structure, like mapping from
a proprietary format to CloudEvents, while preserving the
identity and semantic integrity of the event.
Contributor

This seems to overlap with rendering in #1 above

Contributor Author

I don't see the overlap/conflict

Contributor

no conflict, just would be more readable if it was only in one place. I would move the bit about "rendering" from producer section to here (since transcoding seems like what you were describing as "rendering" to me, yet the language above speaks to a somewhat different use case)

- Instant "push-style" delivery to interested consumers.
- Storing events for eventual delivery, either for pick-up initiated
by the consumer ("pull"), or initiated by the middleware ("push")
after a delay.
- Observing event content or event flow for monitoring or
diagnostics purposes.

To satisfy these needs, middleware will be interested in:
- A metadata discriminator usable for classification or
contextualization of events so that consumers can express interest
in one or multiple such classes or contexts.
For instance, a consumer might be interested in all events related
to a specific directory inside a file storage account.
- A metadata discriminator that allows distinguishing the subject of
a particular event of that class or context.
For instance, a consumer might want to filter out all events related
to new files ending with ".jpg" (the file name being the "new file"
event's subject) for the context describing a specific directory
inside a file storage account that it has registered interest in.
- An indicator for the encoding of the event and its data.
- An indicator for the structural layout (schema) for the event and
its data.
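
The two metadata discriminators in this list can be sketched as a middleware-side filter: one check on the context (a directory in a storage account) and one on the subject (the file name). The field names `context`, `subject`, and `event_type`, and the `matches` helper, are hypothetical and chosen only for this illustration. The sketch reads "filter" as selecting the matching events for forwarding.

```python
def matches(event: dict, context_prefix: str, subject_suffix: str) -> bool:
    """Return True when the event is in the context and its subject matches."""
    return (
        event["context"].startswith(context_prefix)
        and event["subject"].lower().endswith(subject_suffix)
    )


events = [
    {"context": "/accounts/acme/photos/", "subject": "cat.jpg", "event_type": "file.created"},
    {"context": "/accounts/acme/photos/", "subject": "notes.txt", "event_type": "file.created"},
    {"context": "/accounts/acme/invoices/", "subject": "scan.jpg", "event_type": "file.created"},
]

# The consumer registered interest in the photos directory and ".jpg" files.
selected = [e["subject"] for e in events if matches(e, "/accounts/acme/photos/", ".jpg")]
```

The filter reads only metadata and never inspects or alters the event data, so the forwarded events are unchanged.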

Whether its events are available for consumption via a middleware is
a delegation choice of the producer.
Contributor

If middleware consumes the event, isn't it a consumer then?

Contributor

@ultrasaurus Good point. To me consuming an event means that some action is triggered by it and that it is not forwarded. Of course some new event can be emitted as part of the action. The text above mentions consumption via a middleware. In that case the middleware provides the means to consume the event but is not the consumer.


Agree with both comments here. For example, a serverless platform will consume the event and trigger a function execution and pass filtered event data to the function. Is the serverless platform the middleware or the consumer or both?

Contributor Author

@deissnerk @ultrasaurus via is indeed the keyword here, as @deissnerk explains

Contributor

My point is that the producer determines how it will communicate about its events. There seemed to be agreement when we discussed this. I would like the text here to make clear that an operator can connect producer and middleware, not that a producer somehow specifies its middleware. I wouldn't prevent that in the specification, but this language seems to imply that we always encode middleware configuration in the producer, which I don't think is your intent.

Contributor Author

The "producer" is a piece of software and as such it is written, configured, and deployed. The operator may be changing configuration. I think those are IT ground rules that need no elaboration.


Contributor

@rachelmyers, Mar 23, 2018

Proposed addition based on conversation during March 23 call: In practice, Middleware can take on the role of a Producer when it changes the semantic meaning of an event, a Consumer when it takes action based on an event, or Middleware when it routes events without making semantic changes. cc @duglin


Like these additions!

In practice, middleware can take on the role of a producer when it changes
the semantic meaning of an event, a consumer when it takes action based
on an event, or middleware when it routes events without making semantic
changes.

4) Frameworks and other abstractions make interactions with event platform
infrastructure simpler, and often provide common API surface areas
for multiple event platform infrastructures.

Frameworks are often used for turning events into an object graph,
and to dispatch the event to some specific handling user-code or
user-rule that permits the consuming application to react to
a particular kind of occurrence in the originating context and
on a particular subject.
Contributor

nit: user => application developer

Contributor Author

If I'm telling an app to specifically show the MSFT, GOOG, and IBM stock ticks for me, it's arguable that the subscription rule has been user-supplied. This section is about frameworks that deal with dispatching events, and therefore I chose this to be a bit broader than just application developers.


Frameworks are most interested in semantic metadata commonality
across the platforms they abstract, so that similar activities can
be handled uniformly.

For a sports application, a developer using the framework might be
interested in all events from today's game (subject) of a team in a
league (topic of interest), but want to handle reports
of "goal" differently than reports of "substitution".
For this, the framework will need a suitable metadata discriminator
that frees it from having to understand the event details.
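
The sports example can be sketched as a small dispatcher that routes on a single metadata discriminator and never looks inside the event data itself. The `Dispatcher` class, the `event_type` field, and the handler names are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


class Dispatcher:
    """Route events to handlers by a metadata discriminator."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_type: str):
        """Decorator that registers a handler for one event type."""
        def register(handler):
            self._handlers[event_type].append(handler)
            return handler
        return register

    def dispatch(self, event: dict) -> int:
        """Call every handler for the event's type; return how many ran."""
        handlers = self._handlers.get(event["event_type"], [])
        for handler in handlers:
            handler(event)
        return len(handlers)


dispatcher = Dispatcher()
log = []


@dispatcher.on("goal")
def handle_goal(event):
    log.append(f"goal by {event['data']['player']}")


@dispatcher.on("substitution")
def handle_substitution(event):
    log.append(f"{event['data']['on']} replaces {event['data']['off']}")


dispatcher.dispatch({"event_type": "goal", "data": {"player": "Player A"}})
dispatcher.dispatch({"event_type": "substitution", "data": {"on": "Player B", "off": "Player C"}})
```

Only the handlers read `data`; the dispatcher itself needs nothing beyond the discriminator, which is the property the paragraph above asks for.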

## Status
At this time the specification is focused on the following scope:

* Agree upon a set of event metadata attributes (“context”) that:
  * Offer a basic description of the event and the data it carries.
  * Are currently implemented and semantically similar across multiple
    platforms.
  * Can be delivered separately from the event data in the transport headers
    (e.g. HTTP, AMQP, Kafka) or together with the data in a serialized fashion
    (e.g. JSON, protobuf, Avro).
  * Include a description of the transport/protocol and encoding, with an
    initial focus on HTTP.
  * Can be extended to support experimental or uncommon features, while being
    clearly indicated as an extension (e.g. extensions use a common prefix).
  * Allow for evolution of both the payload and CloudEvents definition (e.g.
    versioning).
  * Can be embedded at different stages along the route of the event by
    middleware (e.g. a router can add transport or auth information).
* Establish a backlog of prospective event metadata attributes (“context”)
for potential inclusion in the future.
* Include use-case examples to help users understand the value of CloudEvents,
with an initial focus on HTTP and Functions-as-a-Service/Serverless computing.
* Determine process and overall governance of the specification.
* Discuss additional architecture components that complement this specification.

## Notations and Terminology

### Notational Conventions