Add some guidance on how to construct CloudEvents #404

Merged
merged 3 commits into from
Jun 6, 2019
53 changes: 52 additions & 1 deletion primer.md
@@ -243,6 +243,57 @@ serialization for unknown, or even new, properties. It was also noted that the
HTTP specification is now following a similar pattern by no longer suggesting
that extension HTTP headers be prefixed with `X-`.

## Creating CloudEvents
Member

You are attempting to write in guidance for namespace shadowing on event types per our conversation during the sdk call.

I strongly believe that it is bad practice to create an event type that corresponds to what the event producer would like to use if they produced CloudEvents natively.

For context, this comes from a disagreement over what the Knative project's Event Sources are doing: an event originates from GitHub, and GitHub sends the event payload via webhook to an application running locally in the cluster. That application marshals the request into a struct (using a lib that is not owned by GitHub) and then takes the GitHub event type and adds it as a suffix, like dev.knative.eventing.{GitHubEvent}.

Doug believes we should use com.github.{GitHubEvent}. The issue is that if GitHub ever decides to produce CloudEvents, there could be events in the cluster that match in terms of Source + EventType + ID but do not match in terms of implementation choice for the data payload. GitHub is a good example of a complicated event struct, and they mix event types into an overshadowed struct, plus auth components in the headers. They could choose to implement their payload differently for easier consumption or routing.

My argument is the source application that runs in the cluster is not just a simple proxy, it is a full micro-service with logic and choices. We get to drop some of the auth headers because we validate them, we may not forward the payload correctly if GitHub adds or removes parts of their API, they don't control the lib we use...

Collaborator Author

@duglin duglin Mar 15, 2019

Actually, that's not quite what I'm advocating. I'm not suggesting the CE generator try to co-opt the event producer's namespace, rather I think the CE generator should reuse data coming from the event producer rather than inventing data that makes it look like the CE is from the CE generator rather than the original event producer.

So, I'm not suggesting to use com.github.push, rather it should be just push if that is how GitHub lists its event types. If I mentioned com.github.xxx in previous chats it was just as an example of how it should not be something with knative in it, because consumers don't care if knative is in the flow. I was not formally saying it should start with com.github.

While in the Kn case it isn't just a 'proxy' because there is some logic involved in generating the CE attributes, it is much closer to a proxy than it is to an "event producer" because 1) it doesn't, or shouldn't, materially change the data, and 2) it is generating CE attributes for the purpose of spec compliance (it needs to fill them in with something), not to convey that this Kn component is in the flow. IOW, the consumer does not, and should not, care that Kn is involved w.r.t. the data it consumes. The app is subscribing to, and wants to process, GitHub events, not Kn events.


The CloudEvents specification purposely avoids being too prescriptive about
how CloudEvents are created. For example, it does not assume that the original
event source is the same entity that is constructing the associated
CloudEvent for that occurrence. This allows for a wide variety of implementation
choices. However, it can be useful for implementors of the specification
to understand the expectations that the specification authors had in mind
as this might help ensure interoperability and consistency.

As mentioned above, whether the entity that generated the initial event is
the same entity that creates the corresponding CloudEvent is an implementation
choice. However, when the entity that is constructing/populating the
CloudEvents attributes is acting on behalf of the event source, the values
of those attributes are meant to describe the event or the event source
and not the entity calculating the CloudEvent attribute values. In other words,
when the split between the event source and the CloudEvents producer is
not materially significant to the event consumers, the spec-defined
attributes would typically not include any values to indicate this split
of responsibilities.

This isn't to suggest that the CloudEvents producer
couldn't add some additional attributes to the CloudEvent, but those
are outside the scope of the spec-defined interoperability attributes.
This is similar to how an HTTP proxy would typically minimize changes to the
well-defined HTTP headers of an incoming message, but it might add some
additional headers that include proxy-specific metadata.

It is also worth noting that this separation between original event source
and CloudEvents producer could be small or large. Meaning, even if the
CloudEvent producer were not part of the original event source's ecosystem,
if it is acting on behalf of the event source, and its presence in the
flow of the event is not meaningful to event consumers, then the above
guidance would still apply.
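
As an illustration of the guidance above, here is a minimal sketch of an adapter that turns a GitHub webhook delivery into a CloudEvent. It is one possible reading, not a mapping the specification prescribes. The function name `github_webhook_to_cloudevent`, the `adaptername` extension attribute, the choice of `repository.html_url` as `source`, and the `specversion` value are all assumptions made for this example; `X-GitHub-Event` and `X-GitHub-Delivery` are the headers GitHub sends with webhook deliveries.

```python
import json


def github_webhook_to_cloudevent(headers, body, adapter_name="example-adapter"):
    """Build a CloudEvent (as a plain dict) on behalf of GitHub.

    The spec-defined attributes describe the GitHub event, not this adapter.
    """
    payload = json.loads(body)
    return {
        # Assumed spec version for this sketch.
        "specversion": "1.0",
        # Reuse the event name as GitHub lists it (for example "push").
        "type": headers["X-GitHub-Event"],
        # Assumption: identify the source by the repository URL in the payload.
        "source": payload["repository"]["html_url"],
        # Reuse GitHub's delivery identifier instead of inventing a new one.
        "id": headers["X-GitHub-Delivery"],
        "datacontenttype": "application/json",
        # Pass the payload through without changing it.
        "data": payload,
        # Hypothetical extension attribute carrying adapter-specific metadata,
        # kept outside the spec-defined attributes.
        "adaptername": adapter_name,
    }
```

Nothing in the spec-defined attributes reveals that an adapter was involved; a consumer that cares can still inspect the extension attribute, much like a proxy-specific HTTP header.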

When an entity is acting as both a receiver and sender of CloudEvents
for the purposes of forwarding, or transforming, the incoming event, the
degree to which the outbound CloudEvent matches the inbound CloudEvent
will vary based on the processing semantics of this entity. In cases where
it is acting as a proxy, where it is simply forwarding CloudEvents
to another event consumer, then the outbound CloudEvent will typically
look identical to the inbound CloudEvent with respect to the spec-defined
attributes - see previous paragraph concerning adding additional attributes.

However, if this entity is performing some type of semantic processing
of the CloudEvent, typically resulting in a change to the value of the
`data` attribute, then it may need to be considered a distinct "event
source" from the original event source. And as such, it is expected
that CloudEvents attributes related to the event producer (such as `source`
and `id`) would be changed from the incoming CloudEvent.
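
The difference between forwarding and semantic processing can be sketched as follows. This is only an illustration using plain dicts; the function names `forward` and `transform`, and the use of a random UUID for the new `id`, are assumptions for the example rather than anything the specification requires.

```python
import copy
import uuid


def forward(event):
    """Proxy case: the outbound event keeps every spec-defined attribute."""
    return copy.deepcopy(event)


def transform(event, new_source, new_data):
    """Semantic-processing case: this entity becomes a distinct event source.

    The `data` changes, so `source` and `id` are replaced as well.
    """
    outbound = copy.deepcopy(event)
    outbound["source"] = new_source
    outbound["id"] = str(uuid.uuid4())  # assumed ID scheme for this sketch
    outbound["data"] = new_data
    return outbound
```

A forwarding entity would call `forward` and could still attach its own extension attributes afterwards, while an entity that rewrites `data` would call `transform` so that consumers can tell the result apart from the original event.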

## Qualifying Protocols and Encodings

The explicit goal of the CloudEvents effort, as expressed in the specification,
@@ -630,7 +681,7 @@ existing current event formats that are used in practice today was gathered.

#### AWS - CloudWatch Events

A high proportion of event-processing systems on AWS are converging on
A high proportion of event-processing systems on AWS are converging on
the use of this format.

```