Optional event id #422
Reword "id" uniqueness description. Signed-off-by: Alan Conway <aconway@redhat.com>
…queness. Signed-off-by: Alan Conway <aconway@redhat.com>
Signed-off-by: Alan Conway <aconway@redhat.com>
Co-Authored-By: alanconway <aconway@redhat.com> Signed-off-by: Alan Conway <aconway@redhat.com>
Signed-off-by: Alan Conway <aconway@redhat.com>
See discussion at issue cloudevents#326. This PR is based on PR cloudevents#391, as it depends on those changes.
-  - MUST be a non-empty string
-  - MUST be unique within the scope of the producer
+  - OPTIONAL
+  - Producers MAY omit `id` if de-duplication is not required.
The Design Goal section states:
CloudEvents are typically used in a distributed system to allow for services to be loosely coupled during development, deployed independently, and later can be connected to create new applications.
In a loosely coupled system, a producer can't tell if de-duplication is not required (otherwise they're strongly coupled), and even if it could - can the producer tell that if it is later connected to create a new application, de-duplication won't be required in the future?
On Thu, Apr 25, 2019 at 12:42 PM Christoph Neijenhuis < ***@***.***> wrote:
> In spec.md <#422 (comment)>:
>   - Constraints:
> -  - REQUIRED
> -  - MUST be a non-empty string
> -  - MUST be unique within the scope of the producer
> +  - OPTIONAL
> +  - Producers MAY omit `id` if de-duplication is not required.
>
> The Design Goal section <https://github.com/cloudevents/spec/blob/master/primer.md#design-goals> states:
> CloudEvents are typically used in a distributed system to allow for services to be loosely coupled during development, deployed independently, and later can be connected to create new applications.
> In a loosely coupled system, a producer can't tell if de-duplication is not required (otherwise they're strongly coupled), and even if it could - can the producer tell that if it is later connected to create a new application, de-duplication won't be required in the future?
De-duplication is a question of information identity not transfer coupling.
In an application where consumers need the latest thermostat reading, two
successive readings with the same temperature do not ever need to be
deduplicated - regardless of the path they take from producer to consumer.
It is duplicate information. Two bank transfers between the same accounts
for the same amount are definitely *not* duplicate information and need to
be explicitly marked as distinct. The producer can know this without being
tightly coupled to the consumer.
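To make the distinction above concrete, here is a minimal sketch (not taken from the spec or from this PR's diff). The `make_event` helper, the example `source` and `type` strings, and the use of a random UUID as the `id` are all hypothetical, chosen only to illustrate a producer deciding from its own event semantics whether an `id` is needed.

```python
import uuid
from typing import Any, Dict


def make_event(source: str, type_: str, data: Any, distinct: bool) -> Dict[str, Any]:
    """Build a dict-shaped event; attach an id only when each event is distinct information."""
    event: Dict[str, Any] = {"source": source, "type": type_, "data": data}
    if distinct:
        # Hypothetical choice: a random UUID serves as the producer-scoped unique id.
        event["id"] = str(uuid.uuid4())
    return event


# Two identical thermostat readings are duplicate information: no id under the proposed rule.
r1 = make_event("/sensors/thermostat-1", "com.example.reading", {"celsius": 21.5}, distinct=False)
r2 = make_event("/sensors/thermostat-1", "com.example.reading", {"celsius": 21.5}, distinct=False)

# Two identical bank transfers are distinct information: each one gets its own id.
transfer = {"from": "A", "to": "B", "amount": 100}
t1 = make_event("/bank/transfers", "com.example.transfer", transfer, distinct=True)
t2 = make_event("/bank/transfers", "com.example.transfer", transfer, distinct=True)
```

In this sketch the two readings are indistinguishable, while the two transfers carry the same payload but different `id` values, which is the property the comment argues a producer can know without knowing its consumers.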
This is exactly what the Design Goal section talks about. The producer makes an optimization decision about what the application will be. By doing so, the producer can limit its ability to "be later connected to create new applications." There is no guarantee that there won't be another application that isn't interested in the latest thermostat reading, and would actually need to de-duplicate events. I've learned this lesson the hard way. Whenever I tried to be smart and optimize, because "there will never be a use case for that", a few months (or sometimes years) later a valid use case for it popped up. CloudEvents has, for me, the strongest selling point when we're talking about truly loosely coupled systems - e.g. I buy something from vendor Foo, something from vendor Bar, and then I also consume events from my cloud provider Baz. Of course, CloudEvents is also nice when I use it for my internal services only, but that problem space is already handled not-too-badly with existing tools. The cross-vendor use cases really are more painful atm.
I've read up on #326 and while I like @alanconway's line of argumentation about the IDs being optional in AMQP and MQTT, I don't think that necessarily means that the ID also needs to be optional in CloudEvents, because CloudEvents is an application of MQTT or AMQP or HTTP (etc.) and therefore a layer further up. I agree with @cneijenhuis and the current state of the spec that each event should be uniquely identifiable and that there needs to be an interoperable and unambiguous criterion for determining that identity, which is the id. @jric argues that the timestamp might be sufficient together with source and type, but timestamps are iffy because of timer resolution and because system clocks do reset backwards when compensating for skew or, for instance, when a container wakes up on a new host. We shouldn't limit the considerations to deduplication, but also think about archiving events in database systems where you also need some uniqueness criterion.
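The timer-resolution concern can be sketched as follows. This is only an illustration under stated assumptions: events are modelled as plain dicts, `dedupe`, `by_id`, and `by_time` are hypothetical helper names, and using `source` as a stand-in for "the scope of the producer" is an assumption rather than something the spec text quoted here defines.

```python
from typing import Callable, Dict, Hashable, List

Event = Dict[str, str]


def dedupe(events: List[Event], key: Callable[[Event], Hashable]) -> List[Event]:
    """Keep the first event seen for each key, preserving arrival order."""
    seen = set()
    kept: List[Event] = []
    for event in events:
        k = key(event)
        if k in seen:
            continue
        seen.add(k)
        kept.append(event)
    return kept


def by_id(event: Event) -> Hashable:
    # Identity based on the explicit id (scoped by source: an assumption).
    return (event["source"], event["id"])


def by_time(event: Event) -> Hashable:
    # Identity based on source + type + time, with no id involved.
    return (event["source"], event["type"], event["time"])


tick = "2019-05-02T09:19:00Z"  # coarse clock: several events share one timestamp
events = [
    {"id": "a1", "source": "/orders", "type": "com.example.order", "time": tick},
    {"id": "a1", "source": "/orders", "type": "com.example.order", "time": tick},  # redelivery
    {"id": "a2", "source": "/orders", "type": "com.example.order", "time": tick},  # distinct event
]

kept_by_id = dedupe(events, by_id)      # keeps a1 and a2
kept_by_time = dedupe(events, by_time)  # keeps only a1: the distinct event a2 is dropped
```

With the explicit `id`, the redelivered copy is dropped and the distinct event survives; with the timestamp-based key, the distinct event produced within the same clock tick is discarded as well.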
On Thu, May 2, 2019 at 9:19 AM Clemens Vasters ***@***.***> wrote:
> I've read up on #326 and while I like @alanconway's line of argumentation about the IDs being optional in AMQP and MQTT, I don't think that necessarily means that the ID also needs to be optional in CloudEvents, because CloudEvents is an application of MQTT or AMQP or HTTP (etc) and therefore a layer further up. I agree with @cneijenhuis and the current state of the spec that each event should be uniquely identifiable and that there needs to be an interoperable and unambiguous criterion for determining that identity, which is the id.
So the spec is explicitly not addressing (forcibly pessimizes) use cases
where events do *not* need to be uniquely identifiable? In other words all
use cases with stateless conversations - since by definition, referring to
the ID of a previous event while handling the current one requires
conversational state. Of course you can ignore the ID in such systems, but
you still have to generate one to comply, and there's no way to indicate to
generic intermediaries that some events will never need to be uniquely
identified - which is useful information if you're mapping events to
protocols where optional identification implies something about QoS or
deduplication settings.
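> We shouldn't limit the considerations to deduplication

+1 This is the part of the discussion that has been kind of bothering me. We seem to be very focused on dedup these days (here and for #391) and I think ID uniqueness has other uses beyond dedup and I'm worried that making it optional means that people will need to find some other (non-interoperable) solution that they can count on, and that will turn our optional ID into something useless. At that point we might as well just remove it from the spec. To @alanconway's point about the use cases where uniqueness checks of any kind are not needed, I can think of a couple of other choices, just to put out some other ideas to consider:
- make ID required but make it STRONGLY RECOMMENDED that it be unique (rather than a MUST)
- make ID optional, but make it STRONGLY RECOMMENDED that it be included and even add text that says we only made it optional for edge cases where generating a unique ID is just not possible.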
@alanconway can you join today's call? It's at 1pm ET.
On Thu, May 2, 2019 at 10:30 AM Doug Davis ***@***.***> wrote:
>> We shouldn't limit the considerations to deduplication
>
> +1 This is the part of the discussion that has been kind of bothering me. We seem to be very focused on dedup these days (here and for #391) and I think ID uniqueness has other uses beyond dedup and I'm worried that making it optional means that people will need to find some other (non-interoperable) solution that they can count on, and that will turn our optional ID into something useless. Which at that point we might as well just remove it from the spec.
> To @alanconway's point about the usecases where uniqueness checks of any kind are not needed, I can think of a couple of other choices, just to put out some other ideas to consider:
> - make ID required but make it STRONGLY RECOMMENDED that it be unique (rather than a MUST)
> - make ID optional, but make it STRONGLY RECOMMENDED that it be included and even add text that says we only made it optional for edge cases where generating a unique ID is just not possible.
I would suggest:
* OPTIONAL: If present it MUST be unique. SHOULD be present unless the
application design is such that events will never need to be identified.
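The rule suggested above ("OPTIONAL: If present it MUST be unique") could be checked on the producer side roughly like this. It is a sketch under assumptions, not spec text: the `IdChecker` class is hypothetical, events are plain dicts, and uniqueness is tracked per `source` as an approximation of "the scope of the producer".

```python
from typing import Any, Dict, Set, Tuple


class IdChecker:
    """Producer-side check for outgoing events: id may be absent, but if present must be unique."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, str]] = set()

    def check(self, event: Dict[str, Any]) -> bool:
        if "id" not in event:
            return True  # OPTIONAL: an event without an id is acceptable
        event_id = event["id"]
        if not isinstance(event_id, str) or not event_id:
            return False  # if present, it must be a non-empty string
        key = (event.get("source", ""), event_id)
        if key in self._seen:
            return False  # if present, it must not repeat within the producer's scope
        self._seen.add(key)
        return True


checker = IdChecker()
no_id_ok = checker.check({"source": "/sensors/thermostat-1"})
first_ok = checker.check({"source": "/bank/transfers", "id": "tx-1"})
repeat_ok = checker.check({"source": "/bank/transfers", "id": "tx-1"})
empty_ok = checker.check({"source": "/bank/transfers", "id": ""})
other_source_ok = checker.check({"source": "/bank/refunds", "id": "tx-1"})
```

The checker is meant for events a producer is about to emit for the first time; a consumer seeing a redelivered event with a repeated `id` is a different situation and is not a violation of this rule.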
Sorry I can't make the meeting today - it fell off my calendar; I will add it
back.
@alanconway in the kinds of pub/sub systems we've been discussing in the group so far, the publisher generally doesn't know about the consumer(s)/subscriber(s) and therefore it doesn't know about the subscriber's application design. A single event may reach multiple applications, each with a different design, and a new subscriber might require unique identification of events even though existing subscribers do not yet.
On Fri, May 3, 2019 at 7:59 AM Clemens Vasters ***@***.***> wrote:
> @alanconway in the kinds of pub/sub systems we've been discussing in the group so far, the publisher generally doesn't know about the consumer(s)/subscriber(s) and therefore it doesn't know about the subscriber's application design. A single event may reach multiple applications, each with a different design, and a new subscriber might require unique identification of events whereas existing subscribers yet do not.
A producer must be aware of the semantics of the events that it produces. I
would argue that "universal identifiability" is a semantic property of the
event not a "requirement" that subscribers may or may not have.
It's analogous to the 'time' attribute being optional - in some
applications the time of production is required, in some it is not. The
producer has to know that.
@alanconway we discussed this on last week's call and there was general agreement to keeping it required. However, we didn't want to close it until you had a chance to join the call this week (tomorrow) to make a case for making it optional. Can you join the call to discuss this?
I will let this one go, and live to fight another day. I couldn't make the call but it's on my calendar now for the next time I want to cause trouble.
@alanconway thank you for your patience on this one - and yes please do keep "causing trouble" :-)
On Thu, May 9, 2019 at 10:14 PM Doug Davis ***@***.***> wrote:
> @alanconway thank you for your patience on this one - and yes please do keep "causing trouble" :-)
It's what I do :)