Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Find a way to name processed correlation keys #680

Closed
cdavernas opened this issue Sep 14, 2022 · 15 comments
Closed

Find a way to name processed correlation keys #680

cdavernas opened this issue Sep 14, 2022 · 15 comments
Assignees
Labels
area: spec Changes in the Specification change: feature New feature or request. Impacts in a minor version change
Milestone

Comments

@cdavernas
Copy link
Member

What would you like to be added:

To allow for a way to name processed correlation keys.

What I propose is to add a new key or keyName or whatever property to the correlationDef object, so that multiple events can address different correlation keys:

...
events:
- name: ProductCreated
  kind: consumed
  source: ...
  type: ...
  correlation:
  - contextAttributeName: subject
    correlationKey: product  # the processed key will be stored as 'product'
- name: ProductSold
  kind: consumed
  source: ...
  type: ...
  correlation:
  - contextAttributeName: subject
    correlationKey: product # the processed key will be stored as 'product'
- name: OrderCreated
  kind: consumed
  source: ...
  type: ...
  correlation:
  - contextAttributeName: subject
    correlationKey: order # the processed key will be stored as 'order'
...

Why is this needed:

Consider the following definition:

...
events:
- name: MyEvent1
  kind: consumed
  source: ...
  type: ...
  correlation:
  - contextAttributeName: subject
- name: MyEvent2
  kind: consumed
  source: ...
  type: ...
  correlation:
  - contextAttributeName: subject
 - name: MyOtherEventThatHasNothingToDoWithTheTwoPrevious
  kind: consumed
  source: ...
  type: ...
  correlation:
  - contextAttributeName: subject
...

Image we have a sequential flow such as the following:

  • Consume MyEvent1 with subject 'foo'
  • Consume MyEvent2 with subject 'foo'
  • Consume MyOtherEventThatHasNothingToDoWithTheTwoPrevious with subject 'bar'

What would happen on Synapse is:

  1. Check for a match for a correlation mapping with the defined correlations for MyEvent1 (i.e. subject)
  2. No correlation mapping exist, so the event is consumed, and the mapping is set to 'subject: foo'
  3. Check for a match for a correlation mapping with the defined correlations for MyEvent2 (i.e. subject)
  4. Match exists, and is set to 'foo', so the event is correlated and therefore consumed
  5. Check for a match for a correlation mapping with the defined correlations for MyOtherEventThatHasNothingToDoWithTheTwoPrevious(i.e. subject)
  6. Match exists, but the value is not set to 'foo', so the event is consumed, and the mapping is set to 'subject: bar' ===> BOOOOOOOOM

I could very easily find a dirty work around that problem, but like I said, it would be dirty. This is, IMO, up to the workflow designer to decide how a newly created correlation key should be named.

Note: not setting the contextAttributeValue (and that should be better explained in the spec) basically says that first event to come is gonna set the correlation key based on the defined context attribute (i.e. subject).

@cdavernas cdavernas added change: feature New feature or request. Impacts in a minor version change area: spec Changes in the Specification schema labels Sep 14, 2022
@cdavernas
Copy link
Member Author

We could also possibly introduce a new $CORRELATION jq named parameter, which would allow workflows to retrieve and play with those keys.

@cdavernas
Copy link
Member Author

@ricardozanini @tsurdilo May I proceed with a PR to address that?
It's a minor though very important feature for correlation.

@cdavernas
Copy link
Member Author

cdavernas commented Oct 27, 2022

Ideally, we could also take the advantage of that feature to revamp correlation and add support for (discouraged but sometimes necessary) payload based (vs context attributes) correlation:

Correlate incoming event of drfined type and source using both (or maybe exclusive too?) event context attributes and payload:

events:
  - name: correlatedEvent
    kind: consumed
    type: com.test/cloudevents/test
    correlate:
      - byContextAttribute:
          name: Subject
          value: lights-.*
          key: lightId
      - byPayloadProperty:
          property: 'zone' #could also be a runtime expression used to retrieve the property value
          key: zoneId

@cdavernas
Copy link
Member Author

cdavernas commented Oct 27, 2022

Finally, to support all use cases, we could also add a condition property on eventDef, which would allow to define whether or not to consume/produce events based on their attributes and or payload.

Only consume defined event when the condition matches:

events:
  - name: onTemperatureChanged
    kind: consumed
    type: com.test/cloudevents/test
    condition: '${ .data.heater.isPoweredOn == false }'

We would therefore leverage the full power of what we already have and of cloudevents, allowing for extremely complex correlations, which is IMHO one of the core features of the spec. Hell, I came across the spec back in the days because/thanks to cloudevents!!!

@fjtirado
Copy link
Collaborator

fjtirado commented Oct 27, 2022

@cdavernas
I was thinking a lot about correlation in the last months and Im not sure it is a good idea to try to simulate a correlation engine within the spec.
Wont it be better to actually get rid of correlation and simplify the spec?
I know the answer is no, but I need to try ;)
The reason Im proposing this apparently radical idea is because I feel we are trying to do the job of a correlation engine, which should be typically done by another process (Im thinking on Napoleon and his divide&conquer strategy) and because I think we should focus on clarifying the way a flow is started through events (I know Tiho is going to write something on that regard).
Anyway, if we are getting deep into correlation (as Im afraid we are going to do ;)), then my first impression is that rather than use byContextAttribute or byPayloadAttribute properties we should just use JQ expression to select the attributes that should be used for the correlation. If we are interested in a payload atttribute we should just write .data.<payload attribute> (basically what you proposed in your last post)
Also, I think we need to rewrite some bits of the spec to clarify many tricky scenarios related with correlations, the one you describe is just one of them, but there are certainly more (and some of them are related with start event stated vs non start event states and how to identify an existing flow that needs to be notified)
Finally, I think this topic deserve a face to face preliminary meeting to discuss all the angles and possibilities, wdyt?

@ricardozanini
Copy link
Member

Complex event processing scenarios are a widespread use case and a notable research topic called Complex Event Processing. So I'd say we must be aware of that and stay away if possible. Otherwise, we end up writing a specification inside a specification.

I'd say that we need some correlation features, but they should be limited.

@cdavernas
Copy link
Member Author

@ricardozanini What I'm proposing is, I believe, reasonable and easily implemtable. I agree with @fjtirado, however, that the feature should be optional. It is imho a requirement for (complex, that is more than two even states) event based workflows.

@cdavernas
Copy link
Member Author

Ill work on a PR proposal that shapes my ideas as optional add ins to the spec if that ok with you guys

@ricardozanini
Copy link
Member

ricardozanini commented Nov 14, 2022

@cdavernas, absolutely! Nothing against this proposal; we just need to be aware that if we keep adding use cases and features on top of correlations we will build another beast. :D

@cdavernas
Copy link
Member Author

@ricardozanini yeah you are right. I'll keep it as concise as possible!

@github-actions
Copy link

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions
Copy link

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions
Copy link

github-actions bot commented Apr 4, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions
Copy link

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@cdavernas
Copy link
Member Author

cdavernas commented May 17, 2024

This has been addressed in 1.0.0-alpha1, and is closed as part of #843

@github-project-automation github-project-automation bot moved this from Todo to Done in Progress Tracker May 17, 2024
This was referenced May 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: spec Changes in the Specification change: feature New feature or request. Impacts in a minor version change
Projects
Status: Done
Development

No branches or pull requests

3 participants