Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http: Batch Events from HTTP Request and Response #829

Merged
merged 5 commits into from
Jan 26, 2023

Conversation

markmandel
Copy link
Contributor

Simple implementations of NewEventsFromHTTPRequest and NewEventsFromHTTPResponse as a step towards providing some support for application/cloudevents-batch+json.

This doesn't help with batch sending of Events, but this provides some basics of being able to process batch requests and/or responses from standard Go HTTP handling.

This could be improved over time (See previous work on #301) as well, but optimised for having something that works to start, rather than implementing big design changes.

Signed-off-by: Mark Mandel markmandel@google.com

@@ -41,12 +42,30 @@ func (jsonFmt) Unmarshal(b []byte, e *event.Event) error {
return json.Unmarshal(b, e)
}

// JSONBatch is the buitl-in "application/cloudevents-batch+json" format.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/buitl/built/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch - thank you!

Simple implementations of NewEventsFromHTTPRequest and
NewEventsFromHTTPResponse as a step towards providing some support for
`application/cloudevents-batch+json`.

This doesn't help with batch sending of Events, but this provides some
basics of being able to process batch requests and/or responses
from standard Go HTTP handling.

This could be improved over time (See previous work on cloudevents#301) as well,
but optimised for having something that works to start, rather than
implementing big design changes.

Signed-off-by: Mark Mandel <markmandel@google.com>
@markmandel markmandel force-pushed the feature/batch-event-util branch from 0de68c5 to e9cfad8 Compare January 6, 2023 21:45
@duglin
Copy link
Contributor

duglin commented Jan 6, 2023

How does someone send a batch with this change? I feel like I'm missing something (but it could be me :-) ).

@duglin
Copy link
Contributor

duglin commented Jan 6, 2023

doi never mind - I just reread your origin post.

@duglin
Copy link
Contributor

duglin commented Jan 6, 2023

@lionelvillard any thoughts on this?

Will this feel incomplete w/o the sending side?

@markmandel
Copy link
Contributor Author

Will this feel incomplete w/o the sending side?

Thankfully sending of JSON HTTP batch events is quite straightforward if someone wanted to do it by hand -- but Client seems very coupled to singular events, so I couldn't see an easy way to extend it to handle multiple events without breaking every implementation of the Client interface, and having massive changes.

What I could do - which is inline with the utility package, is add a NewHTTPRequestFromEvents(ctx context.Context, url string, events[]Events) *http.Request function, which would spit out a http.Request object that would send batched events, that the end user could use with their own http client as they so desired.

So an end user could do something like:

var events []event.Event
event :=  cloudevents.NewEvent()
event.SetSource("example/uri")
event.SetType("example.type")
event.SetData(cloudevents.ApplicationJSON, map[string]string{"hello": "world"})

events := append(events, event) // repeat multiple times

request := NewHTTPRequestFromEvents(context.Background(), "http://mywebsiteaddress", events")
client := http.Client{}
resposne, err := client.Do(request)
// do something with the response and error.

Pretty easy addition, and allows for sending as well as recieving.

WDYT?

@markmandel
Copy link
Contributor Author

Actually, looking at this, I really like the idea of having a NewHTTPRequestFromEvents (and probably a NewHTTPRequestFromEvent (singular), because if you pair that with NewEventFromHTTPResponse - if the batch event endpoint returns a batched event, you can easily parse this as well.

Basically you can escape the Client if you need to break the glass for whatever reason, which is not a bad thing!

@markmandel
Copy link
Contributor Author

Actually, I like the idea of NewHTTPRequestFromEvents so much, I implemented it -- so now you can send batched events!

@markmandel markmandel force-pushed the feature/batch-event-util branch from 8685ff2 to 1915af2 Compare January 7, 2023 01:05
Now you can send batch events with your very own http.Client.

Signed-off-by: Mark Mandel <markmandel@google.com>
@markmandel markmandel force-pushed the feature/batch-event-util branch from 1915af2 to 5361386 Compare January 7, 2023 04:56
@@ -89,3 +90,67 @@ func TestNewEventFromHttpResponse(t *testing.T) {
})
}
}

func TestNewEventsFromHTTPRequest(t *testing.T) {
Copy link
Contributor

@lionelvillard lionelvillard Jan 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test where NewEventsFromHTTPRequest returns an error (for instance when the request is an event in the JSON Event Format) and maybe another one with more than one event?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this suggestion! Ended up leading me to the code to see how easy it would be to do a NewHTTPRequestFromEvent, so added that in as well.

Updates to tests also implemented.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add test cases which actually assert an error?

@lionelvillard
Copy link
Contributor

Have you considered having NewHTTPRequestFromEvents also handling single, non-batched events? It feels wrong that the caller must look at in the request headers what's the content mode to determine whether to call NewHTTPRequestFromEvents or NewHTTPRequestFromEvent.

@markmandel
Copy link
Contributor Author

Have you considered having NewHTTPRequestFromEvents also handling single, non-batched events?

Hmnn.. that's an interesting question 🤔

My thinking in this part of the library was to give people the low level building blocks they needed to put together what they needed to parse data as they needed. In which case, checking the Content-Type header seems like the appropriate thing to if they wanted a singular endpoint to handle either batch or singular events.

I'd honestly also assumed people would have specific endpoints to handle singular and a different endpoint to handle batched events -- but I could definitely be wrong with that assumption.

Also assumed that people may only want to accept either singular/batched -- so didn't want to bake in any higher level abstractions.

A good middle ground might something like adding a func IsRequestBatch(req http.Request) bool and func IsResponseBatch(req http.Response) bool which makes checking the header easy if people want to combine parsing logic into a singular endpoint. WDYT?

@lionelvillard
Copy link
Contributor

func IsRequestBatch(req http.Request) bool and func IsResponseBatch(req http.Response) bool seems like a great addition to me. +1

* Expanded test for NewEventsFromHTTPRequest
* Implemented NewHTTPRequestFromEvent as well

Signed-off-by: Mark Mandel <markmandel@google.com>
@markmandel markmandel force-pushed the feature/batch-event-util branch from 3da778d to f5d82ff Compare January 10, 2023 16:58
@markmandel
Copy link
Contributor Author

Done!

I ended up at IsHTTPBatch(header nethttp.Header) bool since both http.Request and http.Response both use the same http.Header type, so I figured it made sense to just check that value.

IsHTTPBatch is a convenience function such that an end user could
determine if a request or response is batch or not, and process it as
such.

Signed-off-by: Mark Mandel <markmandel@google.com>
@markmandel
Copy link
Contributor Author

Just gently bumping this PR, see if it's good to merge now?

@lionelvillard
Copy link
Contributor

LGTM

@embano1 @duglin wanna take a quick look at this one?

// Since Format doesn't support batch Marshalling, and we know it's structured batch json, we'll go direct to the
// json.UnMarshall(), since that is the best way to support batch operations for now.
var events []event.Event
err := json.NewDecoder(body).Decode(&events)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could directly return here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice trick. I like it 👍🏻

return event.ApplicationCloudEventsBatchJSON
}

func (jb jsonBatchFmt) Marshal(e *event.Event) ([]byte, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error threw me off until I saw your implementation in to_event.go - perhaps a short comment that this is handled by the other package?

// NewEventsFromHTTPRequest returns a batched set of Events from a http.Request
func NewEventsFromHTTPRequest(req *nethttp.Request) ([]event.Event, error) {
msg := NewMessageFromHttpRequest(req)
return binding.ToEvents(context.Background(), msg, msg.BodyReader)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: use req.Context() here? Not sure if that provides value/advantage in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was being consistent with all the other utility functions as previously written (they all used context.Background(). My suggestion would e that if we want to make that change it should probably be in a separate PR.

return binding.ToEvents(context.Background(), msg, msg.BodyReader)
}

// NewHTTPRequestFromEvent creates a http.Request object that can be used with any http.Client for a singular event.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: line wrapping seem off

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite sure what you mean here. The line is 116 chars (I usually wrap at 120), is that what you are referring to?

@@ -24,3 +26,63 @@ func NewEventFromHTTPResponse(resp *nethttp.Response) (*event.Event, error) {
msg := NewMessageFromHttpResponse(resp)
return binding.ToEvent(context.Background(), msg)
}

// NewEventsFromHTTPRequest returns a batched set of Events from a http.Request
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: better from a HTTP request ?

// ToEvents translates a Batch Message and corresponding Reader data to a slice of Events.
// This function returns the Events generated from the body data, or an error that points
// to the conversion issue.
func ToEvents(ctx context.Context, message MessageReader, body io.Reader) ([]event.Event, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: can we have a test with a partial invalid event in the batch request body? I would like to understand the semantics of this API, i.e. which states []event.Event can have depending on the input.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally MessageReader should define ReadBatch, even though currently batching mode is only defined by HTTP protocol binding spec. I'm fine with not adding ReadBatch but I wonder if it makes sense to match the ToEvent signature and do the type switch in the function body. Or have ToEvents take a *Message instead of MessageReader.

I think it's fine to not initially supporting transformers on batched events until someone ask for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally MessageReader should define ReadBatch, even though currently batching mode is only defined by HTTP protocol binding spec.

I started playing with this, and as soon as that interface changes, it requires changes across lots of codebase, so I'm hesitant to make that change. Or maybe that change should be a subsequent PR for a larger architectural change?

I'm fine with not adding ReadBatch but I wonder if it makes sense to match the ToEvent signature and do the type switch in the function body. Or have ToEvents take a *Message instead of MessageReader.

Not sure I'm 100% sure I'm following here. Are you saying that there is a way I could get the io.Reader from the MessageReader? If so, I'd love some direction - it wasn't immediately obvious to me.

I think it's fine to not initially supporting transformers on batched events until someone ask for it.

👍🏻

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I'm 100% sure I'm following here. Are you saying that there is a way I could get the io.Reader from the MessageReader? If so, I'd love some direction - it wasn't immediately obvious to me.

You can do something like this:

switch mt := message.(type) {
case *http.Message:
  body := mt.BodyReader
  // Since Format doesn't support batch Marshalling, and we know it's structured batch json, we'll go direct to the
  // json.UnMarshall(), since that is the best way to support batch operations for now.
  var events []event.Event
  return events, json.NewDecoder(body).Decode(&events)
}

Then return an error for all other cases.

Does that make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 interesting. Curious - why have the system return an error at runtime, when you could do the check at compile time?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ToEvents takes a MessageReader, which is more general than http.Message. In your case you have only one (http) caller but in theory it does not have to (be the only one).

You can either make ToEvents to handle only http message, in which case just use *http.Message as the second argument type and remove body, or make it a bit more generally applicable and consistent with ToEvent and pass MessageReader only.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I see 🤔 that makes sense -- we could set ourselves up so if more implementations of Batch come down the pipe, we could handle them here in this place, and expand the switch. SGTM. I'll make the change to Message and do the type switch 👍🏻

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hit a pretty good wrinkle on making this change. I can't reference the concrete type at all within ToEvents -- because I get into a cyclic dependency issue.

package github.com/cloudevents/sdk-go/v2
        imports github.com/cloudevents/sdk-go/v2/binding
        imports github.com/cloudevents/sdk-go/v2/protocol/http
        imports github.com/cloudevents/sdk-go/v2/binding: import cycle not allowed

So my suggestion - leave this as is, and then as there is a need to create more implementation on Batch processing, implement a BatchEvents on MessageReader and do the work to implement that across the code base. I figure no point in doing all that work, if the demand may not be there.

return nil, err
}

req, err := nethttp.NewRequestWithContext(ctx, nethttp.MethodPost, url, nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I know this is a helper, but since we're hardcoding POST behavior here, would be nice to see this doc-ed in the comment. Not all webhooks use POST, unfortunately.

return request, nil
}

// IsHTTPBatch returns of the current http.Request or http.Response is a batch event operation, by checking the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: returns of?

NewEventsFromHTTPRequest = http.NewEventsFromHTTPRequest
NewEventsFromHTTPResponse = http.NewEventsFromHTTPResponse
NewHTTPRequestFromEvents = http.NewHTTPRequestFromEvents
NewHTTPRequestFromEvent = http.NewHTTPRequestFromEvent
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: be consistent and switch order to FromEvent then FromEvents - easy one to miss when going through this list.

@@ -89,3 +90,67 @@ func TestNewEventFromHttpResponse(t *testing.T) {
})
}
}

func TestNewEventsFromHTTPRequest(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add test cases which actually assert an error?

@embano1
Copy link
Member

embano1 commented Jan 13, 2023

Nice work @markmandel! Since you added a bit of code, would be nice to see a coverage report here. IMHO currently you don't have all code paths covered with tests.

@markmandel
Copy link
Contributor Author

Nice work @markmandel! Since you added a bit of code, would be nice to see a coverage report here. IMHO currently you don't have all code paths covered with tests.

Thanks for the detailed feedback! I'll jump on it.

}
}
var buffer bytes.Buffer
err := json.NewEncoder(&buffer).Encode(events)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my own education, here we just encode the events directly, but above for a single event we use WriteRequest which seems to have quite a bit more logic. Is any of that logic needed for batch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way to send batch is in a JSON encoded format, so there is no need for much of that logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we don't need to worry about things like the "transformers"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 What is there to transform? From my reading of the batch spec, it's JSON in, JSON out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOL I dunno. I just noticed that for the single event it has that transform stuff and I didn't want to miss anything. @lionelvillard any idea what this transform stuff is for and why we need it for single events but not batch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figure at least we don't need it for the utility functions, since none of the existing functions in utility.go don't pass in transformers.

The good thing is, if there was a need to add an optional list of transformers to the method signature isn't going to break.

@markmandel
Copy link
Contributor Author

Okay! I think this is good to re-revue. Implements everything that I didn't have questions for.

As requested, coverage report. I ran it in my IDE and plugged all the holes I could find, which also corresponded to fixing several of the review comments, as well as some other paths as well.

❯ go test -cover  ./...
?       github.com/cloudevents/sdk-go/v2        [no test files]
ok      github.com/cloudevents/sdk-go/v2/binding        0.028s  coverage: 64.8% of statements
ok      github.com/cloudevents/sdk-go/v2/binding/buffering      0.023s  coverage: 81.0% of statements
ok      github.com/cloudevents/sdk-go/v2/binding/format 0.021s  coverage: 87.0% of statements
ok      github.com/cloudevents/sdk-go/v2/binding/spec   0.014s  coverage: 79.7% of statements
?       github.com/cloudevents/sdk-go/v2/binding/test   [no test files]
ok      github.com/cloudevents/sdk-go/v2/binding/transformer    0.025s  coverage: 85.4% of statements
ok      github.com/cloudevents/sdk-go/v2/binding/utils  0.017s  coverage: 77.8% of statements
ok      github.com/cloudevents/sdk-go/v2/client 6.033s  coverage: 72.5% of statements
ok      github.com/cloudevents/sdk-go/v2/client/test    0.023s  coverage: 67.4% of statements
ok      github.com/cloudevents/sdk-go/v2/context        0.020s  coverage: 65.4% of statements
ok      github.com/cloudevents/sdk-go/v2/event  0.011s  coverage: 89.3% of statements
ok      github.com/cloudevents/sdk-go/v2/event/datacodec        0.019s  coverage: 100.0% of statements
ok      github.com/cloudevents/sdk-go/v2/event/datacodec/json   0.019s  coverage: 100.0% of statements
ok      github.com/cloudevents/sdk-go/v2/event/datacodec/text   0.013s  coverage: 88.9% of statements
ok      github.com/cloudevents/sdk-go/v2/event/datacodec/xml    0.011s  coverage: 100.0% of statements
ok      github.com/cloudevents/sdk-go/v2/extensions     0.015s  coverage: 70.3% of statements
?       github.com/cloudevents/sdk-go/v2/observability  [no test files]
ok      github.com/cloudevents/sdk-go/v2/protocol       0.027s  coverage: 61.5% of statements
ok      github.com/cloudevents/sdk-go/v2/protocol/gochan        0.051s  coverage: 45.0% of statements
ok      github.com/cloudevents/sdk-go/v2/protocol/http  6.046s  coverage: 69.3% of statements
ok      github.com/cloudevents/sdk-go/v2/protocol/test  0.040s  coverage: 54.9% of statements
ok      github.com/cloudevents/sdk-go/v2/test   0.012s  coverage: 47.6% of statements
ok      github.com/cloudevents/sdk-go/v2/types  0.008s  coverage: 89.1% of statements

* Reorder in alias.go
* More coverage of TestNewEventsFromHTTPRequest
* tests for `to_events(...)`
* Better comment documentation.

Signed-off-by: Mark Mandel <markmandel@google.com>
@markmandel markmandel force-pushed the feature/batch-event-util branch from 4179b1a to 0cfc8f4 Compare January 18, 2023 23:31
@markmandel
Copy link
Contributor Author

Just gently bumping the PR, to see if there are any additional changes you would like made?

@cloudevents cloudevents deleted a comment from cncfcloudevents Jan 23, 2023
@duglin
Copy link
Contributor

duglin commented Jan 23, 2023

LGTM but I defer to @lionelvillard for the final decison

@lionelvillard
Copy link
Contributor

I added a small comment.

@lionelvillard lionelvillard merged commit 885b9ca into cloudevents:main Jan 26, 2023
@markmandel markmandel deleted the feature/batch-event-util branch January 26, 2023 18:13
@ghstahl
Copy link

ghstahl commented Mar 5, 2023

@markmandel

Can we extend this to the type Client interface.

we have

// Send will transmit the given event over the client's configured transport.
Send(ctx context.Context, event event.Event) protocol.Result

would like to have the following added.

// Send will transmit the given event over the client's configured transport.
SendEvents(ctx context.Context, events []event.Event) protocol.Result

I build out an http protocol where I add ApiKey, basicauth, bearer token stuff and pass it to.

	c, err := cloudevents_sdk_go.NewClient(p, cloudevents_sdk_go.WithTimeNow(),
		cloudevents_sdk_go.WithUUIDs(),
	)

and then call.

c.Send(...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants