ExecuteStoredProcedureStreamAsync is only half-Stream #1059
Comments
Stream APIs are for scenarios where the payload is given to the middle tiers and is just streamed to the Cosmos service (no serialization/deserialization in the SDK/service except for indexing). With a SPROC, the stream payload does need to be deserialized at least once by the SPROC on the service, which then processes those operations. So some deserialization and serialization still exists. In 3.4.0 Cosmos included support for TransactionalBatch (#965), which enables pure stream-based batching of operations. Can you please take a look and see if it is a better fit for your scenarios?
My goal: Execute a stored procedure without the SDK needing to serialize or deserialize the payload. The existing overload didn't accept a stream - only a `dynamic[]`. Once it's out of the SDK's hands, I don't care what Cosmos does with it. I just don't want the SDK to serialize/deserialize it. ... Separately, regarding `TransactionalBatch` - which did bring me to create this issue & PR. In our software, when we add to a "batch" that we're preparing, we immediately serialize the payload. Later, when the batch is executed, we've basically got the already-serialized streams in hand.
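To make the serialize-on-add pattern concrete, here is a minimal sketch (the class and method names are hypothetical, not from our actual codebase): each operation is turned into JSON the moment it joins the batch, and executing the batch later just stitches the already-serialized fragments into the sproc's single array parameter by string concatenation, with no re-serialization or deserialization on the client.

```javascript
// Hypothetical batching helper: serialize each operation immediately on
// add, then build the "[op1,op2,...]" payload by pure string concatenation.
class PendingBatch {
  constructor() {
    this.serializedOps = [];
  }

  // Serialize now; from here on the client only holds strings/bytes.
  add(operation) {
    this.serializedOps.push(JSON.stringify(operation));
  }

  // Produce the array payload the stored procedure expects, without ever
  // parsing the individual operations again.
  toParameterPayload() {
    return "[" + this.serializedOps.join(",") + "]";
  }
}
```

This is exactly the kind of payload we would like to hand to a `Stream`-accepting overload untouched.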
Regarding use cases other than our own: while it's true that parameters are often scalar, they're also often documents (without any relation to Batch operations). Example: https://docs.microsoft.com/en-us/azure/cosmos-db/how-to-write-stored-procedures-triggers-udfs#stored-procedures (second example) ... The control over serialization/deserialization for things like this would still be useful.
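A minimal sketch of that shape of stored procedure - modeled on the "create a document" example in the linked docs - where the parameter is an entire document rather than a scalar (`getContext()` is provided by the Cosmos server-side JavaScript runtime):

```javascript
// Stored procedure whose single parameter is a whole document.
// getContext() comes from the Cosmos DB server-side runtime.
function createToDoItem(itemToCreate) {
  var context = getContext();
  var container = context.getCollection();

  // The document parameter is passed straight through to createDocument.
  var accepted = container.createDocument(
    container.getSelfLink(),
    itemToCreate,
    function (err, itemCreated) {
      if (err) throw new Error("Error: " + err.message);
      context.getResponse().setBody(itemCreated);
    }
  );
  if (!accepted) throw new Error("The item was not accepted.");
}
```

With the current SDK, `itemToCreate` has to travel through `dynamic[]` and `CosmosSerializer` even if the caller already has it serialized.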
Stream APIs make full sense if the required payload comes to the application layer (WebAPI etc...) already as a stream. If the application is preparing the stream, then serialization is happening at least once on the client (in the SDK or the application). We are not against having a stream input API for SPROC. But best is to address #1057. /cc: @ealsur, @abhijitpai
I would love for #1057 to be addressed. But I do think that without this issue being fixed, the SDK is inconsistent. Everywhere else a document might be sent as part of the payload, the methods take a stream. Except here. Adding this would bring it in line with the rest.
@joshlang - what is the processBatch.js you wrote intended for - are you trying to get higher throughput by trading off latency of individual calls? We have bulk functionality in the SDK (blog here: https://medium.com/@Ealsur/azure-cosmos-db-net-sql-sdk-bulk-support-9dea1d0ee023) which automatically groups operations into server requests and attempts to achieve the same. If you are instead trying to achieve atomic multi-document transactions, do you have a bound on the number and size of the operations - what are these bounds? In the absence of such bounds, calls would hit some other limit even if we can figure out a way to bump up the 100-operation limit.
@abhijitpai Transactions. The bound is the 2MB payload limit. What I'd like is predictability in transactions. It's one of the reasons I was so excited to see `TransactionalBatch`. I suppose that 100 documents is... well, predictable. The way I see it, stored procedures have an unpredictable ability to do work before saying "Welp, that's all for you this time!". But whatever that limit is, it seemed to handle whatever we threw at it within the 2MB payload limit without any problems. My perception changed when I saw the 100-operation limit. In many ways, 100 documents is more predictable than 2MB. Maybe it's the right way to go (if you need limits). Selfishly, I'd prefer that only the 2MB limit existed. Even more selfishly, I'd prefer that there were no limits at all. But knowing the feasibility of this in your back-end is beyond me.
* #1059 - Introducing ExecuteStoredProcedureStreamAsync overload accepting a stream as a parameter
* Update changelog to add draft pull request #1061
* Fixing ExecuteTestWithStreamParameter to use the streaming version in both calls.
* Fixing a broken test. Serialized parameters must be an array.
* Executed "testbaseline.cmd /update" to update DotNetSDKAPI.json
* Adding unit tests for stored procedure calls to cover some edge cases: unparseable JSON, not-an-object, "null" literal, and array-like objects
* Correction to docs: an Object is accepted (but ignored if not array-like) by the stored procedure executor.
* Additional edge cases covered in unit tests
* Adding ExecuteStoredProcedureStreamAsync to ScriptsInlineCore
tldr: I would like an overload of `ExecuteStoredProcedureStreamAsync` which takes a `Stream` as a parameter.

`Scripts.ExecuteStoredProcedureStreamAsync` returns a ResponseMessage (aka Stream), so we can handle it however we wish. Streams are handy because we can control serialization (and for other reasons). However, it takes `dynamic[]` as a parameter representing the parameters to the stored procedure. In ScriptsCore, this is converted to a stream using `this.clientContext.CosmosSerializer.ToStream<dynamic[]>(parameters);`. In other words, it uses CosmosSerializer. While I suppose this is fine, I would like more control over when things get serialized.

From a higher-level point of view, I would prefer that my entire experience with Cosmos is Stream-based, and that when using Stream methods, there's "no such thing" as serialization. In other words, Stream-based methods should never serialize nor deserialize "user" objects (but can still serialize internal structures, e.g. the change feed's `{documents:[...]}`).
Challenges with adding an overload to ExecuteStoredProcedureStreamAsync:

`dynamic[] parameters` and `Stream streamPayload` cannot be in the same position, because `null` can be passed in, and that would make this a breaking change.

Mitigations:
If the user serializes something that's not an array, just allow it and let the server blow up with a BadRequest. No big deal.
As to the parameter positions:

Option 1: Change the positions. It seems "awkward" but we can do `string storedProcedureId, Stream streamPayload, PartitionKey partitionKey`, which is safely differentiated from `string storedProcedureId, PartitionKey partitionKey, dynamic[] parameters`.

Option 2: Change the name. But I can't think of a non-awkward alternative name for `ExecuteStoredProcedureStreamAsync`.

I'll post a PR for this. But I realize I might be stepping on some design-toes, so feel free to consider the PR as just a further explanation to this issue.
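For the "let the server blow up with a BadRequest" mitigation, here is an assumed model of the server-side behavior (the function name and response shapes are hypothetical, not actual Cosmos service code): the parameter payload is parsed once on the service, and anything that isn't a JSON array is rejected there, so the SDK never needs to inspect the stream.

```javascript
// Hypothetical model of server-side parameter validation (NOT actual
// Cosmos gateway code): parse the raw payload once, reject non-arrays.
function parseSprocParameters(payloadJson) {
  let parsed;
  try {
    parsed = JSON.parse(payloadJson);
  } catch (e) {
    return { status: 400, error: "Unparseable JSON" };
  }
  if (!Array.isArray(parsed)) {
    // The SDK never inspected the stream; the server is the one that
    // "blows up" with BadRequest.
    return { status: 400, error: "Parameters must be a JSON array" };
  }
  return { status: 200, parameters: parsed };
}
```

This keeps all validation out of the client, which is the whole point of a pass-through `Stream` overload.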