Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query: Fixes query plan via service interop to use custom serializer #3154

Merged
merged 9 commits into from
May 3, 2022

Conversation

j82w
Copy link
Contributor

@j82w j82w commented Apr 21, 2022

Pull Request Template

Description

Customer can pass a customer serializer in which can apply and data transformation necessary. For example it could convert a int or double to a string.

Query has 3 different ways to get the query plan.

  1. ServiceInterop.dll which requires Windows x64
  2. Antlr parser
  3. Gateway

The custom serializer is used in query for the following scenarios.

  1. ServiceInterop.dll-> Using a random JsonConvert instead of the standard serialization contract
  2. Antlr parser -> Does not take in parameters so serialization is not necessary. It just looks at query text.
  3. Gateway -> Applies the correct serialization
  4. Executing the query(sending it to gateway or a partition): Applies the correct serialization

Why didn't testing catch this?

There are existing tests which validate this scenario, but they are not calculating the correct expected count. The test assumed the query pipeline would only serialize the SqlQuerySpec once. The implementation serializes it for every page request to the backend and to generate the query plan.

Who is impacted?

Only customers that are running on Windows x64 and have custom serialization that filters on the partition key value only in the query. If the partition key is provided in the request options then it is handled correctly because request option overrides the query text.

Impact:

This could result in the query being sent to the wrong partitions' which would cause it query returning possibly less results than expected because it would be routed to the wrong partition.

Solution:

Based on the current models and contract the serialization logic should not be in the query code. The query pipeline should not know or care about how the parameters are serialized.

To keep this abstraction in place the serialization logic will be moved the CosmosQueryClientCore.cs like all of the other places that currently handle the custom serialization. This keeps all the serialization logic in the same file, and keeps it the same for all the different methods to get the query plan.

This will cause the serialized string to be passed down instead of the SqlQuerySpec. This is a better contract because the service interop only requires the serialized string. All the contracts now match what is actually needed for it to execute getting the query plan. This follows the same model as getting it from gateway or sending it to be executed.

Type of change

Please delete options that are not relevant.

  • [] Bug fix (non-breaking change which fixes an issue)
  • [] New feature (non-breaking change which adds functionality)
  • [] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [] This change requires a documentation update

Closing issues

To automatically close an issue: closes #3153

@j82w j82w changed the title Query: Fixes a bug where custom serializer not used in query plan creation Query: Fixes service interop query plan creation to use custom serializer Apr 21, 2022
@j82w j82w changed the title Query: Fixes service interop query plan creation to use custom serializer Query: Fixes query plan via service interop to use custom serializer Apr 21, 2022
@neildsh
Copy link
Contributor

neildsh commented Apr 21, 2022

Please add more description to explain why this change is required. A good example would be something like this: #3137


In reply to: 1105722954

@neildsh
Copy link
Contributor

neildsh commented Apr 21, 2022

Please add baseline tests that illustrate how this change is used

@neildsh neildsh self-requested a review April 22, 2022 05:56
Copy link
Contributor

@neildsh neildsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🕐

ealsur
ealsur previously approved these changes Apr 22, 2022
Copy link
Member

@ealsur ealsur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, only minor comments

@kr-santosh
Copy link
Contributor

@sboshra Hi Samer could you have a look at it and sign off on the PR.

@j82w j82w dismissed stale reviews from kr-santosh and ealsur via 7153d24 April 29, 2022 14:42
ealsur
ealsur previously approved these changes Apr 29, 2022
Copy link
Contributor

@neildsh neildsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@j82w j82w merged commit 8a0d18a into master May 3, 2022
@j82w j82w deleted the users/jawilley/query/serializerFix branch May 3, 2022 14:54
j82w added a commit that referenced this pull request May 5, 2022
…3154)

Customer can pass a customer serializer in which can apply and data transformation necessary. For example it could convert a int or double to a string. 

Query has 3 different ways to get the query plan.
1. ServiceInterop.dll which requires Windows x64
2. Antlr parser
3. Gateway 

### The custom serializer is used in query for the following scenarios.
1. ServiceInterop.dll-> Using a random [JsonConvert ](https://github.com/Azure/azure-cosmos-dotnet-v3/blob/c8935ac2f864fb829f5d941dda07c74aec86a677/Microsoft.Azure.Cosmos/src/Query/Core/QueryPlan/QueryPartitionProvider.cs#L169)instead of the [standard serialization contract](https://github.com/Azure/azure-cosmos-dotnet-v3/blob/c8935ac2f864fb829f5d941dda07c74aec86a677/Microsoft.Azure.Cosmos/src/Query/v3Query/CosmosQueryClientCore.cs#L123)
2. Antlr parser -> Does not take in parameters so serialization is not necessary. It just looks at query text.
3. Gateway -> Applies the[ correct serialization ](https://github.com/Azure/azure-cosmos-dotnet-v3/blob/c8935ac2f864fb829f5d941dda07c74aec86a677/Microsoft.Azure.Cosmos/src/Query/v3Query/CosmosQueryClientCore.cs#L163)
4. Executing the query(sending it to gateway or a partition):  Applies the[ correct serialization ](https://github.com/Azure/azure-cosmos-dotnet-v3/blob/c8935ac2f864fb829f5d941dda07c74aec86a677/Microsoft.Azure.Cosmos/src/Query/v3Query/CosmosQueryClientCore.cs#L123)

### Why didn't testing catch this?
There are existing[ tests which validate this scenario](https://github.com/Azure/azure-cosmos-dotnet-v3/blob/1d1d4c753cae896e6d96a98ef07a276cf1e4f130/Microsoft.Azure.Cosmos/tests/Microsoft.Azure.Cosmos.EmulatorTests/CosmosItemTests.cs#L839), but they are not calculating the correct expected count. The test assumed the query pipeline would only serialize the SqlQuerySpec once. The implementation serializes it for every page request to the backend and to generate the query plan.

### Who is impacted?
Only customers that are running on Windows x64 and have custom serialization that filters on the partition key value only in the query. If the partition key is provided in the request options then it is handled correctly because request option overrides the query text.

### Impact:
This could result in the query being sent to the wrong partitions' which would cause it query returning possibly less results  than expected because it would be routed to the wrong partition.

### Solution:
Based on the current models and contract the serialization logic should not be in the query code. The query pipeline should not know or care about how the parameters are serialized. 

To keep this abstraction in place the serialization logic will be moved the CosmosQueryClientCore.cs like all of the other places that currently handle the custom serialization. This keeps all the serialization logic in the same file, and keeps it the same for all the different methods to get the query plan. 

This will cause the serialized string to be passed down instead of the SqlQuerySpec. This is a better contract because the service interop only requires the serialized string. All the contracts now match what is actually needed for it to execute getting the query plan. This follows the same model as getting it from gateway or sending it to be executed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Query: Missing results with custom serializer or streams in query definition
5 participants