Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for parameterized SQL queries for CosmosDB #18332

Closed
lmorchard opened this issue Jun 7, 2022 · 28 comments · Fixed by #18577
Closed

Add support for parameterized SQL queries for CosmosDB #18332

lmorchard opened this issue Jun 7, 2022 · 28 comments · Fixed by #18577
Assignees
Labels
Client This issue points to a problem in the data-plane of the library. Cosmos customer-reported Issues that are reported by GitHub users external to the Azure organization. feature-request This issue requires a new behavior in the product in order be resolved. needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team

Comments

@lmorchard
Copy link

Feature Request

Azure Cosmos DB supports queries with parameters.

However, ContainerClient.NewQueryItemsPager in azcosmos only supports a query string without accompanying parameters.

It would be handy to be able supply parameters to mitigate SQL injection.

@ghost ghost added needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. customer-reported Issues that are reported by GitHub users external to the Azure organization. question The issue doesn't require a change to the product in order to be resolved. Most issues start as that labels Jun 7, 2022
@jhendrixMSFT jhendrixMSFT added Cosmos feature-request This issue requires a new behavior in the product in order be resolved. labels Jun 8, 2022
@ghost ghost removed the needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. label Jun 8, 2022
@jhendrixMSFT jhendrixMSFT removed the question The issue doesn't require a change to the product in order to be resolved. Most issues start as that label Jun 8, 2022
@RickWinter RickWinter added the Client This issue points to a problem in the data-plane of the library. label Jun 8, 2022
@ghost ghost added the needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team label Jun 8, 2022
@ealsur
Copy link
Member

ealsur commented Jun 8, 2022

@RickWinter what is the guidance for Parametrized Queries? Is there a type that can be leveraged from azcore?

We could roll our own but I wonder if other SDKs would benefit from them.

@RickWinter
Copy link
Member

We do not have a parametrized query pattern or type in azcore. I agree it makes sense to have a solution that could be used by other services. Do you have a proposed solution you want to kick start the design-discussion with?

cc: @JeffreyRichter @jhendrixMSFT

@JeffreyRichter
Copy link
Member

@ealsur should consider this

@ItalyPaleAle
Copy link

+1 on this. It's currently a blocker to migrate Dapr to the azcosmos SDK.

@ealsur
Copy link
Member

ealsur commented Jul 11, 2022

@ItalyPaleAle can you clarify how is this a blocker? Also this is mostly a security feature, it all ends up being converted to a string query.

EDIT: The previously linked library does have parameter support

@ItalyPaleAle
Copy link

Hi @ealsur the previous library (a8m/documentdb) does have parameter support. We need to maintain compatibility with the current capabilities (@RyanLettieri is working on this migration).

As a security feature, it's important because it allows preventing (No-)SQL injections.

@ealsur
Copy link
Member

ealsur commented Jul 11, 2022

@RickWinter @JeffreyRichter currently the query method supports only string:

NewQueryItemsPager(query string, partitionKey PartitionKey, o *QueryOptions) *runtime.Pager[QueryItemsResponse]

The Cosmos DB Service already supports queries with parameters on the REST API level: https://docs.microsoft.com/en-us/rest/api/cosmos-db/query-documents#body

Option 1

Leave the public surface API as is (using string) but create a struct Query with the query text and parameters (similar to https://github.com/a8m/documentdb/blob/master/query.go) and implement the Stringer interface.

The pro is that we don't change the public surface, the con is that the Stringer implementation makes it so the SDK needs to do parameter replacing in order to construct the final query text, when it's something that it can defer to the service on Option 2.

Option 2

Change the public surface to always take a Query:

NewQueryItemsPager(query Query, partitionKey PartitionKey, o *QueryOptions) *runtime.Pager[QueryItemsResponse]

The Query will always have a query text and has optional parameters.

The pro is that this maps with the service payload, no need to do parameter replacement on the client. The con is that we change the public surface (but this SDK is in beta anyway).

Thoughts?

@JeffreyRichter
Copy link
Member

Do Option #1; not Option #2.

@ItalyPaleAle
Copy link

Can I propose option #3 as an alternative that is backwards-compatible too? Keep query as string, and add the ability to pass parameters as QueryOption values.

@ealsur
Copy link
Member

ealsur commented Jul 11, 2022

Something like this @ItalyPaleAle ?

Option 3

Add a QueryParameter type like:

type QueryParameter struct {
	Name  string
	Value string
}

Into the QueryOptions like:

type QueryOptions struct {
	.... existing options...

        QueryParameters QueryParameter[]
}

@JeffreyRichter What do you think about this one? Do you still recommend Option 1? I am fine any way we decide.

@jhendrixMSFT
Copy link
Member

Or this.

// QueryOptions includes options for query operations on items.
type QueryOptions struct {
	.... existing options ....
	// Parameters contains optional HTTP query parameters to include with the request.
	Parameters url.Values
}

@ItalyPaleAle
Copy link

IMHO I wouldn't use url.Values because they are not meant to be serialized in a URL. But yes. That's an option.

No strong preference on option 1 or 3. Just wanted to toss it on the table too :)

@jhendrixMSFT
Copy link
Member

Maybe I'm missing something here, but that's what the Encode() method is for.

@ealsur
Copy link
Member

ealsur commented Jul 11, 2022

The query parameters are not sent in the REST API URL, but as part of the body: https://docs.microsoft.com/en-us/rest/api/cosmos-db/query-documents#body

@ItalyPaleAle
Copy link

There's also option #4 now that I think of it, which is to use generics to pass either a string or a struct :)

@JeffreyRichter
Copy link
Member

So the method would take both a string and values in an options struct? If both were passed, what should the method do? If this is what is being proposed, I'm against it.

If the values is ultimately turned into a string which the current operation expects, then we could add a new method that calls the method we already have

@JeffreyRichter
Copy link
Member

Generics is not a potential fix for this. The code inside the method would have to use reflection to make this work.

@ealsur
Copy link
Member

ealsur commented Jul 11, 2022

From the REST API perspective, queries should always be executed as a POST with the query in the body (not url).

Looking at queries from the REST API payload, queries can be simple queries (not parametrized):

{  
  "query": "SELECT * FROM Families f WHERE f.id = 'AndersenFamily' "
}

Or parametrized:

{  
  "query": "SELECT * FROM Families f WHERE f.id = @id AND f.Address.City = @city",  
  "parameters": [  
    {  
      "name": "@id",  
      "value": "AndersenFamily"  
    },  
    {  
      "name": "@city",  
      "value": "Seattle"  
    }  
  ]  
}

The service will do the parameter replacement on their end, there is no need for the client to do the parameter replacement.

The current SDK public API takes a string and what we are currently doing is use that string as the query property in the body. We do not expose a way to set the parameters. That still allows users to execute filtered queries but for those that want more secure parameter handling to avoid injection, we currently don't give them an option.

Currently users can do this:

NewQueryItemsPager("SELECT * FROM Families f WHERE f.id = 'AndersenFamily'", partitionKey , o) 

And we will take the query and construct the body in the request.

Option 1 - Stringer

Assuming we create the Query struct that implements String() and generates the full JSON payload of a parametrized query.

The only problem with this approach is that for the SDK, it's hard to distinguish between a string that is a simple query (first case, like "SELECT * FROM Families f") from the Query string representation which would be a JSON structure when deciding how to construct the REST API payload.

Option 2 - Use only Query

Replace string for Query, it's a breaking change but within the beta.

NewQueryItemsPager(query Query, partitionKey PartitionKey, o *QueryOptions) *runtime.Pager[QueryItemsResponse]

This allows users that want to execute a simple query or a complex (parametrized) query to use the same API.

Option 3 - Parameters as optional

The parameters in the query are in the QueryOptions:

o := &QueryOptions {
    Parameters: []QueryParameter{
        NewQueryParameter("id", "AndersenFamily"),
        NewQueryParameter("city", "Seattle"),
    }
}
NewQueryItemsPager("SELECT * FROM Families f WHERE f.id = @id AND f.Address.City = @city", partitionKey , o) 

The Parameters only are used if the user wants to leverage parametrized queries, any invalid combination (using a parametrized query without parameters or viceversa) is validated by the service, not the client.


I like either Option 2 or 3. Trying to implement Option 1 so far is an issue because when constructing the REST API payload it's hard to distinguish between the 2 scenarios.

@JeffreyRichter
Copy link
Member

I see. What is the value that the service is providing by allowing the "parameters" property in the JSON body? This just lets teh service do the replacement instead of the client? If so, why is this useful?

@ealsur
Copy link
Member

ealsur commented Jul 11, 2022

According to the service docs: https://docs.microsoft.com/en-us/azure/cosmos-db/sql/sql-query-parameterized-queries

Azure Cosmos DB supports queries with parameters expressed by the familiar @ notation. Parameterized SQL provides robust handling and escaping of user input, and prevents accidental exposure of data through SQL injection.

From a REST API caller perspective, I guess it reduces the work a caller needs to do to validate and verify against injection attacks by unifying it as a service operation regardless of the language of the caller.

@ItalyPaleAle
Copy link

ItalyPaleAle commented Jul 11, 2022

@JeffreyRichter For us (Dapr), passing parameters to queries, rather than doing string replacements, is almost a strict requirement as it's the most effective way to prevent SQL injection.

I am not sure how things work internally in Cosmos DB, but for example in the case of databases like Postgres, when queries use parameters the database does not (to my understanding) ever perform a string replacement in the query, instead referencing the parameters when needed. This makes SQL injections physically impossible.

Even if Cosmos DB didn't work that way, and internally in the service it did perform a string replacement, we would still prefer letting Cosmos DB do that so we can trust it's done properly and safely (and if an update were needed, it'd be done automatically for us)

@JeffreyRichter
Copy link
Member

Thanks, I understand the situation better now. So yes, we can add a Parameters property to the QueryOptions structs. It should be a []struct{ name, value string }

@ealsur
Copy link
Member

ealsur commented Jul 11, 2022

Will send a PR tomorrow

@ItalyPaleAle
Copy link

One quick thing: per docs:

Parameter values can be any valid JSON: strings, numbers, Booleans, null, even arrays or nested JSON. Since Azure Cosmos DB is schemaless, parameters aren't validated against any type.

Can you make Value of type any please?

@JeffreyRichter
Copy link
Member

Sure, []struct{ name string, value any} is fine. The SDK will have to JSON serialize each and every value in the slice and serialization may fail returning the error back to the client.

@ealsur
Copy link
Member

ealsur commented Jul 12, 2022

@JeffreyRichter @jhendrixMSFT PR with API View with diff is now ready #18577

@ealsur
Copy link
Member

ealsur commented Jul 13, 2022

Will be included in the 0.3.2 release on the August 9th release train

@lmorchard
Copy link
Author

Also, belated thanks to everyone for implementing my feature request!

@github-actions github-actions bot locked and limited conversation to collaborators Apr 11, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Client This issue points to a problem in the data-plane of the library. Cosmos customer-reported Issues that are reported by GitHub users external to the Azure organization. feature-request This issue requires a new behavior in the product in order be resolved. needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants