Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client Encryption: Adds fix to allow partition key path and id to be part of client encryption policy. #3211

Merged
merged 18 commits into from
Jun 9, 2022

Conversation

kr-santosh
Copy link
Contributor

@kr-santosh kr-santosh commented May 23, 2022

Description

The PR brings in the following changes.

  • Allow support to include partition key path and id during encryption policy creation (only deterministic encryption type) and bumps up the format version.

Type of change

Please delete options that are not relevant.

  • [] New feature (non-breaking change which adds functionality)
  • [] This change requires a documentation update

Closing issues

To automatically close an issue: closes #IssueNumber

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow the required format: "[Internal] Category: (Adds|Fixes|Refactors|Removes) Description"

Internal should be used for PRs that have no customer impact. This flag is used to help generate the changelog to know which PRs should be included. Examples:
Diagnostics: Adds GetElapsedClientLatency to CosmosDiagnostics
PartitionKey: Fixes null reference when using default(PartitionKey)
[v4] Client Encryption: Refactors code to external project
[Internal] Query: Adds code generator for CosmosNumbers for easy additions in the future.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow the required format: "[Internal] Category: (Adds|Fixes|Refactors|Removes) Description"

Internal should be used for PRs that have no customer impact. This flag is used to help generate the changelog to know which PRs should be included. Examples:
Diagnostics: Adds GetElapsedClientLatency to CosmosDiagnostics
PartitionKey: Fixes null reference when using default(PartitionKey)
[v4] Client Encryption: Refactors code to external project
[Internal] Query: Adds code generator for CosmosNumbers for easy additions in the future.

@kr-santosh kr-santosh changed the title Add support for new PartitionKey Constructor and allow partition key path and id to be part of client encryption policy. PartitionKey: Adds new PartitionKey constructor and allow partition key path and id to be part of client encryption policy. May 23, 2022
@kr-santosh kr-santosh changed the title PartitionKey: Adds new PartitionKey constructor and allow partition key path and id to be part of client encryption policy. PartitionKey: Adds new PartitionKey constructor and allows partition key path and id to be part of client encryption policy. May 23, 2022
@kr-santosh kr-santosh marked this pull request as ready for review May 23, 2022 09:28
@kr-santosh kr-santosh requested a review from abhijitpai May 23, 2022 11:25
Copy link
Member

@FabianMeiswinkel FabianMeiswinkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My 2 cents: Don't even try to support non-strings as partition keys. numbers in json are evil (neither range nor precision are clearly standardized - so supported ranges/precisions depend on the json parsers/libraries used - and this can cause hard to diagnose and mitigate issues). The good thing for end-to-end encryption is that all workloads are green-field - no scenario where a customer has TBs of data already. I think we should simply only support strings, instead of hacking something together that might work in most scenarios but doesn't solve the underlying problem that json doesn't have a clear definition for numbers - that solves the encryption-specific challenge of having to know the type and prevents customers form running into the other issues with numbers as PK later.
To be honest only supporting strings as PK is what we should have done across the board - too late now, because it is a breaking change. But with encryption we have the benefit of a "fresh start" for every workload - so let's do it right this time.

@kr-santosh kr-santosh changed the title PartitionKey: Adds new PartitionKey constructor and allows partition key path and id to be part of client encryption policy. Client Encryption: Adds fix to allow partition key path and id to be part of client encryption policy. May 25, 2022
@kr-santosh kr-santosh marked this pull request as ready for review May 26, 2022 04:09
@kr-santosh kr-santosh closed this May 26, 2022
@kr-santosh kr-santosh reopened this May 26, 2022
@kr-santosh
Copy link
Contributor Author

Closed by mistake :)

@kr-santosh
Copy link
Contributor Author

My 2 cents: Don't even try to support non-strings as partition keys. numbers in json are evil (neither range nor precision are clearly standardized - so supported ranges/precisions depend on the json parsers/libraries used - and this can cause hard to diagnose and mitigate issues). The good thing for end-to-end encryption is that all workloads are green-field - no scenario where a customer has TBs of data already. I think we should simply only support strings, instead of hacking something together that might work in most scenarios but doesn't solve the underlying problem that json doesn't have a clear definition for numbers - that solves the encryption-specific challenge of having to know the type and prevents customers form running into the other issues with numbers as PK later. To be honest only supporting strings as PK is what we should have done across the board - too late now, because it is a breaking change. But with encryption we have the benefit of a "fresh start" for every workload - so let's do it right this time.

Fixed.

@abhijitpai
Copy link
Contributor

abhijitpai commented May 26, 2022

    public IEnumerable<ClientEncryptionIncludedPath> IncludedPaths

Please add a comment on this class, something like: 'The <see cref=ClientEncryptionPolicy should be initialized with policyFormatVersion 2 if "id" or properties that are part of the partition key need to be encrypted'. Also we need to mention that all PK property values need to be JSON strings. Goal is that this documentation should suffice to let customers know how to use the functionality. Ideally add some examples too where possible.


Refers to: Microsoft.Azure.Cosmos/src/Resource/Settings/ClientEncryptionPolicy.cs:40 in cba1c4d. [](commit_id = cba1c4d, deletion_comment = False)

Copy link
Contributor

@abhijitpai abhijitpai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@kr-santosh kr-santosh requested review from j82w and abhijitpai June 1, 2022 09:55
Copy link
Member

@ealsur ealsur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kirankumarkolli Please review public API change

Copy link
Member

@FabianMeiswinkel FabianMeiswinkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - looks good to me now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants