-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature/multi_tenancy] Add ifSeqNo and ifPrimaryTerm for update concurrency #2605
[Feature/multi_tenancy] Add ifSeqNo and ifPrimaryTerm for update concurrency #2605
Conversation
87f8223
to
1cf4c43
Compare
Signed-off-by: Daniel Widdis <widdis@gmail.com>
1cf4c43
to
aa83319
Compare
Signed-off-by: Daniel Widdis <widdis@gmail.com>
aa83319
to
4cc2070
Compare
// TODO need to return SEQ_NO here | ||
// If document doesn't exist, increment and return highest seq no ever seen, but we would have to track seqNo here | ||
// If document never existed, return -2 (unassigned) for seq no (probably what we have to do here) | ||
// If document exists, increment and return SEQ_NO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case of delete, why does seq no be updated? can we safely delete document as long as seq no matches?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arjunkumargiri This is somewhat of an artifact of how "soft deletes" work in OpenSearch.
When you call the Delete API, it only marks the document as deleted in the metadata but it doesn't actually remove it until the segment is merged. It's actually still available in search results unless you use the refresh
parameter.
So it still tracks the seq_no
as a change to the document for replication/tracking purposes until the document actually goes away on a merge, at which point I assume that tracking disappears.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can we distinguish between 'If the document doesn't exist' and 'If the document never existed'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can we distinguish between 'If the document doesn't exist' and 'If the document never existed'?
We can only do so on OpenSearch for some time period following the deletion.
However, this TODO is in the context of other data stores, and I'm not sure we can, or that we want to.
My intent in writing these comments was to document OpenSearch behavior, specifically:
- The
seq_no
is incremented when a document is deleted - The
seq_no
is maintained in the segment metadata for at least as long as it takes for all merge and indexing and replication operations to safely complete, as well as remaining part of the translog. - At some point, depending on usage of the cluster, the translog is eventually trimmed, and the reference to the document's
seq_no
is deleted.
From a practical standpoint, the only difference would be seen if someone created a new document with the same document ID as a previously deleted document, within the time period that the old seq_no
remains. In that case the value would be a number higher than 0.
I'd assume that a reasonable implementation for a deletion would be to increment and return the seq_no
, but then not track it any more, so the next creation would start again at 0. THat would match current behavior except in the rare edge case of deleting and recreating the same doc-id.
I'm not clear what the action item is? The code currently throws an exception and returns an error to the user; wouldn't it be exactly the same error? Are you referring to per-tenant keys or something else? |
So currently Optype is Index by default. But for config key creation, we want |
Got it... I'm not sure we want to use "OpType" generaly. "Upsert" is the standard language here. We probably want to implement conditional writes on the DynamoDB version. Let's align on a name for it... will track separately. |
a421f49
into
opensearch-project:feature/multi_tenancy
|
Description
Tracks SeqNo and PrimaryTerm used for strong concurrency for updates.
Still TODO (@arjunkumargiri will do in another PR):
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.