The StorageNode includes a LogStreamReplicaMetadata RPC that supplies metadata for a log stream replica. The RPC's response uses local low and high watermarks to denote the positions of the first and last log entries stored in the replica. Together, these markers identify the range of logs a stream holds at any moment. Initially, both watermarks are set to {LLSN: 0, GLSN: 0}, indicating an empty log stream. As logs are appended, the watermarks are updated to reflect the stored log range.
A limitation emerges when all logs within a stream are trimmed: both watermarks revert to {LLSN: 0, GLSN: 0}. A newly created log stream and a completely trimmed one then display identical watermark values and cannot be told apart. As a result, users have no way to learn which log sequence numbers (LSNs) were stored in the past, or to predict the LSN of the next log from the current watermark values.
Example for Clarification
Initial State: A new log stream starts with no logs, and both local low and high watermarks are {LLSN: 0, GLSN: 0}.
After Writing Logs: Adding ten logs to the stream adjusts the watermarks to indicate the new range, setting the local low watermark at {LLSN: 1, GLSN: 1} and the high watermark at {LLSN: 10, GLSN: 10}.
After Trimming Logs: Removing the first five logs changes the watermarks to {LLSN: 6, GLSN: 6} for the low and {LLSN: 10, GLSN: 10} for the high, indicating the presence of logs 6 through 10.
After Complete Trimming: Trimming all logs from the stream resets both watermarks to {LLSN: 0, GLSN: 0}. This hides both the history of the stream and the position the next log will take.
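The four states above can be reproduced with a small toy model. This is only a sketch of the current semantics as described in this issue, not the StorageNode implementation: the class `CurrentSemanticsStream`, its methods, and the `LSN` tuple are hypothetical names, and LLSN and GLSN are assumed to advance together as they do in the example.

```python
from typing import List, NamedTuple, Tuple


class LSN(NamedTuple):
    llsn: int
    glsn: int


EMPTY = LSN(0, 0)


class CurrentSemanticsStream:
    """Toy model (hypothetical): watermarks are the first and last stored positions."""

    def __init__(self) -> None:
        self._first = 1  # position of the oldest stored log
        self._next = 1   # position the next appended log would take

    def append(self, count: int) -> None:
        self._next += count

    def trim_through(self, llsn: int) -> None:
        # Remove every log at or below llsn; never move past the write position.
        self._first = max(self._first, min(llsn + 1, self._next))

    def watermarks(self) -> Tuple[LSN, LSN]:
        if self._first == self._next:
            # Nothing is stored, so both watermarks collapse to {0, 0}.
            return EMPTY, EMPTY
        last = self._next - 1
        return LSN(self._first, self._first), LSN(last, last)


def current_walkthrough() -> List[Tuple[LSN, LSN]]:
    """Return the watermarks after each step of the example."""
    stream = CurrentSemanticsStream()
    states = [stream.watermarks()]      # initial state
    stream.append(10)
    states.append(stream.watermarks())  # after writing ten logs
    stream.trim_through(5)
    states.append(stream.watermarks())  # after trimming logs 1 through 5
    stream.trim_through(10)
    states.append(stream.watermarks())  # after complete trimming
    return states
```

The first and last entries returned by `current_walkthrough()` compare equal, which is exactly the ambiguity this issue describes.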
Proposal for Improvement
To overcome this issue, I suggest redefining the local low and high watermarks as follows:
Local Low Watermark: Should indicate the position immediately after the last trimmed log, which is the position of the oldest stored log whenever the stream is not empty. For a new log stream, it would be 1. After all logs are trimmed in the example above, it would be 11, the starting position for future logs.
Local High Watermark: Should mark the position immediately after the last stored log, which is where the next log entry will be placed. In the example above, it would be 11 once ten logs are written and would remain 11 after complete trimming, so it equals the low watermark whenever the stream holds no logs.
This redefinition guarantees that:
A local low and high watermark of {LLSN: 1, GLSN: 1} signals a newly initiated log stream.
Identical local low and high watermarks, aside from {LLSN: 1, GLSN: 1}, indicate a fully trimmed log stream.
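As a rough illustration of the redefined watermarks, the sketch below tracks them as a half-open range and classifies a stream from the two values alone. The names `ProposedSemanticsStream` and `classify` are hypothetical, and a single integer stands in for both LLSN and GLSN because they are equal in the example; in a real deployment the two may differ.

```python
from typing import List, Tuple


class ProposedSemanticsStream:
    """Toy model (hypothetical): low is the position after the last trimmed log,
    high is the position the next log will take."""

    def __init__(self) -> None:
        self._low = 1
        self._high = 1

    def append(self, count: int) -> None:
        self._high += count

    def trim_through(self, lsn: int) -> None:
        # Remove every log at or below lsn; low never passes high.
        self._low = max(self._low, min(lsn + 1, self._high))

    def watermarks(self) -> Tuple[int, int]:
        return self._low, self._high


def classify(low: int, high: int) -> str:
    """Interpret a pair of watermarks under the proposed semantics."""
    if low == high == 1:
        return "new"
    if low == high:
        return "fully trimmed"
    return "non-empty"


def proposed_walkthrough() -> List[Tuple[int, int, str]]:
    """Replay the example: new stream, ten appends, partial trim, full trim."""
    stream = ProposedSemanticsStream()
    steps = []
    for action in (None, ("append", 10), ("trim", 5), ("trim", 10)):
        if action is not None:
            kind, arg = action
            if kind == "append":
                stream.append(arg)
            else:
                stream.trim_through(arg)
        low, high = stream.watermarks()
        steps.append((low, high, classify(low, high)))
    return steps
```

Under this model the number of stored logs is simply `high - low`, and the position of the next log is always `high`, even when the stream is empty.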
Impact
Implementing these changes will substantially improve the clarity and utility of our log stream data. It will enable a more intuitive understanding of a log stream's current status and history, especially concerning trimming operations.
Alternatives
We could keep the current semantics of the local low and high watermarks and add a field to the LogStreamReplicaMetadataDescriptor that reports the next log sequence number to be stored. With this approach, the local low and high watermarks of both a new and a fully trimmed log stream would remain {LLSN: 0, GLSN: 0}, while the next log sequence number would start at {LLSN: 1, GLSN: 1} for a new stream and be {LLSN: 11, GLSN: 11} after the ten logs in the example are fully trimmed. The new field alone would then distinguish new streams from fully trimmed ones. Although this requires adding a field to the LogStreamReplicaMetadata response, it preserves the current watermark semantics.
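A minimal sketch of this alternative follows. The `ReplicaMetadata` record and its `next_lsn` field are hypothetical stand-ins for the descriptor and the proposed extra field, not the actual message definition, and a single integer again stands in for both LLSN and GLSN.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaMetadata:
    local_low: int   # first stored position, 0 when the stream is empty
    local_high: int  # last stored position, 0 when the stream is empty
    next_lsn: int    # hypothetical new field: position of the next log


def describe(appended: int, trimmed_through: int) -> ReplicaMetadata:
    """Build the metadata for a stream that has had `appended` logs written
    and every log at or below `trimmed_through` removed."""
    next_lsn = appended + 1
    first = min(trimmed_through, appended) + 1
    if first == next_lsn:
        # Empty stream: watermarks keep their current {0, 0} meaning.
        return ReplicaMetadata(0, 0, next_lsn)
    return ReplicaMetadata(first, appended, next_lsn)


def is_new(meta: ReplicaMetadata) -> bool:
    return meta.local_low == 0 and meta.local_high == 0 and meta.next_lsn == 1


def is_fully_trimmed(meta: ReplicaMetadata) -> bool:
    return meta.local_low == 0 and meta.local_high == 0 and meta.next_lsn > 1
```

Existing readers of the watermarks would see no change in this variant; only clients that need to tell the two empty states apart would have to read the extra field.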
Request for Comments
I invite other developers to offer feedback on this proposal, focusing on potential implications or enhancements that could further refine our approach to managing and representing log stream watermarks.