-
Notifications
You must be signed in to change notification settings - Fork 458
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add closed timestamps to architecture docs #11762
Conversation
✔️ Netlify Preview 🔨 Explore the source changes: 07874e6 🔍 Inspect the deploy log: https://app.netlify.com/sites/cockroachdb-docs/deploys/6155e87816d94e00081b7ce1 😎 Browse the preview: https://deploy-preview-11762--cockroachdb-docs.netlify.app |
Hey Andrei and Nathan, Added you both to review as you are the authors of the epic blog post from which much of this info was taken. Feel free to unassign or punt me to other reviewers if needed ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rmloveland)
v21.2/architecture/transaction-layer.md, line 47 at r1 (raw file):
CockroachDB provides the following types of reads: - Strongly-consistent (a.k.a. "non-stale") reads: These are the default and most common type of read. For example, read-write transactions will always perform strongly-consistent reads. These reads go through the [leaseholder](replication-layer.html#leases) and always return data that is correct and up to date.
If you're not gonna go into it more, I would strike the part about "read-write" transactions, because you cast doubt about what other transactions do. The thing I'd say in this paragraph is that these reads see all writes performed by writers that committed before the reading transaction started.
v21.2/architecture/transaction-layer.md, line 48 at r1 (raw file):
They can only be used in read-only transactions.
I'd make it concrete and say something about as of system time
here.
v21.2/architecture/transaction-layer.md, line 101 at r1 (raw file):
As part of providing serializability, whenever an operation reads a value, we store the operation's timestamp in a timestamp cache, which shows the high-water mark for values being read. The timestamp cache is a data structure used to store information about the reads performed by [leaseholders](replication-layer.html#leases). This is used to ensure that once some transaction *t1* reads a row, another transaction *t2* that comes along and tries to write to that row will be ordered after *t1*, thus ensuring a serial order of transactions, a.k.a. serializability.
nicely put
v21.2/architecture/transaction-layer.md, line 105 at r1 (raw file):
Whenever a write occurs, its timestamp is checked against the timestamp cache. If the timestamp is less than the timestamp cache's latest value, we attempt to push the timestamp for its transaction forward to a later time. Pushing the timestamp might cause the transaction to restart in the second phase of the transaction (see [read refreshing](#read-refreshing)). ### Closed timestamps
consider linking to the blog post somewhere in this section
v21.2/architecture/transaction-layer.md, line 107 at r1 (raw file):
advances forward by some target interval behind the current time
perhaps "advances continuously, and lag the current time by a target interval".
v21.2/architecture/transaction-layer.md, line 107 at r1 (raw file):
it rejects the write
"rejects" sounds harsh if you don't qualify it. Consider saying that a transaction trying to write below the closed ts is forced to change its timestamp (which might result in a retry error), and link to "refreshing" if we talk about that anywhere.
v21.2/architecture/transaction-layer.md, line 115 at r1 (raw file):
Once the follower replica has applied the abovementioned Raft commands, it has all the data necessary to serve reads with timestamps less than or equal to the closed timestamp. Note that closed timestamps are durable even if the leaseholder changes; in other words, they are preserved across [lease transfers](replication-layer.html#epoch-based-leases-table-data).
Unfortunately, they currently are not "durable" in the usual meaning of the word, which is "persisted to disk; recoverable after node restart". Hopefully they will be, one day.
I'd say that closed timestamps are valid even as leaseholders change; a new leaseholder will not break a promise made by an old leaseholder.
f523c3f
to
9795596
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review Andrei! Updated based on your comments, thanks for the good clarifications & suggestions
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @nvanbenschoten)
v21.2/architecture/transaction-layer.md, line 47 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
If you're not gonna go into it more, I would strike the part about "read-write" transactions, because you cast doubt about what other transactions do. The thing I'd say in this paragraph is that these reads see all writes performed by writers that committed before the reading transaction started.
Updated to remove the "read-write" comment and use that language
v21.2/architecture/transaction-layer.md, line 48 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
They can only be used in read-only transactions.
I'd make it concrete and say something about
as of system time
here.
Updated to "read-only transactions that use AOST (link)"
v21.2/architecture/transaction-layer.md, line 101 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
nicely put
Thanks! Though I think it's mostly adapted from your epic blog post :-)
v21.2/architecture/transaction-layer.md, line 105 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
consider linking to the blog post somewhere in this section
Added "For more information about the implementation of closed timestamps and Follower Reads, see our blog post An Epic Read on Follower Reads" to the end of the section
v21.2/architecture/transaction-layer.md, line 107 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
advances forward by some target interval behind the current time
perhaps "advances continuously, and lag the current time by a target interval".
Updated to say that
v21.2/architecture/transaction-layer.md, line 107 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
it rejects the write
"rejects" sounds harsh if you don't qualify it. Consider saying that a transaction trying to write below the closed ts is forced to change its timestamp (which might result in a retry error), and link to "refreshing" if we talk about that anywhere.
Updated to "If the range receives a write at a timestamp less than its closed timestamp, the write is forced to change its timestamp, which might result in a retry error (see read refreshing)."
v21.2/architecture/transaction-layer.md, line 115 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
Unfortunately, they currently are not "durable" in the usual meaning of the word, which is "persisted to disk; recoverable after node restart". Hopefully they will be, one day.
I'd say that closed timestamps are valid even as leaseholders change; a new leaseholder will not break a promise made by an old leaseholder.
Revised based on this comment to say
"Note that closed timestamps are valid even if the leaseholder changes, since they are preserved across lease transfers. Once a lease transfer occurs, the new leaseholder will not break the closed timestamp promise made by the old leaseholder."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly had minor terminology/clarification questions around "timestamp", but otherwise lgtm!
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @nvanbenschoten, and @rmloveland)
v21.2/follower-reads.md, line 25 at r2 (raw file):
"closed timestamp"
Should this be quotes or italics? You use the latter below
v21.2/architecture/replication-layer.md, line 76 at r2 (raw file):
a.k.a.
Nit: aka
v21.2/architecture/replication-layer.md, line 76 at r2 (raw file):
incurring networking round trips
Should this be something like "incurring latency from networking round trips"?
v21.2/architecture/transaction-layer.md, line 47 at r2 (raw file):
a.k.a.
Nit: aka
v21.2/architecture/transaction-layer.md, line 47 at r2 (raw file):
up to date
Nit: up-to-date
v21.2/architecture/transaction-layer.md, line 101 at r2 (raw file):
a.k.a.
Nit: aka
v21.2/architecture/transaction-layer.md, line 107 at r2 (raw file):
"closed timestamp"
Ditto: quotes or italics?
v21.2/architecture/transaction-layer.md, line 107 at r2 (raw file):
retry error
transaction retry error?
v21.2/architecture/transaction-layer.md, line 107 at r2 (raw file):
below that timestamp
Is "below" a common timestamp term? (Disregard if so.)
If not, would "before" (or "earlier than", "prior to") that timestamp be clearer?
v21.2/architecture/transaction-layer.md, line 107 at r2 (raw file):
a timestamp less than its closed timestamp
Ditto - "less" also seems to have a different meaning than "below"; would "a timestamp with a lower value than" be too much?
v21.2/architecture/transaction-layer.md, line 109 at r2 (raw file):
t will not accept writes below that timestamp.
Ditto
v21.2/architecture/transaction-layer.md, line 111 at r2 (raw file):
closed timestamps
Necessary to hyphenate? Not sure
v21.2/architecture/transaction-layer.md, line 111 at r2 (raw file):
piggy-backing
Happy to report that "piggybacking" is one (1) word!
v21.2/architecture/transaction-layer.md, line 113 at r2 (raw file):
timestamps less than or equal to
Same general terminology question as above, although the meaning here is very clear to me
v21.2/architecture/transaction-layer.md, line 115 at r2 (raw file):
closed timestamp
Ditto re: hyphenation
v21.2/architecture/transaction-layer.md, line 412 at r2 (raw file):
closed timestamp
Maybe ital this?
Fixes #6793 Addresses #9040 Also a prerequisite for #11016 (we can't really get into the details of bounded staleness without more of this background info) Summary of changes: - Update the 'Reading' section of the Transaction Layer page to list consistent vs. stale reads - Add a new 'Closed timestamp' section to the Transaction Layer page - Link to the above from the Follower Reads page - Update the transaction retry errors page to link to the new closed timestamp section of the Transaction Layer page - Opportunistic update to the 'Timestamp Cache' section of the Transaction Layer page based on the contents of the blog post 'An Epic Read on Follower Reads'[1] (on which this PR is based) - Update the 'Read refreshing' section of the Transaction Layer page to add links to closed timestamps and to link to more docs on the transaction retry error type mentioned there. [1]: https://www.cockroachlabs.com/blog/follower-reads-stale-data/ (Added to v21.1 and v21.2 docs)
9795596
to
07874e6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Ryan!!!
PS "Piggybacking" finally made it in! Yay!!!
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @nvanbenschoten, and @taroface)
v21.2/follower-reads.md, line 25 at r2 (raw file):
Previously, taroface (Ryan Kuo) wrote…
"closed timestamp"
Should this be quotes or italics? You use the latter below
Changed to use italics for first use on the Follower Reads page
v21.2/architecture/replication-layer.md, line 76 at r2 (raw file):
Previously, taroface (Ryan Kuo) wrote…
a.k.a.
Nit: aka
Updated
v21.2/architecture/replication-layer.md, line 76 at r2 (raw file):
Previously, taroface (Ryan Kuo) wrote…
incurring networking round trips
Should this be something like "incurring latency from networking round trips"?
Updated to say that!
v21.2/architecture/transaction-layer.md, line 47 at r2 (raw file):
Previously, taroface (Ryan Kuo) wrote…
a.k.a.
Nit: aka
Updated!
v21.2/architecture/transaction-layer.md, line 47 at r2 (raw file):
Previously, taroface (Ryan Kuo) wrote…
up to date
Nit: up-to-date
Updated
v21.2/architecture/transaction-layer.md, line 101 at r2 (raw file):
Previously, taroface (Ryan Kuo) wrote…
a.k.a.
Nit: aka
Updated
v21.2/architecture/transaction-layer.md, line 107 at r2 (raw file):
Previously, taroface (Ryan Kuo) wrote…
"closed timestamp"
Ditto: quotes or italics?
Updated to use italics upon first use - matches FR page now
v21.2/architecture/transaction-layer.md, line 107 at r2 (raw file):
Previously, taroface (Ryan Kuo) wrote…
retry error
transaction retry error?
yes, added
v21.2/architecture/transaction-layer.md, line 107 at r2 (raw file):
Previously, taroface (Ryan Kuo) wrote…
below that timestamp
Is "below" a common timestamp term? (Disregard if so.)
If not, would "before" (or "earlier than", "prior to") that timestamp be clearer?
Yeah it's in common usage - elsewhere we also talk about a "high water mark" or similar that is always rising
v21.2/architecture/transaction-layer.md, line 107 at r2 (raw file):
Previously, taroface (Ryan Kuo) wrote…
a timestamp less than its closed timestamp
Ditto - "less" also seems to have a different meaning than "below"; would "a timestamp with a lower value than" be too much?
I think "less than" and "below" are meant to be synonyms, but just using different words to avoid overusing "below"
v21.2/architecture/transaction-layer.md, line 109 at r2 (raw file):
Previously, taroface (Ryan Kuo) wrote…
t will not accept writes below that timestamp.
Ditto
Same as above
v21.2/architecture/transaction-layer.md, line 111 at r2 (raw file):
Previously, taroface (Ryan Kuo) wrote…
closed timestamps
Necessary to hyphenate? Not sure
I don't think so (but of course I wouldn't! :-} )
v21.2/architecture/transaction-layer.md, line 111 at r2 (raw file):
Previously, taroface (Ryan Kuo) wrote…
piggy-backing
Happy to report that "piggybacking" is one (1) word!
WHAT
ok updated
it has finally evolved into normal usage!@!!!!!1!
v21.2/architecture/transaction-layer.md, line 113 at r2 (raw file):
Previously, taroface (Ryan Kuo) wrote…
timestamps less than or equal to
Same general terminology question as above, although the meaning here is very clear to me
I think same as above this is just to provide another way of thinking about it, but essentially a synonym for "below"
v21.2/architecture/transaction-layer.md, line 115 at r2 (raw file):
Previously, taroface (Ryan Kuo) wrote…
closed timestamp
Ditto re: hyphenation
I don't think so?
v21.2/architecture/transaction-layer.md, line 412 at r2 (raw file):
Previously, taroface (Ryan Kuo) wrote…
closed timestamp
Maybe ital this?
I think I'll leave the italics for the first usage in the 'Closed Timestamps' section. Tbh I don't really know which is right, but that feels ok
Fixes #6793
Addresses #9040
Also a prerequisite for #11016 (we can't really get into the details of
bounded staleness without more of this background info)
Summary of changes:
Update the 'Reading' section of the Transaction Layer page to list
consistent vs. stale reads
Add a new 'Closed timestamp' section to the Transaction Layer page
Link to the above from the Follower Reads page
Update the transaction retry errors page to link to the new closed
timestamp section of the Transaction Layer page
Opportunistic update to the 'Timestamp Cache' section of the
Transaction Layer page based on the contents of the blog post 'An Epic
Read on Follower Reads'1 (on which this PR is based)
Update the 'Read refreshing' section of the Transaction Layer page to
add links to closed timestamps and to link to more docs on the
transaction retry error type mentioned there.