Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add closed timestamps to architecture docs #11762

Merged
merged 1 commit into from
Sep 30, 2021

Conversation

rmloveland
Copy link
Contributor

Fixes #6793

Addresses #9040

Also a prerequisite for #11016 (we can't really get into the details of
bounded staleness without more of this background info)

Summary of changes:

  • Update the 'Reading' section of the Transaction Layer page to list
    consistent vs. stale reads

  • Add a new 'Closed timestamp' section to the Transaction Layer page

  • Link to the above from the Follower Reads page

  • Update the transaction retry errors page to link to the new closed
    timestamp section of the Transaction Layer page

  • Opportunistic update to the 'Timestamp Cache' section of the
    Transaction Layer page based on the contents of the blog post 'An Epic
    Read on Follower Reads'1 (on which this PR is based)

  • Update the 'Read refreshing' section of the Transaction Layer page to
    add links to closed timestamps and to link to more docs on the
    transaction retry error type mentioned there.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@netlify
Copy link

netlify bot commented Sep 23, 2021

✔️ Netlify Preview

🔨 Explore the source changes: 07874e6

🔍 Inspect the deploy log: https://app.netlify.com/sites/cockroachdb-docs/deploys/6155e87816d94e00081b7ce1

😎 Browse the preview: https://deploy-preview-11762--cockroachdb-docs.netlify.app

@rmloveland
Copy link
Contributor Author

Hey Andrei and Nathan,

Added you both to review as you are the authors of the epic blog post from which much of this info was taken. Feel free to unassign or punt me to other reviewers if needed !

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:LGTM: thanks!

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rmloveland)


v21.2/architecture/transaction-layer.md, line 47 at r1 (raw file):

CockroachDB provides the following types of reads:

- Strongly-consistent (a.k.a. "non-stale") reads: These are the default and most common type of read. For example, read-write transactions will always perform strongly-consistent reads. These reads go through the [leaseholder](replication-layer.html#leases) and always return data that is correct and up to date.

If you're not gonna go into it more, I would strike the part about "read-write" transactions, because you cast doubt about what other transactions do. The thing I'd say in this paragraph is that these reads see all writes performed by writers that committed before the reading transaction started.


v21.2/architecture/transaction-layer.md, line 48 at r1 (raw file):

They can only be used in read-only transactions.

I'd make it concrete and say something about as of system time here.


v21.2/architecture/transaction-layer.md, line 101 at r1 (raw file):

As part of providing serializability, whenever an operation reads a value, we store the operation's timestamp in a timestamp cache, which shows the high-water mark for values being read.

The timestamp cache is a data structure used to store information about the reads performed by [leaseholders](replication-layer.html#leases). This is used to ensure that once some transaction *t1* reads a row, another transaction *t2* that comes along and tries to write to that row will be ordered after *t1*, thus ensuring a serial order of transactions, a.k.a. serializability.

nicely put


v21.2/architecture/transaction-layer.md, line 105 at r1 (raw file):

Whenever a write occurs, its timestamp is checked against the timestamp cache. If the timestamp is less than the timestamp cache's latest value, we attempt to push the timestamp for its transaction forward to a later time. Pushing the timestamp might cause the transaction to restart in the second phase of the transaction (see [read refreshing](#read-refreshing)).

### Closed timestamps

consider linking to the blog post somewhere in this section


v21.2/architecture/transaction-layer.md, line 107 at r1 (raw file):

advances forward by some target interval behind the current time

perhaps "advances continuously, and lag the current time by a target interval".


v21.2/architecture/transaction-layer.md, line 107 at r1 (raw file):

it rejects the write

"rejects" sounds harsh if you don't qualify it. Consider saying that a transaction trying to write below the closed ts is forced to change its timestamp (which might result in a retry error), and link to "refreshing" if we talk about that anywhere.


v21.2/architecture/transaction-layer.md, line 115 at r1 (raw file):

Once the follower replica has applied the abovementioned Raft commands, it has all the data necessary to serve reads with timestamps less than or equal to the closed timestamp.

Note that closed timestamps are durable even if the leaseholder changes; in other words, they are preserved across [lease transfers](replication-layer.html#epoch-based-leases-table-data).

Unfortunately, they currently are not "durable" in the usual meaning of the word, which is "persisted to disk; recoverable after node restart". Hopefully they will be, one day.
I'd say that closed timestamps are valid even as leaseholders change; a new leaseholder will not break a promise made by an old leaseholder.

Copy link
Contributor Author

@rmloveland rmloveland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review Andrei! Updated based on your comments, thanks for the good clarifications & suggestions

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @nvanbenschoten)


v21.2/architecture/transaction-layer.md, line 47 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

If you're not gonna go into it more, I would strike the part about "read-write" transactions, because you cast doubt about what other transactions do. The thing I'd say in this paragraph is that these reads see all writes performed by writers that committed before the reading transaction started.

Updated to remove the "read-write" comment and use that language


v21.2/architecture/transaction-layer.md, line 48 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…
They can only be used in read-only transactions.

I'd make it concrete and say something about as of system time here.

Updated to "read-only transactions that use AOST (link)"


v21.2/architecture/transaction-layer.md, line 101 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nicely put

Thanks! Though I think it's mostly adapted from your epic blog post :-)


v21.2/architecture/transaction-layer.md, line 105 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

consider linking to the blog post somewhere in this section

Added "For more information about the implementation of closed timestamps and Follower Reads, see our blog post An Epic Read on Follower Reads" to the end of the section


v21.2/architecture/transaction-layer.md, line 107 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…
advances forward by some target interval behind the current time

perhaps "advances continuously, and lag the current time by a target interval".

Updated to say that


v21.2/architecture/transaction-layer.md, line 107 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…
it rejects the write

"rejects" sounds harsh if you don't qualify it. Consider saying that a transaction trying to write below the closed ts is forced to change its timestamp (which might result in a retry error), and link to "refreshing" if we talk about that anywhere.

Updated to "If the range receives a write at a timestamp less than its closed timestamp, the write is forced to change its timestamp, which might result in a retry error (see read refreshing)."


v21.2/architecture/transaction-layer.md, line 115 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Unfortunately, they currently are not "durable" in the usual meaning of the word, which is "persisted to disk; recoverable after node restart". Hopefully they will be, one day.
I'd say that closed timestamps are valid even as leaseholders change; a new leaseholder will not break a promise made by an old leaseholder.

Revised based on this comment to say

"Note that closed timestamps are valid even if the leaseholder changes, since they are preserved across lease transfers. Once a lease transfer occurs, the new leaseholder will not break the closed timestamp promise made by the old leaseholder."

Copy link
Contributor

@taroface taroface left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly had minor terminology/clarification questions around "timestamp", but otherwise lgtm!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @nvanbenschoten, and @rmloveland)


v21.2/follower-reads.md, line 25 at r2 (raw file):

"closed timestamp"

Should this be quotes or italics? You use the latter below


v21.2/architecture/replication-layer.md, line 76 at r2 (raw file):

a.k.a.

Nit: aka


v21.2/architecture/replication-layer.md, line 76 at r2 (raw file):

incurring networking round trips

Should this be something like "incurring latency from networking round trips"?


v21.2/architecture/transaction-layer.md, line 47 at r2 (raw file):

a.k.a.

Nit: aka


v21.2/architecture/transaction-layer.md, line 47 at r2 (raw file):

up to date

Nit: up-to-date


v21.2/architecture/transaction-layer.md, line 101 at r2 (raw file):

a.k.a.

Nit: aka


v21.2/architecture/transaction-layer.md, line 107 at r2 (raw file):

"closed timestamp"

Ditto: quotes or italics?


v21.2/architecture/transaction-layer.md, line 107 at r2 (raw file):

retry error

transaction retry error?


v21.2/architecture/transaction-layer.md, line 107 at r2 (raw file):

below that timestamp

Is "below" a common timestamp term? (Disregard if so.)

If not, would "before" (or "earlier than", "prior to") that timestamp be clearer?


v21.2/architecture/transaction-layer.md, line 107 at r2 (raw file):

a timestamp less than its closed timestamp

Ditto - "less" also seems to have a different meaning than "below"; would "a timestamp with a lower value than" be too much?


v21.2/architecture/transaction-layer.md, line 109 at r2 (raw file):

t will not accept writes below that timestamp.

Ditto


v21.2/architecture/transaction-layer.md, line 111 at r2 (raw file):

closed timestamps

Necessary to hyphenate? Not sure


v21.2/architecture/transaction-layer.md, line 111 at r2 (raw file):

piggy-backing

Happy to report that "piggybacking" is one (1) word!


v21.2/architecture/transaction-layer.md, line 113 at r2 (raw file):

timestamps less than or equal to

Same general terminology question as above, although the meaning here is very clear to me


v21.2/architecture/transaction-layer.md, line 115 at r2 (raw file):

closed timestamp

Ditto re: hyphenation


v21.2/architecture/transaction-layer.md, line 412 at r2 (raw file):

closed timestamp

Maybe ital this?

Fixes #6793

Addresses #9040

Also a prerequisite for #11016 (we can't really get into the details of
bounded staleness without more of this background info)

Summary of changes:

- Update the 'Reading' section of the Transaction Layer page to list
  consistent vs. stale reads

- Add a new 'Closed timestamp' section to the Transaction Layer page

- Link to the above from the Follower Reads page

- Update the transaction retry errors page to link to the new closed
  timestamp section of the Transaction Layer page

- Opportunistic update to the 'Timestamp Cache' section of the
  Transaction Layer page based on the contents of the blog post 'An Epic
  Read on Follower Reads'[1] (on which this PR is based)

- Update the 'Read refreshing' section of the Transaction Layer page to
  add links to closed timestamps and to link to more docs on the
  transaction retry error type mentioned there.

[1]: https://www.cockroachlabs.com/blog/follower-reads-stale-data/

(Added to v21.1 and v21.2 docs)
Copy link
Contributor Author

@rmloveland rmloveland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Ryan!!!

PS "Piggybacking" finally made it in! Yay!!!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @nvanbenschoten, and @taroface)


v21.2/follower-reads.md, line 25 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
"closed timestamp"

Should this be quotes or italics? You use the latter below

Changed to use italics for first use on the Follower Reads page


v21.2/architecture/replication-layer.md, line 76 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
a.k.a.

Nit: aka

Updated


v21.2/architecture/replication-layer.md, line 76 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
incurring networking round trips

Should this be something like "incurring latency from networking round trips"?

Updated to say that!


v21.2/architecture/transaction-layer.md, line 47 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
a.k.a.

Nit: aka

Updated!


v21.2/architecture/transaction-layer.md, line 47 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
up to date

Nit: up-to-date

Updated


v21.2/architecture/transaction-layer.md, line 101 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
a.k.a.

Nit: aka

Updated


v21.2/architecture/transaction-layer.md, line 107 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
"closed timestamp"

Ditto: quotes or italics?

Updated to use italics upon first use - matches FR page now


v21.2/architecture/transaction-layer.md, line 107 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
retry error

transaction retry error?

yes, added


v21.2/architecture/transaction-layer.md, line 107 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
below that timestamp

Is "below" a common timestamp term? (Disregard if so.)

If not, would "before" (or "earlier than", "prior to") that timestamp be clearer?

Yeah it's in common usage - elsewhere we also talk about a "high water mark" or similar that is always rising


v21.2/architecture/transaction-layer.md, line 107 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
a timestamp less than its closed timestamp

Ditto - "less" also seems to have a different meaning than "below"; would "a timestamp with a lower value than" be too much?

I think "less than" and "below" are meant to be synonyms, but just using different words to avoid overusing "below"


v21.2/architecture/transaction-layer.md, line 109 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
t will not accept writes below that timestamp.

Ditto

Same as above


v21.2/architecture/transaction-layer.md, line 111 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
closed timestamps

Necessary to hyphenate? Not sure

I don't think so (but of course I wouldn't! :-} )


v21.2/architecture/transaction-layer.md, line 111 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
piggy-backing

Happy to report that "piggybacking" is one (1) word!

WHAT

ok updated

it has finally evolved into normal usage!@!!!!!1!


v21.2/architecture/transaction-layer.md, line 113 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
timestamps less than or equal to

Same general terminology question as above, although the meaning here is very clear to me

I think same as above this is just to provide another way of thinking about it, but essentially a synonym for "below"


v21.2/architecture/transaction-layer.md, line 115 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
closed timestamp

Ditto re: hyphenation

I don't think so?


v21.2/architecture/transaction-layer.md, line 412 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
closed timestamp

Maybe ital this?

I think I'll leave the italics for the first usage in the 'Closed Timestamps' section. Tbh I don't really know which is right, but that feels ok

@rmloveland rmloveland merged commit f4262b3 into master Sep 30, 2021
@rmloveland rmloveland deleted the 20210922-closed-timestamps branch September 30, 2021 16:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add closed timestamps to the architecture docs
4 participants