Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: assign txn deadline from extended, not initial, lease #24041

Closed
tbg opened this issue Mar 19, 2018 · 3 comments
Closed

sql: assign txn deadline from extended, not initial, lease #24041

tbg opened this issue Mar 19, 2018 · 3 comments
Assignees
Labels
A-schema-changes C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

@tbg
Copy link
Member

tbg commented Mar 19, 2018

The drop roachtest currently sporadically fails on this line with a

transaction status error: deadline exceeded.

As I understand, this is due to the following:

  1. the transaction runs for a long time, and its deadline is set when it starts
  2. the transaction can be pushed forward in time significantly (but should still be able to commit due to the deadline)
  3. the table lease is either not extended (because there are no other new requests coming in for the table) or it is extended but the in-progress transaction does not get to use the new descriptor (even though it theoretically should be able to).

Either way, what should happen is that the sql subsystem realizes that there is a long-running transaction holding on to that lease, and extends the lease appropriately, while allowing the transaction to use the extended lease's deadline when it commits.

This is bound to be a relatively common issue with long-running transactions.

See also #23982 (comment)

cc @benesch for fact checking

@tbg tbg added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Mar 19, 2018
@tbg tbg added this to the 2.1 milestone Mar 19, 2018
m-schneider pushed a commit to m-schneider/cockroach that referenced this issue Mar 20, 2018
Before the drop test would fail trying to delete a large number of rows
in one transaction due to cockroachdb#24041.

Now we split the deletes into multiple statements. We're also checking
disk space to make sure it is cleared after all of the tpcc tables are
dropped.

Release note: None
@benesch
Copy link
Contributor

benesch commented Mar 20, 2018

Yeah, this is my understanding of the issue as well.

m-schneider pushed a commit to m-schneider/cockroach that referenced this issue Mar 22, 2018
Before the drop test would fail trying to delete a large number of rows
in one transaction due to cockroachdb#24041.

Now we split the deletes into multiple statements. We're also checking
disk space to make sure it is cleared after all of the tpcc tables are
dropped.

Release note: None
m-schneider pushed a commit to m-schneider/cockroach that referenced this issue Mar 22, 2018
Before the drop test would fail trying to delete a large number of rows
in one transaction due to cockroachdb#24041.

Now we split the deletes into multiple statements. We're also checking
disk space to make sure it is cleared after all of the tpcc tables are
dropped.

Release note: None
m-schneider pushed a commit to m-schneider/cockroach that referenced this issue Mar 26, 2018
Before the drop test would fail trying to delete a large number of rows
in one transaction due to cockroachdb#24041.

Now we split the deletes into multiple statements. We're also checking
disk space to make sure it is cleared after all of the tpcc tables are
dropped.

Release note: None
@vivekmenezes vivekmenezes added A-schema-changes and removed A-schema-descriptors Relating to SQL table/db descriptor handling. A-bulkio-schema-changes labels Jul 24, 2018
@vivekmenezes vivekmenezes removed this from the 2.1 milestone Jul 24, 2018
@andreimatei
Copy link
Contributor

Also see #18684 for a larger tracking bug for deadline issues.

vivekmenezes added a commit to vivekmenezes/cockroach that referenced this issue Aug 13, 2018
The change:
1. Sets the expiration of an epoch lease to a very large expiration
of 20 years. This should really be hlc.MaxTimestamp but is set to a
very high number because expiration is part of the primary key in the
lease table, and there are various conditions under which two leases
can be acquired for the same table simulatenously resulting in
one of them failing because of a uniqueness constraint violation.
2. The lease expiration of an epoch based lease is the expiration of the
epoch = the last time the epoch was seen + LeaseManager.leaseDuration.

related to cockroachdb#23978
related to cockroachdb#18684
related to cockroachdb#24041

Release note: None
@andreimatei
Copy link
Contributor

I'm going to close this in favor of #18684 which is the tracking bug for txn deadlines and improvements thereof.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-schema-changes C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Projects
None yet
Development

No branches or pull requests

4 participants