pingcap · ti-chi-bot · May 19, 2021 · May 18, 2021 · May 18, 2021 · May 19, 2021
diff --git a/docs/design/2021-04-26-lock-view.md b/docs/design/2021-04-26-lock-view.md
@@ -1,7 +1,7 @@
 # TiDB Design Documents
 
 - Author(s): [longfangsong](https://github.com/longfangsong), [MyonKeminta](http://github.com/MyonKeminta)
-- Last updated: May 6, 2021
+- Last updated: May 18, 2021
 - Discussion PR: N/A
 - Tracking Issue: https://github.com/pingcap/tidb/issues/24199
 
@@ -35,11 +35,11 @@ Several tables will be provided in `information_schema`. Some tables has both lo
 
 | Field | Type | Comment |
 |------------|------------|---------|
-|`TRX_ID` | `unsigned bigint` | The transaction ID (aka. start ts) |
-|`TRX_STARTED`|`time`| Human readable start time of the transaction |
-|`DIGEST`|`text`| The digest of the current executing SQL statement |
-|`SQLS` | `text` | A list of all executed SQL statements' digests |
-|`STATE`| `enum('Running', 'Lock waiting', 'Committing', 'RollingBack')`| The state of the transaction |
+| `TRX_ID` | `unsigned bigint` | The transaction ID (aka. start ts) |
+| `TRX_STARTED`|`time`| Human readable start time of the transaction |
+| `DIGEST`|`text`| The digest of the currently executing SQL statement |
+| `ALL_SQLS` | `text` | A list of all executed SQL statements' digests |
+| `STATE`| `enum('Running', 'Lock waiting', 'Committing', 'RollingBack')`| The state of the transaction |
 | `WAITING_START_TIME` | `time` | The elapsed time since the start of the current lock waiting (if any) |
 | `SCOPE` | `enum('Global', 'Local')` | The scope of the transaction |
 | `ISOLATION_LEVEL` | `enum('RR', 'RC')` | |
@@ -79,7 +79,7 @@ Several tables will be provided in `information_schema`. Some tables has both lo
 * Permission:
  * `PROCESS` privilege is needed to access this table.
 
-### Table `(CLUSTER_)DEAD_LOCK`
+### Table `(CLUSTER_)DEADLOCKS`
 
 | Field | Type | Comment |
 |------------|------------|---------|
@@ -88,15 +88,19 @@ Several tables will be provided in `information_schema`. Some tables has both lo
 | `TRY_LOCK_TRX_ID` | `unsigned bigint` | The transaction ID (start ts) of the transaction that's trying to acquire the lock |
 | `CURRENT_SQL_DIGEST` | `text` | The SQL that's being blocked |
 | `KEY` | `varchar` | The key that's being locked, but locked by another transaction in the deadlock event |
-| `SQLS` | `text` | A list of the digest of SQL statements that the transaction has executed |
+| `ALL_SQLS` | `text` | A list of the digest of SQL statements that the transaction has executed |
 | `TRX_HOLDING_LOCK` | `unsigned bigint` | The transaction that's currently holding the lock. There will be another record in the table with the same `DEADLOCK_ID` for that transaction. |
+| `RETRYABLE` | `bool` | Whether the deadlock is retryable. TiDB tries to determine whether the current statement is (indirectly) waiting for a lock held by the current statement itself. |
 
 * Life span of rows:
  * Create after TiDB receive a deadlock error
  * FIFO，clean the oldest after buffer is full
 * Collecting, storing and querying:
- * All of these information can be collected on TiDB side. It just need to add the information to the table when receives deadlock error from TiKV. The information of other transactions involved in the deadlock circle needed to be fetched from elsewhere (the `TIDB_TRX` table) when handling the deadlock error.
- * Currently there are no much information in the deadlock error (it doesn't has the SQLs and keys' information), which needs to be improved.
+ * All of this information can be collected on the TiDB side. TiDB just needs to add the information to the table when it receives a deadlock error from TiKV. The information about the other transactions involved in the deadlock cycle needs to be fetched from elsewhere (the `CLUSTER_TIDB_TRX` table) when handling the deadlock error.
+ * TiKV needs to report richer information in the deadlock error so that it can be collected.
+ * Internally there are two types of deadlock errors: retryable and non-retryable. The transaction retries internally on a retryable deadlock and doesn't report an error to the client. Therefore, users are typically more interested in non-retryable deadlocks.
+ * Retryable deadlock errors are not collected by default; collecting them can be enabled through configuration.
+ * Querying `CLUSTER_TIDB_TRX` to get richer information for retryable deadlocks may hurt performance. Whether it will be queried for retryable deadlocks will be decided after some tests.
 * Permission:
  * `PROCESS` privilege is needed to access this table.
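
The life span and collection rules above (FIFO eviction once the buffer is full, retryable deadlocks skipped unless enabled) can be illustrated with a minimal Go sketch. This is not the implementation in TiDB: `DeadlockRecord`, `DeadlockHistory`, `Push` and `Snapshot` are hypothetical names, the record carries only a few of the table's fields, and treating a capacity of 0 as "keep nothing" is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// DeadlockRecord is a hypothetical, trimmed-down row of the DEADLOCKS table.
type DeadlockRecord struct {
	DeadlockID     uint64
	TryLockTrxID   uint64
	TrxHoldingLock uint64
	Retryable      bool
}

// DeadlockHistory keeps the most recent deadlock records of one node in
// FIFO order. All names here are illustrative assumptions.
type DeadlockHistory struct {
	mu               sync.Mutex
	capacity         int
	collectRetryable bool
	records          []DeadlockRecord
}

func NewDeadlockHistory(capacity int, collectRetryable bool) *DeadlockHistory {
	return &DeadlockHistory{capacity: capacity, collectRetryable: collectRetryable}
}

// Push stores a record and evicts the oldest records once the buffer
// exceeds its capacity. It reports whether the record was stored.
func (h *DeadlockHistory) Push(r DeadlockRecord) bool {
	h.mu.Lock()
	defer h.mu.Unlock()
	// Assumption: capacity 0 disables the history entirely.
	if h.capacity <= 0 {
		return false
	}
	// Retryable deadlocks are skipped unless collection is enabled.
	if r.Retryable && !h.collectRetryable {
		return false
	}
	h.records = append(h.records, r)
	if over := len(h.records) - h.capacity; over > 0 {
		// Copy the tail so the evicted records can be garbage collected.
		h.records = append([]DeadlockRecord(nil), h.records[over:]...)
	}
	return true
}

// Snapshot returns a copy of the stored records, oldest first.
func (h *DeadlockHistory) Snapshot() []DeadlockRecord {
	h.mu.Lock()
	defer h.mu.Unlock()
	return append([]DeadlockRecord(nil), h.records...)
}

func main() {
	h := NewDeadlockHistory(2, false)
	h.Push(DeadlockRecord{DeadlockID: 1})
	h.Push(DeadlockRecord{DeadlockID: 2, Retryable: true}) // skipped
	h.Push(DeadlockRecord{DeadlockID: 3})
	h.Push(DeadlockRecord{DeadlockID: 4}) // evicts record 1
	for _, r := range h.Snapshot() {
		fmt.Println("deadlock", r.DeadlockID)
	}
}
```

A slice with copy-on-evict keeps the sketch short; with a capacity of up to 10000, a real implementation might prefer a ring buffer to avoid copying on every insert.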
 
@@ -151,9 +155,25 @@ The locking key and `resource_group_tag` that comes from the `Context` of the pe
 
 The wait chain will be added to the `Deadlock` error which is returned by the `PessimisticLock` request, so that when deadlock happens, the full wait chain information can be passed to TiDB.
 
+### Configurations
+
+#### TiDB Config File `pessimistic-txn.tidb_deadlock_history_capacity`
+
+Specifies how many recent deadlock events each TiDB node should keep.
+Dynamically changeable via HTTP API.
+Value: 0 to 10000
+Default: 10
+
+#### TiDB Config File `pessimistic-txn.tidb_deadlock_history_collect_retryable`
+
+Specifies whether to collect retryable deadlock errors to the `(CLUSTER_)DEADLOCKS` table.
+Dynamically changeable via HTTP API.
+Value: 0 (do not collect) or 1 (collect)
+Default: 0
+
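As an illustration of the two settings above, the following Go sketch models them as a struct with the documented defaults and range check. The type, field, and function names are hypothetical and are not taken from TiDB's configuration code. Only the defaults (10 and 0) and the range 0 to 10000 come from this document, and the 0/1 switch is modelled as a `bool` for brevity.

```go
package main

import "fmt"

// maxDeadlockHistoryCapacity is the upper bound stated in this document.
const maxDeadlockHistoryCapacity = 10000

// PessimisticTxnConfig is a hypothetical model of the two settings.
type PessimisticTxnConfig struct {
	// DeadlockHistoryCapacity is the number of recent deadlock events
	// each TiDB node keeps.
	DeadlockHistoryCapacity uint
	// DeadlockHistoryCollectRetryable controls whether retryable
	// deadlock errors are collected.
	DeadlockHistoryCollectRetryable bool
}

// DefaultPessimisticTxnConfig returns the defaults listed above.
func DefaultPessimisticTxnConfig() PessimisticTxnConfig {
	return PessimisticTxnConfig{
		DeadlockHistoryCapacity:         10,
		DeadlockHistoryCollectRetryable: false,
	}
}

// Validate rejects capacities outside the documented range.
func (c PessimisticTxnConfig) Validate() error {
	if c.DeadlockHistoryCapacity > maxDeadlockHistoryCapacity {
		return fmt.Errorf("deadlock history capacity %d exceeds maximum %d",
			c.DeadlockHistoryCapacity, maxDeadlockHistoryCapacity)
	}
	return nil
}

func main() {
	cfg := DefaultPessimisticTxnConfig()
	fmt.Println(cfg.DeadlockHistoryCapacity, cfg.DeadlockHistoryCollectRetryable, cfg.Validate())

	cfg.DeadlockHistoryCapacity = 20000
	fmt.Println(cfg.Validate() != nil)
}
```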
 ## Compatibility
 
-This feature is not expected to be incompatible with other features. During upgrading, when there are different versions of TiDB nodes exists at the same time, it's possible that the `CLUSTER_` prefixed tables may encounter errors. But since this feature is typically used by user manually, this shouldn't be a severe problem. So we don't need to care much about that.
+This feature is not expected to be incompatible with other features. During an upgrade, when TiDB nodes of different versions exist at the same time, queries on the `CLUSTER_`-prefixed tables may encounter errors. However, since this feature is typically used manually by users, this shouldn't be a severe problem and doesn't need special handling.
 
 ## Test Design
 
@@ -190,7 +210,7 @@ This feature is not expected to be incompatible with other features. During upgr
 
 * Since lock waiting on TiKV may timeout and retry, it's possible that in a single query to `DATA_LOCK_WAIT` table doesn't shows all (logical) lock waiting.
 * Information about internal transactions may not be collected in our first version of implementation.
-* Since TiDB need to query transaction information after it receives the deadlock error, the transactions' status may be changed during that time. As a result the information in `(CLUSTER_)DEAD_LOCK` table can't be promised to be accurate and complete.
+* Since TiDB needs to query transaction information after it receives the deadlock error, the transactions' status may have changed in the meantime. As a result, the information in the `(CLUSTER_)DEADLOCKS` table is not guaranteed to be accurate or complete.
 * Statistics about transaction conflicts is still not enough.
 * Historical information of `TIDB_TRX` and `DATA_LOCK_WAITS` is not kept, which possibly makes it still difficult to investigate some kind of problems.
 * The SQL digest that's holding lock and blocking the current transaction is hard to retrieve and is not included in the current design.