statistics: move history-related functions into the stats handle #55163
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@             Coverage Diff              @@
##            master     #55163        +/- ##
================================================
+ Coverage   72.8365%   74.8445%   +2.0079%
================================================
  Files          1564       1569         +5
  Lines        439076     447605      +8529
================================================
+ Hits         319808     335008     +15200
+ Misses        99580      92011      -7569
- Partials      19688      20586       +898
================================================

Flags with carried forward coverage won't be shown. Click here to find out more.
Failpoint test:
Tested locally:
#!/usr/bin/env -S cargo +nightly -Zscript
---cargo
[dependencies]
clap = { version = "4.2", features = ["derive"] }
sqlx = { version = "0.7", features = ["runtime-tokio-rustls", "mysql"] }
tokio = { version = "1", features = ["full"] }
fake = { version = "2.5", features = ["derive"] }
---
use clap::Parser;
use fake::{Fake, Faker};
use sqlx::mysql::MySqlPoolOptions;
#[derive(Parser, Debug)]
#[clap(version)]
struct Args {
#[clap(short, long, help = "MySQL connection string")]
database_url: String,
}
#[derive(Debug)]
struct TableRow {
partition_key: u32,
column1: String,
column2: i32,
column3: i32,
column4: String,
}
#[tokio::main]
async fn main() -> Result<(), sqlx::Error> {
let args = Args::parse();
let pool = MySqlPoolOptions::new()
.max_connections(5)
.connect(&args.database_url)
.await?;
// Create partitioned table if not exists
sqlx::query(
"CREATE TABLE IF NOT EXISTS t (
partition_key INT NOT NULL,
column1 VARCHAR(255) NOT NULL,
column2 INT NOT NULL,
column3 INT NOT NULL,
column4 VARCHAR(255) NOT NULL
) PARTITION BY RANGE (partition_key) (
PARTITION p0 VALUES LESS THAN (3000),
PARTITION p1 VALUES LESS THAN (6000),
PARTITION p2 VALUES LESS THAN (9000),
PARTITION p3 VALUES LESS THAN (12000),
PARTITION p4 VALUES LESS THAN (15000),
PARTITION p5 VALUES LESS THAN (18000),
PARTITION p6 VALUES LESS THAN (21000),
PARTITION p7 VALUES LESS THAN (24000),
PARTITION p8 VALUES LESS THAN (27000),
PARTITION p9 VALUES LESS THAN (30000),
PARTITION p10 VALUES LESS THAN (33000),
PARTITION p11 VALUES LESS THAN (36000),
PARTITION p12 VALUES LESS THAN (39000),
PARTITION p13 VALUES LESS THAN (42000),
PARTITION p14 VALUES LESS THAN (45000),
PARTITION p15 VALUES LESS THAN (48000),
PARTITION p16 VALUES LESS THAN (51000),
PARTITION p17 VALUES LESS THAN (54000),
PARTITION p18 VALUES LESS THAN (57000),
PARTITION p19 VALUES LESS THAN (60000),
PARTITION p20 VALUES LESS THAN (63000)
)"
)
.execute(&pool)
.await?;
    // Insert 3000 rows into each of partitions p1..p20; keys start at 3001,
    // so p0 (keys < 3000) is left empty and can be analyzed separately below
    for partition in 1..=20 {
        let partition_key = partition * 3000 + 1; // one distinct key per target partition
for _ in 0..3000 {
let row = TableRow {
partition_key, // Use the current partition key
column1: Faker.fake::<String>(),
column2: Faker.fake::<i32>(),
column3: Faker.fake::<i32>(),
column4: Faker.fake::<String>(),
};
sqlx::query(
"INSERT INTO t (partition_key, column1, column2, column3, column4)
VALUES (?, ?, ?, ?, ?)"
)
.bind(row.partition_key)
.bind(&row.column1)
.bind(row.column2)
.bind(row.column3)
.bind(&row.column4)
.execute(&pool)
.await?;
}
println!("Successfully inserted 3000 rows into partition {} of the 't' table.", partition);
}
Ok(())
}
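With a nightly toolchain that has cargo's -Zscript feature enabled, the script can be run directly. Assuming it is saved as populate.rs (a hypothetical file name) and a local TiDB instance listens on the default port, the invocation would look like:

chmod +x populate.rs && ./populate.rs --database-url "mysql://root@127.0.0.1:4000/test"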
mysql> select * from mysql.analyze_jobs;
+----+---------------------+--------------+------------+----------------+-------------------------------------------------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+
| id | update_time | table_schema | table_name | partition_name | job_info | processed_rows | start_time | end_time | state | fail_reason | instance | process_id |
+----+---------------------+--------------+------------+----------------+-------------------------------------------------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+
| 1 | 2024-08-05 14:32:28 | test | t | p10 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 2 | 2024-08-05 14:32:28 | test | t | p11 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 3 | 2024-08-05 14:32:28 | test | t | p13 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 4 | 2024-08-05 14:32:28 | test | t | p16 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 5 | 2024-08-05 14:32:28 | test | t | p5 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 6 | 2024-08-05 14:32:28 | test | t | p3 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 7 | 2024-08-05 14:32:28 | test | t | p15 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 8 | 2024-08-05 14:32:28 | test | t | p19 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 9 | 2024-08-05 14:32:28 | test | t | p20 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 10 | 2024-08-05 14:32:28 | test | t | p1 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 11 | 2024-08-05 14:32:28 | test | t | p8 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 12 | 2024-08-05 14:32:28 | test | t | p14 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 13 | 2024-08-05 14:32:28 | test | t | p7 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 14 | 2024-08-05 14:32:28 | test | t | p4 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 15 | 2024-08-05 14:32:28 | test | t | p6 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 16 | 2024-08-05 14:32:28 | test | t | p9 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 17 | 2024-08-05 14:32:28 | test | t | p12 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 18 | 2024-08-05 14:32:28 | test | t | p17 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 19 | 2024-08-05 14:32:28 | test | t | p18 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 20 | 2024-08-05 14:32:28 | test | t | p2 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 21 | 2024-08-05 14:32:28 | test | t | | merge global stats for test.t columns | 0 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
+----+---------------------+--------------+------------+----------------+-------------------------------------------------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+
21 rows in set (0.01 sec)
mysql> set global tidb_analyze_version = 1;
Query OK, 0 rows affected (0.02 sec)
mysql> select @@tidb_analyze_version;
+------------------------+
| @@tidb_analyze_version |
+------------------------+
| 1 |
+------------------------+
1 row in set (0.00 sec)
mysql> analyze table t partition p0;
Query OK, 0 rows affected (0.30 sec)
mysql> select * from mysql.analyze_jobs;
+----+---------------------+--------------+------------+----------------+-------------------------------------------------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+
| id | update_time | table_schema | table_name | partition_name | job_info | processed_rows | start_time | end_time | state | fail_reason | instance | process_id |
+----+---------------------+--------------+------------+----------------+-------------------------------------------------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+
| 1 | 2024-08-05 14:32:28 | test | t | p10 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 2 | 2024-08-05 14:32:28 | test | t | p11 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 3 | 2024-08-05 14:32:28 | test | t | p13 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 4 | 2024-08-05 14:32:28 | test | t | p16 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 5 | 2024-08-05 14:32:28 | test | t | p5 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 6 | 2024-08-05 14:32:28 | test | t | p3 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 7 | 2024-08-05 14:32:28 | test | t | p15 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 8 | 2024-08-05 14:32:28 | test | t | p19 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 9 | 2024-08-05 14:32:28 | test | t | p20 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 10 | 2024-08-05 14:32:28 | test | t | p1 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 11 | 2024-08-05 14:32:28 | test | t | p8 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 12 | 2024-08-05 14:32:28 | test | t | p14 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 13 | 2024-08-05 14:32:28 | test | t | p7 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 14 | 2024-08-05 14:32:28 | test | t | p4 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 15 | 2024-08-05 14:32:28 | test | t | p6 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 16 | 2024-08-05 14:32:28 | test | t | p9 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 17 | 2024-08-05 14:32:28 | test | t | p12 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 18 | 2024-08-05 14:32:28 | test | t | p17 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 19 | 2024-08-05 14:32:28 | test | t | p18 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 20 | 2024-08-05 14:32:28 | test | t | p2 | auto analyze table with 256 buckets, 100 topn, 1 samplerate | 3000 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 21 | 2024-08-05 14:32:28 | test | t | | merge global stats for test.t columns | 0 | 2024-08-05 14:32:28 | 2024-08-05 14:32:28 | finished | NULL | 127.0.0.1:4000 | NULL |
| 22 | 2024-08-05 14:37:04 | test | t | p0 | analyze columns | 0 | 2024-08-05 14:37:04 | 2024-08-05 14:37:04 | finished | NULL | 127.0.0.1:4000 | NULL |
| 23 | 2024-08-05 14:37:05 | test | t | | merge global stats for test.t columns | 0 | 2024-08-05 14:37:04 | 2024-08-05 14:37:05 | finished | NULL | 127.0.0.1:4000 | NULL |
+----+---------------------+--------------+------------+----------------+-------------------------------------------------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+
23 rows in set (0.01 sec)
/retest
🔢 Self-check (PR reviewed by myself and ready for feedback.)
LGTM
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: AilinKid, elsa0520

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
What problem does this PR solve?
Issue Number: ref #55043 (comment)
Problem Summary:
In the issue comment above, we decided to remove the dedicated analyze sessions. Once they are gone, we need a better way to write the analyze job history. In theory we could call
_, _, err := exec.ExecRestrictedSQL(ctx, []sqlexec.OptionFuncAlias{sqlexec.ExecOptionUseSessionPool}, sql, args...)
concurrently to do this, but it is counterintuitive: we have to use the same session to obtain the executor and then use that executor to execute SQL through the session pool. Since we already moved
InsertAnalyzeJob
into the stats handle, I decided to move the other job-history-related functions into it as well.

What changed and how does it work?

For single-threaded code, the stats handle is passed in directly; each worker loads it by itself. A minimal sketch of this shape follows.
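A minimal, hypothetical sketch of the resulting shape (the names historyWriter, getExec, and saveAnalyzeHistory are illustrative, not the PR's actual identifiers): the history writers become methods on the stats handle, which already owns the session pool, so callers never juggle a dedicated analyze session.

// Hypothetical sketch only — historyWriter, getExec, and saveAnalyzeHistory
// are illustrative names, not the PR's actual code.
package sketch

import (
	"context"

	"github.com/pingcap/tidb/pkg/util/sqlexec"
)

// historyWriter stands in for the part of the stats handle that owns
// the session pool.
type historyWriter struct {
	// getExec borrows a restricted SQL executor backed by the pool.
	getExec func() sqlexec.RestrictedSQLExecutor
}

// saveAnalyzeHistory wraps the call quoted in the problem summary behind a
// handle method, so single-threaded callers just receive the handle and
// each worker loads it by itself.
func (w *historyWriter) saveAnalyzeHistory(ctx context.Context, sql string, args ...any) error {
	exec := w.getExec()
	_, _, err := exec.ExecRestrictedSQL(ctx,
		[]sqlexec.OptionFuncAlias{sqlexec.ExecOptionUseSessionPool},
		sql, args...)
	return err
}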
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.