Squashed commit of the following:

commit 74809fa Author: Drumil Patel <42701709+weastel@users.noreply.github.com> Date: Tue Feb 2 11:57:45 2021 +0530 copr: Vectorize AddDatetimeAndString (tikv#9610) Signed-off-by: Drumil Patel <drumilpatel720@gmail.com>  ### What problem does this PR solve? Issue Number: Partially fixes tikv#9016  Problem Summary: Added Vectorize version for AddDatetimeAndString ### What is changed and how it works? What's Changed: Implmented Vectorize vesion for AddDatetimeAndString ### Check List  Tests  - Unit test ### Release note  - Partially fixes tikv#9016 commit 103f5ff Author: Xintao <hunterlxt@live.com> Date: Tue Feb 2 13:13:45 2021 +0800 Make TiKV not to create dict files when master_key cannot work (tikv#9540) Signed-off-by: Xintao <hunterlxt@live.com>  ### What problem does this PR solve? Issue Number: close tikv#9488 Problem Summary: At the first time TiKV access AWS KMS failed, it leaves corrupted dict file which causes TiKV to crash continuously. ### What is changed and how it works? check the `master_key` before creating the dicts. ### Related changes - PR to update `pingcap/docs`/`pingcap/docs-cn`: - PR to update `pingcap/tidb-ansible`: - Need to cherry-pick to the release branch ### Check List  Tests  - Unit test - Integration test - Manual test (add detailed scripts or steps below) - No code Side effects - Performance regression - Consumes more CPU - Consumes more MEM - Breaking backward compatibility ### Release note  - Fix the bug that TiKV could not start normally after failing to access AWS KMS. commit af4432f Author: Wallace <bupt2013211450@gmail.com> Date: Tue Feb 2 11:49:45 2021 +0800 importer: fix test_ingest_sst (tikv#9617) Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>  ### What problem does this PR solve? Issue Number: close tikv#9472 Problem Summary: TiKV may cleanup sst which has been upload just, so the ingest-job will success because it can not found file in path before it starts proposing, and the error is checked only when applied this entries. ### What is changed and how it works? We do not need to check and cleanup sst very quickly in this tests, so I remove the configuration for `cleanup_import_sst_interval`. ### Related changes - PR to update `pingcap/docs`/`pingcap/docs-cn`: - PR to update `pingcap/tidb-ansible`: - Need to cherry-pick to the release branch ### Check List  Tests  - Unit test - Integration test - Manual test (add detailed scripts or steps below) - No code Side effects - Performance regression - Consumes more CPU - Consumes more MEM - Breaking backward compatibility ### Release note  - No release note. commit 80582cb Author: wangggong <793160615@qq.com> Date: Mon Feb 1 13:15:45 2021 +0800 copr: Implement regexp & regexp_utf8 (tikv#9238) Signed-off-by: wangggong <793160615@qq.com>  ### What problem does this PR solve? Issue Number: tikv#9016 Problem Summary: ### What is changed and how it works? What's Changed: Add regexp & regexp_utf8 ### Related changes ### Check List  Tests  - Unit test Side effects ### Release note  No release note. commit 7b260d1 Author: Jay <BusyJay@users.noreply.github.com> Date: Fri Jan 29 21:13:44 2021 +0800 resolve: detect tombstone correctly (tikv#9593)  ### What problem does this PR solve? Issue Number: close tikv#9590 Problem Summary: PD client filter tombstone store and return error instead. Resolver should recognize the error and handle it correctly. ### Check List  Tests  - Unit test - Integration test ### Release note  - Fix repeated tombstone logs when sunset nodes commit a2c7aac Author: Liqi Geng <gengliqiii@gmail.com> Date: Fri Jan 29 04:03:44 2021 -0600 raftstore: enlarge leader-transfer-max-log-lag (tikv#9592) Signed-off-by: gengliqi <gengliqiii@gmail.com> ### What problem does this PR solve? Problem Summary: The meaning of `leader-transfer-max-log-lag` has already been changed by tikv#6539. For now, it means the difference value between the leader's last log index and the follower's last applied index. On the basis of implementation, we get the follower's last applied index through an RPC then we get the leader's last log index. There are many gaps between this process. 1. the applied index in peer fsm may be smaller than the real one in its corresponding apply fsm 2. the network latency 3. some logs may be proposed after sending the query RPC to this follower So the follower's last applied index may be much larger when we calculate the difference value. It thus appears that the default value of `leader-transfer-max-log-lag`(10) is too small which leads to many failures of leader transfer. I test the value of 128 and the failure is much less than before. * I test 256 and there is almost no failure. But maybe it's too large. Maybe we should find a more scientific approach to solve this problem later. ### What is changed and how it works? What's Changed: Change the default value of `leader-transfer-max-log-lag` to 128. ### Related changes - PR to update `pingcap/docs`/`pingcap/docs-cn`: - PR to update `pingcap/tidb-ansible`: - Need to cherry-pick to the release branch Tests  - No code Side effects - No ### Release note  * Change the default `leader-transfer-max-log-lag` to 128 to increase the success rate of leader transfer commit 3a4b8bb Author: Greg Weber <greg@pingcap.com> Date: Thu Jan 28 20:41:44 2021 -0600 fix: broken build on Mac (tikv#9598) On Mac: ``` --> components/file_system/src/rate_limiter.rs:5:5 5 | use crossbeam_utils::CachePadded; | ^^^^^^^^^^^^^^^ use of undeclared crate or module `crossbeam_utils` ``` ### Release note * None commit 20b0e95 Author: Xinye Tao <xy.tao@outlook.com> Date: Thu Jan 28 18:05:43 2021 +0800 engine: refactor `MetricsFlusher` and add fallback storage IO measuring (tikv#9458) Signed-off-by: tabokie <xy.tao@outlook.com> ### What is changed and how it works? What's Changed: Remove metrics flusher in engine_traits and use background worker instead, add IO metrics task which reports statistics from iosnooper, or IO rate limiter if BCC is unavailable. ### Check List  Tests  - Unit test ### Release note  No release note Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
Little-Wallace · Feb 2, 2021 · 7e9c0fb · 7e9c0fb
1 parent a975b26
commit 7e9c0fb
Show file tree

Hide file tree

Showing 36 changed files with 814 additions and 484 deletions.
diff --git a/Cargo.lock b/Cargo.lock
diff --git a/Cargo.toml b/Cargo.toml
@@ -236,7 +236,6 @@ members = [
   "components/test_pd",
   "components/tikv_alloc",
   "components/match_template",
-  "components/engine_traits/tests",
   "components/codec",
   "components/configuration",
   "components/panic_hook",

diff --git a/cmd/Cargo.toml b/cmd/Cargo.toml
@@ -15,6 +15,7 @@ portable = ["tikv/portable"]
 sse = ["tikv/sse"]
 mem-profiling = ["tikv/mem-profiling"]
 failpoints = ["tikv/failpoints"]
+bcc-iosnoop = ["tikv/bcc-iosnoop"]
 protobuf-codec = [
   "backup/protobuf-codec",
   "cdc/protobuf-codec",
@@ -86,7 +87,7 @@ error_code = { path = "../components/error_code", default-features = false }
 file_system = { path = "../components/file_system" }
 fs2 = "0.4"
 futures = "0.3"
-tokio = { version = "0.2", features = ["rt-threaded"] }
+tokio = { version = "0.2", features = ["rt-threaded", "time"] }
 grpcio = { version = "0.7", default-features = false, features = ["openssl-vendored"] }
 hex = "0.4"
 keys = { path = "../components/keys", default-features = false }

diff --git a/cmd/src/server.rs b/cmd/src/server.rs
@@ -16,6 +16,7 @@ use std::{
     net::SocketAddr,
     path::{Path, PathBuf},
     sync::{atomic::AtomicU64, Arc, Mutex},
+    time::{Duration, Instant},
 };
 
 use concurrency_manager::ConcurrencyManager;
@@ -25,11 +26,16 @@ use engine_rocks::{
     RocksEngine,
 };
 use engine_traits::{
-    compaction_job::CompactionJobInfo, EngineFileSystemInspector, Engines, MetricsFlusher,
-    RaftEngine, CF_DEFAULT, CF_WRITE,
+    compaction_job::CompactionJobInfo, EngineFileSystemInspector, Engines, RaftEngine, CF_DEFAULT,
+    CF_WRITE,
+};
+use file_system::{
+    set_io_rate_limiter, BytesFetcher, IORateLimiter, MetricsManager as IOMetricsManager,
 };
 use fs2::FileExt;
+use futures::compat::Stream01CompatExt;
 use futures::executor::block_on;
+use futures::stream::StreamExt;
 use grpcio::{EnvBuilder, Environment};
 use kvproto::{
     backup::create_backup, cdcpb::create_change_data, deadlock::create_deadlock,
@@ -74,18 +80,17 @@ use tikv::{
         mvcc::MvccConsistencyCheckObserver,
     },
 };
-use tikv_util::config::{ReadableSize, VersionTrack};
 use tikv_util::{
     check_environment_variables,
-    config::ensure_dir_exist,
+    config::{ensure_dir_exist, ReadableSize, VersionTrack},
     sys::sys_quota::SysQuota,
     time::Monitor,
-    worker::{Builder as WorkerBuilder, FutureWorker, Worker},
+    timer::GLOBAL_TIMER_HANDLE,
+    worker::{Builder as WorkerBuilder, FutureWorker, LazyWorker, Worker},
 };
 use tokio::runtime::Builder;
 
 use crate::{setup::*, signal_handler};
-use tikv_util::worker::LazyWorker;
 
 /// Run a TiKV server. Returns when the server is shutdown by the user, in which
 /// case the server will be properly stopped.
@@ -110,11 +115,8 @@ pub fn run_tikv(config: TiKvConfig) {
 
     macro_rules! run_impl {
         ($ER: ty) => {{
-            let enable_io_snoop = config.enable_io_snoop;
             let mut tikv = TiKVServer::<$ER>::init(config);
-            if enable_io_snoop {
-                tikv.init_io_snooper();
-            }
+            let fetcher = tikv.init_io_utility();
             tikv.check_conflict_addr();
             tikv.init_fs();
             tikv.init_yatp();
@@ -124,7 +126,7 @@ pub fn run_tikv(config: TiKvConfig) {
             let gc_worker = tikv.init_gc_worker();
             let server_config = tikv.init_servers(&gc_worker);
             tikv.register_services();
-            tikv.init_metrics_flusher();
+            tikv.init_metrics_flusher(fetcher);
             tikv.run_server(server_config);
             tikv.run_status_server();
 
@@ -142,6 +144,8 @@ pub fn run_tikv(config: TiKvConfig) {
 
 const RESERVED_OPEN_FDS: u64 = 1000;
 
+const DEFAULT_METRICS_FLUSH_INTERVAL: Duration = Duration::from_millis(10_000);
+
 /// A complete TiKV server.
 struct TiKVServer<ER: RaftEngine> {
     config: TiKvConfig,
@@ -841,25 +845,36 @@ impl<ER: RaftEngine> TiKVServer<ER> {
         }
     }
 
-    fn init_metrics_flusher(&mut self) {
-        let mut metrics_flusher = Box::new(MetricsFlusher::new(
-            self.engines.as_ref().unwrap().engines.clone(),
-        ));
-
-        // Start metrics flusher
-        if let Err(e) = metrics_flusher.start() {
-            error!(%e; "failed to start metrics flusher");
-        }
-
-        self.to_stop.push(metrics_flusher);
+    fn init_io_utility(&mut self) -> BytesFetcher {
+        let io_snooper_on = self.config.enable_io_snoop
+            && file_system::init_io_snooper()
+                .map_err(|e| error!(%e; "failed to init io snooper"))
+                .is_ok();
+        let (fetcher, limiter) = if io_snooper_on {
+            (BytesFetcher::FromIOSnooper(), IORateLimiter::new(0, false))
+        } else {
+            let limiter = IORateLimiter::new(0, true);
+            (BytesFetcher::FromRateLimiter(limiter.statistics()), limiter)
+        };
+        set_io_rate_limiter(Some(Arc::new(limiter)));
+        fetcher
     }
 
-    fn init_io_snooper(&mut self) {
-        if let Err(e) = file_system::init_io_snooper() {
-            error!(%e; "failed to init io snooper");
-        } else {
-            info!("init io snooper successfully"; "pid" => nix::unistd::getpid().to_string());
-        }
+    fn init_metrics_flusher(&mut self, fetcher: BytesFetcher) {
+        let handle = self.background_worker.clone_raw_handle();
+        let mut interval = GLOBAL_TIMER_HANDLE
+            .interval(Instant::now(), DEFAULT_METRICS_FLUSH_INTERVAL)
+            .compat();
+        let mut engine_metrics =
+            EngineMetricsManager::new(self.engines.as_ref().unwrap().engines.clone());
+        let mut io_metrics = IOMetricsManager::new(fetcher);
+        handle.spawn(async move {
+            while let Some(Ok(_)) = interval.next().await {
+                let now = Instant::now();
+                engine_metrics.flush(now);
+                io_metrics.flush(now);
+            }
+        });
     }
 
     fn run_server(&mut self, server_config: Arc<ServerConfig>) {
@@ -1146,12 +1161,6 @@ where
     }
 }
 
-impl<ER: RaftEngine> Stop for MetricsFlusher<RocksEngine, ER> {
-    fn stop(mut self: Box<Self>) {
-        (*self).stop()
-    }
-}
-
 impl Stop for Worker {
     fn stop(self: Box<Self>) {
         Worker::stop(&self);
@@ -1163,3 +1172,29 @@ impl<T: fmt::Display + Send + 'static> Stop for LazyWorker<T> {
         self.stop_worker();
     }
 }
+
+const DEFAULT_ENGINE_METRICS_RESET_INTERVAL: Duration = Duration::from_millis(60_000);
+
+pub struct EngineMetricsManager<R: RaftEngine> {
+    engines: Engines<RocksEngine, R>,
+    last_reset: Instant,
+}
+
+impl<R: RaftEngine> EngineMetricsManager<R> {
+    pub fn new(engines: Engines<RocksEngine, R>) -> Self {
+        EngineMetricsManager {
+            engines,
+            last_reset: Instant::now(),
+        }
+    }
+
+    pub fn flush(&mut self, now: Instant) {
+        self.engines.kv.flush_metrics("kv");
+        self.engines.raft.flush_metrics("raft");
+        if now.duration_since(self.last_reset) >= DEFAULT_ENGINE_METRICS_RESET_INTERVAL {
+            self.engines.kv.reset_statistics();
+            self.engines.raft.reset_statistics();
+            self.last_reset = now;
+        }
+    }
+}
diff --git a/components/encryption/src/encrypted_file/mod.rs b/components/encryption/src/encrypted_file/mod.rs
@@ -16,7 +16,7 @@ use crate::Result;
 mod header;
 pub use header::*;
 
-pub const TMP_FILE_SUFFIX: &str = ".tmp";
+pub const TMP_FILE_SUFFIX: &str = "tmp";
 
 /// An file encrypted by master key.
 pub struct EncryptedFile<'a> {

diff --git a/components/encryption/src/file_dict_file.rs b/components/encryption/src/file_dict_file.rs
@@ -111,11 +111,16 @@ impl FileDictionaryFile {
         Ok((file_dict_file, file_dict))
     }
 
+    /// Get the file path.
+    pub fn file_path(&self) -> PathBuf {
+        self.base.join(&self.name)
+    }
+
     /// Rewrite the log file to reduce file size and reduce the time of next recovery.
     fn rewrite(&mut self) -> Result<()> {
         let file_dict_bytes = self.file_dict.write_to_bytes()?;
         if self.enable_log {
-            let origin_path = self.base.join(&self.name);
+            let origin_path = self.file_path();
             let mut tmp_path = origin_path.clone();
             tmp_path.set_extension(format!("{}.{}", thread_rng().next_u64(), TMP_FILE_SUFFIX));
             let mut tmp_file = OpenOptions::new()
@@ -146,9 +151,7 @@ impl FileDictionaryFile {
 
     /// Recovery from the log file and return `FileDictionary`.
     pub fn recovery(&mut self) -> Result<FileDictionary> {
-        let mut f = OpenOptions::new()
-            .read(true)
-            .open(self.base.join(&self.name))?;
+        let mut f = OpenOptions::new().read(true).open(self.file_path())?;
         let mut buf = Vec::new();
         f.read_to_end(&mut buf)?;
         let remained = buf.as_slice();
@@ -187,6 +190,7 @@ impl FileDictionaryFile {
                 }
             }
         }
+
         self.file_dict = file_dict.clone();
         Ok(file_dict)
     }

diff --git a/components/encryption/src/manager/mod.rs b/components/encryption/src/manager/mod.rs
@@ -118,6 +118,13 @@ impl Dicts {
                 info!("encryption: none of key dictionary and file dictionary are found.");
                 Ok(None)
             }
+            (Ok((file_dict_file, file_dict)), Err(Error::Io(key_err)))
+                if key_err.kind() == ErrorKind::NotFound && file_dict.files.is_empty() =>
+            {
+                std::fs::remove_file(file_dict_file.file_path())?;
+                info!("encryption: file dict is empty and none of key dictionary are found.");
+                Ok(None)
+            }
             // ...else, return either error.
             (file_dict_file, key_bytes) => {
                 if let Err(key_err) = key_bytes {
@@ -730,6 +737,10 @@ mod tests {
     };
     use tempfile::TempDir;
 
+    lazy_static::lazy_static! {
+        static ref LOCK_FOR_GAUGE: Mutex<i32> = Mutex::new(1);
+    }
+
     fn new_mock_backend() -> Box<MockBackend> {
         Box::new(MockBackend::default())
     }
@@ -849,6 +860,8 @@ mod tests {
     // If master_key is the wrong key, fallback to previous_master_key.
     #[test]
     fn test_key_manager_rotate_master_key() {
+        let mut _guard = LOCK_FOR_GAUGE.lock().unwrap();
+
         // create initial dictionaries.
         let tmp_dir = tempfile::TempDir::new().unwrap();
         let manager = new_key_manager_def(&tmp_dir, None).unwrap();
@@ -879,6 +892,7 @@ mod tests {
 
     #[test]
     fn test_key_manager_rotate_master_key_rewrite_failure() {
+        let _guard = LOCK_FOR_GAUGE.lock().unwrap();
         // create initial dictionaries.
         let tmp_dir = tempfile::TempDir::new().unwrap();
         let manager = new_key_manager_def(&tmp_dir, None).unwrap();
@@ -1187,4 +1201,28 @@ mod tests {
         }
         assert!(count >= 101);
     }
+
+    #[test]
+    fn test_master_key_failure_and_succeed() {
+        let _guard = LOCK_FOR_GAUGE.lock().unwrap();
+        let tmp_dir = tempfile::TempDir::new().unwrap();
+
+        let wrong_key = Box::new(MockBackend {
+            is_wrong_master_key: true,
+            encrypt_fail: true,
+            ..MockBackend::default()
+        });
+        let right_key = Box::new(MockBackend {
+            is_wrong_master_key: true,
+            encrypt_fail: false,
+            ..MockBackend::default()
+        });
+        let previous = Box::new(PlaintextBackend::default()) as Box<dyn Backend>;
+
+        let result = new_key_manager(&tmp_dir, None, wrong_key, &*previous);
+        // When the master key is invalid, the key manager left a empty file dict and return errors.
+        assert!(result.is_err());
+        let result = new_key_manager(&tmp_dir, None, right_key, &*previous);
+        assert!(result.is_ok());
+    }
 }