Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: db cannot resume after disk has no space #1765

Merged
merged 7 commits into from
Jul 26, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 16 additions & 0 deletions conf/pika.conf
Original file line number Diff line number Diff line change
Expand Up @@ -217,6 +217,22 @@ slave-priority : 100
# [NOTICE]: compact-interval is prior than compact-cron.
#compact-interval :

# The minimum disk usage ratio for checking resume.
# If the disk usage ratio is lower than min-check-resume-ratio, it will not check resume, only higher will check resume.
# Its default value is 0.7.
#min-check-resume-ratio : 0.7

# The minimum free disk space to trigger db resume.
# If the db has a background error, only the free disk size is larger than this configuration can trigger manually resume db.
# Its default value is 256MB.
# [NOTICE]: least-free-disk-resume-size should not smaller than write-buffer-size!
#least-free-disk-resume-size : 256M

# Manually trying to resume db interval is configured by manually-resume-interval.
# If db has a background error, it will try to manually call resume() to resume db if satisfy the least free disk to resume.
# Its default value is 60 seconds.
#manually-resume-interval : 60

# This window-size determines the amount of data that can be transmitted in a single synchronization process.
# [Tip] In the scenario of high network latency. Increasing this size can improve synchronization efficiency.
# Its default value is 9000. the [maximum] value is 90000.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the provided code patch, here are some observations and suggestions:

  1. The added configurations least-free-disk-resume-size and manually-resume-interval seem appropriate in terms of functionality. However, it's important to ensure that these values are suitable for your specific use case and environment.

  2. For the least-free-disk-resume-size configuration, the value of 268435456 bytes corresponds to approximately 256MB, as mentioned in the comment. If this value aligns with your requirements, there doesn't appear to be any bug or issue.

  3. Similarly, for the manually-resume-interval configuration, the default value of 600 seconds (10 minutes) seems reasonable. Again, make sure it suits your needs.

  4. Reviewing the rest of the code patch, there don't appear to be any bugs or risks. However, it's difficult to provide a comprehensive review without seeing the context or the surrounding code.

  5. If you have any specific concerns or further details regarding the code, please provide them for a more thorough review.

Remember to thoroughly test the changes and monitor the system after applying the patch to ensure it behaves as expected in your specific setup.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are some observations and suggestions for the code patch:

  1. The commented out section "#compact-interval :" indicates that the configuration option is not being used. If it's necessary, make sure to uncomment it and provide a value.

  2. In the added configurations related to disk usage and resume, ensure that the values provided are appropriate for your system requirements and environment.

  3. The comment "[NOTICE]: least-free-disk-resume-size should not be smaller than write-buffer-size!" highlights an important consideration. Make sure to set the "least-free-disk-resume-size" value accordingly to avoid potential issues.

  4. Review the default values for "min-check-resume-ratio" (0.7) and "manually-resume-interval" (60 seconds) to confirm if they align with your desired behavior.

  5. In the last comment about "window-size," consider reviewing the current value (9000) and determining if increasing it would be beneficial for your specific network latency scenario.

  6. Overall, code reviews often require context and cannot be done solely based on the provided code snippet. It's recommended to review the entire codebase, including dependencies and integrated components, to identify potential bugs, risks, and improvement opportunities more accurately.

Remember to thoroughly test any changes made and consider seeking feedback from other developers familiar with the project for a comprehensive review.

Expand Down
30 changes: 30 additions & 0 deletions include/pika_conf.h
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,18 @@ class PikaConf : public pstd::BaseConf {
std::shared_lock l(rwlock_);
return compact_interval_;
}
int64_t least_resume_free_disk_size() {
std::shared_lock l(rwlock_);
return least_free_disk_to_resume_;
}
int64_t resume_interval() {
std::shared_lock l(rwlock_);
return resume_check_interval_;
}
double min_check_resume_ratio() {
std::shared_lock l(rwlock_);
return min_check_resume_ratio_;
}
int64_t write_buffer_size() {
std::shared_lock l(rwlock_);
return write_buffer_size_;
Expand Down Expand Up @@ -477,6 +489,21 @@ class PikaConf : public pstd::BaseConf {
TryPushDiffCommands("compact-interval", value);
compact_interval_ = value;
}
void SetLeastResumeFreeDiskSize(const int64_t& value) {
std::lock_guard l(rwlock_);
TryPushDiffCommands("least-free-disk-resume-size", std::to_string(value));
least_free_disk_to_resume_ = value;
}
void SetResumeInterval(const int64_t& value) {
std::lock_guard l(rwlock_);
TryPushDiffCommands("manually-resume-interval", std::to_string(value));
resume_check_interval_ = value;
}
void SetMinCheckResumeRatio(const double& value) {
std::lock_guard l(rwlock_);
TryPushDiffCommands("min-check-resume-ratio", std::to_string(value));
min_check_resume_ratio_ = value;
}
void SetSyncWindowSize(const int& value) {
TryPushDiffCommands("sync-window-size", std::to_string(value));
sync_window_size_.store(value);
Expand Down Expand Up @@ -546,6 +573,9 @@ class PikaConf : public pstd::BaseConf {
int db_sync_speed_ = 0;
std::string compact_cron_;
std::string compact_interval_;
int64_t resume_check_interval_ = 60; // seconds
int64_t least_free_disk_to_resume_ = 268435456; // 256 MB
double min_check_resume_ratio_ = 0.7;
int64_t write_buffer_size_ = 0;
int64_t arena_block_size_ = 0;
int64_t slotmigrate_thread_num_ = 0;
Expand Down
6 changes: 6 additions & 0 deletions include/pika_server.h
Original file line number Diff line number Diff line change
Expand Up @@ -504,6 +504,7 @@ class PikaServer : public pstd::noncopyable {
* TimingTask use
*/
void DoTimingTask();
void AutoResumeDB();
void AutoCompactRange();
void AutoPurge();
void AutoDeleteExpiredDump();
Expand Down Expand Up @@ -534,6 +535,11 @@ class PikaServer : public pstd::noncopyable {
bool have_scheduled_crontask_ = false;
struct timeval last_check_compact_time_;

/*
* ResumeDB used
*/
struct timeval last_check_resume_time_;

/*
* Communicate with the client used
*/
Expand Down
19 changes: 19 additions & 0 deletions src/pika_conf.cc
Original file line number Diff line number Diff line change
Expand Up @@ -354,6 +354,22 @@ int PikaConf::Load() {
}
}

// least-free-disk-resume-size
GetConfInt64Human("least-free-disk-resume-size", &least_free_disk_to_resume_);
if (least_free_disk_to_resume_ <= 0) {
least_free_disk_to_resume_ = 268435456; // 256Mb
}

GetConfInt64("manually-resume-interval", &resume_check_interval_);
if (resume_check_interval_ <= 0) {
resume_check_interval_ = 60; // seconds
}

GetConfDouble("min-check-resume-ratio", &min_check_resume_ratio_);
if (min_check_resume_ratio_ < 0) {
min_check_resume_ratio_ = 0.7;
}

// write_buffer_size
GetConfInt64Human("write-buffer-size", &write_buffer_size_);
if (write_buffer_size_ <= 0) {
Expand Down Expand Up @@ -621,6 +637,9 @@ int PikaConf::ConfigRewrite() {
SetConfInt("db-sync-speed", db_sync_speed_);
SetConfStr("compact-cron", compact_cron_);
SetConfStr("compact-interval", compact_interval_);
SetConfInt64("least-free-disk-resume-size", least_free_disk_to_resume_);
SetConfInt64("manually-resume-interval", resume_check_interval_);
SetConfDouble("min-check-resume-ratio", min_check_resume_ratio_);
SetConfInt("slave-priority", slave_priority_);
SetConfInt("sync-window-size", sync_window_size_.load());
SetConfInt("consensus-level", consensus_level_.load());
Expand Down
54 changes: 54 additions & 0 deletions src/pika_server.cc
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,7 @@ PikaServer::PikaServer()
: exit_(false),
slot_state_(INFREE),
last_check_compact_time_({0, 0}),
last_check_resume_time_({0, 0}),
repl_state_(PIKA_REPL_NO_CONNECT),
role_(PIKA_ROLE_SINGLE) {
// Init server ip host
Expand Down Expand Up @@ -1295,6 +1296,8 @@ void PikaServer::PubSubNumSub(const std::vector<std::string>& channels,
/******************************* PRIVATE *******************************/

void PikaServer::DoTimingTask() {
// Resume DB if satisfy the condition
AutoResumeDB();
// Maybe schedule compactrange
AutoCompactRange();
// Purge log
Expand Down Expand Up @@ -1487,6 +1490,57 @@ void PikaServer::AutoKeepAliveRSync() {
}
}

void PikaServer::AutoResumeDB() {
struct statfs disk_info;
int ret = statfs(g_pika_conf->db_path().c_str(), &disk_info);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see it uses statfs(deprecated) on the other one. I perfer to use statvfs based on https://www.mkssoftware.com/docs/man3/statfs.3.asp

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@infdahai Could you please help me review the code? #1822 I have made relevant changes to your suggestion

if (ret == -1) {
LOG(WARNING) << "statfs error: " << strerror(errno);
return;
}

int64_t interval = g_pika_conf->resume_interval();
int64_t least_free_size = g_pika_conf->least_resume_free_disk_size();
double min_check_resume_ratio = g_pika_conf->min_check_resume_ratio();
uint64_t free_size = disk_info.f_bsize * disk_info.f_bfree;
uint64_t total_size = disk_info.f_bsize * disk_info.f_blocks;
double disk_use_ratio = 1.0 - static_cast<double>(free_size) / static_cast<double>(total_size);

struct timeval now;
gettimeofday(&now, nullptr);
// first check or time interval between now and last check is larger than variable "interval"
if (last_check_resume_time_.tv_sec == 0 || now.tv_sec - last_check_resume_time_.tv_sec >= interval) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

在这个 if 上面加上英文注释,体现 ”第一次检验 或者 距离上次 check 时间超过 interval“ 这个 condition

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

第二,我建议先检测下当前磁盘总体使用情况,整体空闲率低于某个阈值【如 30%】时,才进来执行 check 操作,以减少加锁次数

gettimeofday(&last_check_resume_time_, nullptr);
if (disk_use_ratio < min_check_resume_ratio || free_size < least_free_size){
return;
}

std::map<std::string, uint64_t> background_errors;
std::shared_lock db_rwl(g_pika_server->dbs_rw_);
// loop every db
for (const auto& db_item : g_pika_server->dbs_) {
Mixficsol marked this conversation as resolved.
Show resolved Hide resolved
if (!db_item.second) {
continue;
}
std::shared_lock slot_rwl(db_item.second->slots_rw_);
// loop every slot
for (const auto& slot_item : db_item.second->slots_) {
Mixficsol marked this conversation as resolved.
Show resolved Hide resolved
background_errors.clear();
slot_item.second->DbRWLockReader();
slot_item.second->db()->GetUsage(storage::PROPERTY_TYPE_ROCKSDB_BACKGROUND_ERRORS, &background_errors);
slot_item.second->DbRWUnLock();
for (const auto& item : background_errors) {
if (item.second != 0) {
rocksdb::Status s = slot_item.second->db()->GetDBByType(item.first)->Resume();
if (!s.ok()) {
LOG(WARNING) << s.ToString();
}
}
}
}
}
}
}

void PikaServer::AutoUpdateNetworkMetric() {
monotime current_time = getMonotonicUs();
size_t factor = 5e6; // us, 5s
Expand Down