Don't use extra port for replication #200

ShooterIT · 2021-03-09T10:52:22Z

As we know, in order to implement replication, kvrocks has specified replication threads that use 'port + 1' as listening port, that may make somethings easy, but it is not convenient to deploy and maintain.

In this PR, kvrocks doesn't use extra port for replication instead of processing replication requests in general work threads.

Some problems we need to care
Blocking
We can't block work threads for processing replication requests. Currently, for replication commands, _fetch_meta and _fetch_file are blocked, in this commit, we process both commands in new threads.

Compatibility
We provide a configuration item master-use-repl-port to make new version able to replicate old version master. If the masters are old version, you should set master-use-repl-port to yes, otherwise you should set it to no.

src/redis_cmd.cc

karelrooted

LGTM

git-hulk

Awesome!

src/redis_cmd.cc

ShooterIT · 2021-03-11T08:48:21Z

hi @git-hulk @karelrooted Currently i just implement basic function. we still need to test its performance although it doesn't have damage.
We still have some other jobs for this change, such as clean up jobs, replication connection identifier, ... But i don't want to make one PR too complicated to review, it is hard to find problems. We can fix these in other PRs.

git-hulk · 2021-03-11T08:55:58Z

It should be fine to merge the PR since the master was defined as unstable now. Also, those problems weren't imported by this PR so makes a lot of sense to file a new PR.

Currently, master creates a new snapshot by backup function of rocksdb when replicas ask for full replication. Although, rocksdb backup function can support incremental backup, it still cost much time, disk bandwidth, and disk space if there is no based backup or current backup exists for long time, and backup function really needs copy real files. If kvrocks holds a huge amount of data, copying data files would influence the writing or reading of rocksdb that results in latency of user request. As we know, the function checkpoint of rocksdb would hard link data files instead of copying files, that has minor resource consumption, so we wish using checkpoint to implement data snapshot when full replication to avoid influencing user requests. The followings are some the details of solution. * Master - As similar with full replication by using backup, master will try to create a new checkpoint if there is no checkpoint when replicas ask full replication, and tell replicas all files name list of checkpoint. - We can create a new checkpoint for every replica, but there will be many checkpoints that may occupy disk space because hard link could make deleted files existing during copying file period. So we hope to use one shared checkpoint, but replicas still wait for next checkpoint if current checkpoint is too old to satisfy feeding new coming replicas. - How to clean checkpoint, it is much difficult because replicas fetch data info and data files by different connections and threads. So i use current fetching files threads number as checkpoint reference count, master will delete checkpoint if there no thread to fetch files during several seconds. To avoid that a slow replica always keep the checkpoint alive too long time, master still delete the checkpoint if it is very long time (1 day) from creating checkpoint. * Replica - Replicas still fetch checkpoint files like before, after fetching all files, replicas will destroy old db, rename synced checkpoint dir to db dir and reload this checkpoint. To avoid loading wrong checkpoint causes db broken, we rename old db dir, and restore it if the synced checkpoint is corrupt to fail to open. - How to resume broken transfer based files. We store synced files into a specified dir (synced_checkpoint). Firstly, because we don't get file CRC by checkpoint to identify unique files, so we must clean the synced_checkpoint dir when changing master. Secondly, we need to clean up invalid files of master old checkpoint when we check existing files of master new checkpoint for skipping some transferred files. For backward compatibility, as #200, kvrocks still support to replicate data from old version master if you set 'master-use-repl-port' to yes.

ShooterIT added enhancement type enhancement major decision Requires project management committee consensus labels Mar 9, 2021

ShooterIT requested review from karelrooted and git-hulk March 9, 2021 10:52

ShooterIT added the release notes label Mar 9, 2021

ShooterIT marked this pull request as ready for review March 9, 2021 11:25

karelrooted reviewed Mar 10, 2021

View reviewed changes

src/redis_cmd.cc Outdated Show resolved Hide resolved

Don't use extra port for replication

83a73bf

karelrooted previously approved these changes Mar 10, 2021

View reviewed changes

git-hulk reviewed Mar 10, 2021

View reviewed changes

src/redis_cmd.cc Show resolved Hide resolved

src/redis_cmd.cc Outdated Show resolved Hide resolved

Base automatically changed from master to unstable March 10, 2021 13:22

Fix some CRs

9e1d8ec

ShooterIT dismissed karelrooted’s stale review via 9e1d8ec March 11, 2021 08:20

ShooterIT closed this Mar 11, 2021

ShooterIT deleted the no-repl-port branch March 11, 2021 08:25

ShooterIT restored the no-repl-port branch March 11, 2021 08:25

ShooterIT reopened this Mar 11, 2021

git-hulk approved these changes Mar 11, 2021

View reviewed changes

ShooterIT requested a review from karelrooted March 11, 2021 09:05

ShooterIT merged commit 62df766 into apache:unstable Mar 12, 2021

ShooterIT deleted the no-repl-port branch March 12, 2021 03:14

ShooterIT mentioned this pull request Mar 19, 2021

Use checkpoint to implement data snapshot when full replication #205

Merged

ShooterIT mentioned this pull request Apr 23, 2021

Release 2.0.0 #235

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Don't use extra port for replication #200

Don't use extra port for replication #200

ShooterIT commented Mar 9, 2021 •

edited

Loading

karelrooted left a comment

git-hulk left a comment

ShooterIT commented Mar 11, 2021

git-hulk commented Mar 11, 2021

Don't use extra port for replication #200

Don't use extra port for replication #200

Conversation

ShooterIT commented Mar 9, 2021 • edited Loading

karelrooted left a comment

Choose a reason for hiding this comment

git-hulk left a comment

Choose a reason for hiding this comment

ShooterIT commented Mar 11, 2021

git-hulk commented Mar 11, 2021

ShooterIT commented Mar 9, 2021 •

edited

Loading