
Don't use extra port for replication #200

Merged
merged 2 commits into apache:unstable from the no-repl-port branch
Mar 12, 2021

Conversation

ShooterIT
Member

@ShooterIT ShooterIT commented Mar 9, 2021

As we know, in order to implement replication, kvrocks has dedicated replication threads that listen on 'port + 1'. That may make some things easier, but it is not convenient to deploy and maintain.

In this PR, kvrocks no longer uses an extra port for replication; instead, replication requests are processed in the general worker threads.

Some problems we need to take care of:
Blocking
We can't block worker threads while processing replication requests. Currently, the replication commands _fetch_meta and _fetch_file are blocking; in this commit, we process both commands in new threads.
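As a rough illustration of that dispatch (a minimal sketch; the names and structure here are ours, not kvrocks's actual code), a worker thread can hand a blocking replication command off to its own thread:

```cpp
#include <atomic>
#include <thread>

// Stand-in for the long-running work of _fetch_file: streaming a data
// file to the replica. Here it just flips a flag.
std::atomic<bool> file_sent{false};

void FetchFileHandler() {
  file_sent = true;  // pretend the file was streamed
}

// Called from a general worker thread: run the blocking handler in a
// fresh thread so the worker can go back to serving normal commands.
std::thread DispatchBlockingReplCommand() {
  return std::thread(FetchFileHandler);
}
```

In the real server the spawned thread would own the replication connection until the transfer finishes; returning the `std::thread` here just keeps the sketch deterministic and testable.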

Compatibility
We provide a configuration item master-use-repl-port so that a new-version replica can still replicate from an old-version master. If the master runs an old version, set master-use-repl-port to yes; otherwise set it to no.
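For example (a hypothetical kvrocks.conf excerpt; the option name comes from this PR, the comment lines are illustrative):

```
# The master runs an old kvrocks version that still listens for
# replication on port + 1, so keep using that port:
master-use-repl-port yes
```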

@ShooterIT ShooterIT added labels: enhancement, major decision (Requires project management committee consensus) Mar 9, 2021
@ShooterIT ShooterIT marked this pull request as ready for review March 9, 2021 11:25
karelrooted
karelrooted previously approved these changes Mar 10, 2021
Contributor

@karelrooted karelrooted left a comment

LGTM

Member

@git-hulk git-hulk left a comment


Awesome!

Base automatically changed from master to unstable March 10, 2021 13:22
@ShooterIT ShooterIT closed this Mar 11, 2021
@ShooterIT ShooterIT deleted the no-repl-port branch March 11, 2021 08:25
@ShooterIT ShooterIT restored the no-repl-port branch March 11, 2021 08:25
@ShooterIT ShooterIT reopened this Mar 11, 2021
@ShooterIT
Member Author

hi @git-hulk @karelrooted Currently I have only implemented the basic functionality; we still need to test its performance, although it shouldn't cause any harm.
We still have some follow-up work for this change, such as cleanup jobs, a replication connection identifier, and so on. But I don't want to make one PR too complicated to review, since that makes it hard to find problems. We can address these in other PRs.

@git-hulk
Member

It should be fine to merge the PR since master is now defined as unstable. Also, those problems weren't introduced by this PR, so it makes a lot of sense to address them in a new PR.

@ShooterIT ShooterIT requested a review from karelrooted March 11, 2021 09:05
@ShooterIT ShooterIT merged commit 62df766 into apache:unstable Mar 12, 2021
@ShooterIT ShooterIT deleted the no-repl-port branch March 12, 2021 03:14
ShooterIT added a commit that referenced this pull request Mar 22, 2021
Currently, the master creates a new snapshot via rocksdb's backup function when a replica asks for full replication. Although rocksdb's backup function supports incremental backup, it still costs a lot of time, disk bandwidth, and disk space if there is no base backup or the current backup has existed for a long time, and the backup function really needs to copy real files. If kvrocks holds a huge amount of data, copying data files would affect rocksdb's reads and writes, resulting in latency for user requests. As we know, rocksdb's checkpoint function hard-links data files instead of copying them, which has minor resource consumption, so we want to use checkpoints to implement the data snapshot for full replication, to avoid affecting user requests. The following are some details of the solution.

* Master
- Similar to full replication using backup, the master will try to create a new checkpoint if there is none when a replica asks for full replication, and will tell the replica the full file list of the checkpoint.
- We could create a new checkpoint for every replica, but many checkpoints may occupy a lot of disk space, because hard links keep deleted files alive during the file-copying period. So we want to use one shared checkpoint; however, a replica still waits for the next checkpoint if the current one is too old to feed newly arriving replicas.
- Cleaning up the checkpoint is harder because replicas fetch the data info and the data files over different connections and threads. So I use the number of threads currently fetching files as the checkpoint's reference count: the master deletes the checkpoint if no thread has fetched files for several seconds. To avoid a slow replica keeping the checkpoint alive for too long, the master also deletes the checkpoint once a very long time (1 day) has passed since it was created.
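The cleanup rule above can be sketched as a small decision function (hypothetical names and an assumed idle limit; kvrocks's actual bookkeeping differs):

```cpp
#include <cstdint>

// The number of threads currently fetching checkpoint files acts as the
// checkpoint's reference count; idle_secs counts how long that count has
// been zero, and age_secs counts time since the checkpoint was created.
struct CheckpointInfo {
  int fetching_threads;
  int64_t idle_secs;
  int64_t age_secs;
};

bool ShouldDeleteCheckpoint(const CheckpointInfo &c,
                            int64_t idle_limit = 5,        // "several seconds" (assumed value)
                            int64_t max_age = 24 * 3600) { // one day
  // A very old checkpoint is deleted even if a slow replica still holds it.
  if (c.age_secs >= max_age) return true;
  // Otherwise it is deleted only once it has been unreferenced long enough.
  return c.fetching_threads == 0 && c.idle_secs >= idle_limit;
}
```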

* Replica
- Replicas still fetch the checkpoint files as before. After fetching all files, a replica destroys its old db, renames the synced checkpoint dir to the db dir, and reloads this checkpoint. To avoid breaking the db by loading a bad checkpoint, we rename the old db dir aside and restore it if the synced checkpoint is corrupt and fails to open.
- To resume a broken file transfer, we store synced files in a dedicated dir (synced_checkpoint). Firstly, because the checkpoint doesn't give us a file CRC to identify files uniquely, we must clean the synced_checkpoint dir when changing master. Secondly, while checking which files of the master's new checkpoint already exist (so already-transferred files can be skipped), we need to clean up invalid files left over from the master's old checkpoint.
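The replica-side swap-and-rollback can be sketched as follows (a minimal sketch with hypothetical names; `try_open_db` stands in for reopening rocksdb on the promoted dir):

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Promote the fully synced checkpoint dir to be the db dir, keeping the
// old db aside so it can be restored if the checkpoint fails to open.
bool SwapInSyncedCheckpoint(const fs::path &db_dir,
                            const fs::path &synced_ckpt_dir,
                            bool (*try_open_db)(const fs::path &)) {
  const fs::path backup_dir = db_dir.string() + ".old";
  fs::rename(db_dir, backup_dir);         // move the old db aside
  fs::rename(synced_ckpt_dir, db_dir);    // promote the synced checkpoint
  if (!try_open_db(db_dir)) {
    fs::rename(db_dir, synced_ckpt_dir);  // checkpoint corrupt: undo the swap
    fs::rename(backup_dir, db_dir);       // restore the old db
    return false;
  }
  fs::remove_all(backup_dir);             // success: drop the old db
  return true;
}
```

The rollback path is why the old db dir is renamed rather than deleted up front: until the checkpoint opens successfully, the old data is the only known-good copy.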

For backward compatibility, as in #200, kvrocks still supports replicating data from an old-version master if you set 'master-use-repl-port' to yes.
@ShooterIT ShooterIT mentioned this pull request Apr 23, 2021