raft: always keep un-applied logs in memory #199

Closed
Fullstop000 opened this issue Mar 15, 2019 · 9 comments

@Fullstop000
Member

Fullstop000 commented Mar 15, 2019

In the current implementation, when we want to apply committed entries, we first try to retrieve them from raft_log. By that time, it's very likely that the entries we want are already stored in the Storage, and fetching them from there can be inefficient (especially when the Storage is based on something like RocksDB).

The issue described above generally occurs when committed_index advances more slowly than the raft log grows, so I think it's a common situation that we need to improve.

Therefore, we could always keep un-applied logs in memory (in the unstable), even after we have stored them in the Storage.

@siddontang
Contributor

@Fullstop000

In TiKV, we use an external cache to avoid fetching logs from Storage. But if we can do it in the raft lib directly, it may be more generally useful.

The thing we need to care about here is the memory usage of this cache: we can't keep too many entries in it, or we risk OOM.

@Fullstop000
Member Author

Fullstop000 commented Mar 17, 2019

Avoiding OOM is exactly what I'm considering. I noticed #131, which describes a protection mechanism for the leader to prevent unlimited raft log growth, and it inspired me a little.

An intuitive approach for me is to add some configuration limits, e.g. max_entry_count_in_memory and max_entry_size_in_memory. But I've started to realize that there might be a more attractive feature we can have: a throttle for the leader.

When the follower's in-memory raft logs reach the memory limit, it should send a message (e.g. MsgMemoryLimit) back to the leader to indicate that it cannot apply entries as fast as they arrive over the network. Once the leader receives a MsgMemoryLimit from a follower, it should throttle back the rate at which it appends to that follower.

For a detailed design of leader throttle:

  1. The MsgMemoryLimit includes the follower's last index to help the leader throttle it back (just like the reject_hint).
  2. Once the leader receives MsgMemoryLimit, it records in memory which follower needs to be throttled. On each Tick, the leader then checks whether it needs to pause log replication to that follower. The pause is measured in Ticks and should be configurable (e.g. throttle_pausing_ticks).
  3. After the pause, the leader resumes sending raft logs to the follower.
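
The three steps above can be sketched roughly as follows. This is only an illustration under assumptions: `MsgMemoryLimit`, `LeaderThrottle`, and `throttle_pausing_ticks` are the hypothetical names from this proposal, not existing raft lib types, and resuming from `last_index + 1` is my guess at how the hint would be used.

```rust
use std::collections::HashMap;

/// Hypothetical message from a follower that hit its memory limit.
/// Not an existing message type in the raft lib.
struct MsgMemoryLimit {
    from: u64,
    last_index: u64,
}

/// Per-follower pause state kept by the leader.
struct Throttled {
    remaining_ticks: u32,
    resume_from: u64,
}

/// Hypothetical leader-side bookkeeping for throttled followers.
struct LeaderThrottle {
    throttle_pausing_ticks: u32,
    paused: HashMap<u64, Throttled>,
}

impl LeaderThrottle {
    fn new(throttle_pausing_ticks: u32) -> Self {
        LeaderThrottle {
            throttle_pausing_ticks,
            paused: HashMap::new(),
        }
    }

    /// Steps 1 and 2: record the follower and where to resume from.
    /// Assumption: replication resumes right after the hinted last index.
    fn on_memory_limit(&mut self, msg: &MsgMemoryLimit) {
        self.paused.insert(
            msg.from,
            Throttled {
                remaining_ticks: self.throttle_pausing_ticks,
                resume_from: msg.last_index + 1,
            },
        );
    }

    /// The leader skips replication to a follower while it is paused.
    fn is_paused(&self, follower: u64) -> bool {
        self.paused.contains_key(&follower)
    }

    /// Step 3: called once per Tick. Returns (follower id, next index)
    /// for every follower whose pause has just ended.
    fn tick(&mut self) -> Vec<(u64, u64)> {
        let mut resumed = Vec::new();
        self.paused.retain(|id, t| {
            t.remaining_ticks = t.remaining_ticks.saturating_sub(1);
            if t.remaining_ticks == 0 {
                resumed.push((*id, t.resume_from));
                false // pause is over, forget this follower
            } else {
                true
            }
        });
        resumed.sort();
        resumed
    }
}
```

The pause state lives only in the leader's memory, so a new leader would start with no throttled followers and would learn about them again from the next MsgMemoryLimit.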

The leader throttle approach might be somewhat aggressive here. For a simpler solution, we can just use configuration options, and if the in-memory raft logs reach the limit, the unstable should be compacted. @siddontang
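
A rough sketch of the configuration-only variant, to make the idea concrete. Everything here is hypothetical: `UnappliedCache`, `CacheLimits`, and the simplified `Entry` are illustrative stand-ins rather than raft lib types, and the sketch assumes entries are pushed with contiguous indexes. When a limit is exceeded, the oldest entries are dropped from memory, and a read that can't be served from memory is expected to fall back to the Storage.

```rust
use std::collections::VecDeque;

/// Simplified stand-in for a raft log entry (hypothetical).
#[derive(Clone, Debug, PartialEq)]
struct Entry {
    index: u64,
    data: Vec<u8>,
}

/// Hypothetical limits, named after the options suggested above.
struct CacheLimits {
    max_entry_count_in_memory: usize,
    max_entry_size_in_memory: usize,
}

/// Keeps un-applied entries in memory within the configured limits.
struct UnappliedCache {
    limits: CacheLimits,
    entries: VecDeque<Entry>,
    bytes: usize,
}

impl UnappliedCache {
    fn new(limits: CacheLimits) -> Self {
        UnappliedCache { limits, entries: VecDeque::new(), bytes: 0 }
    }

    /// Add an entry, then compact from the front until both limits hold.
    /// Dropped entries are assumed to be persisted in the Storage already.
    fn push(&mut self, entry: Entry) {
        self.bytes += entry.data.len();
        self.entries.push_back(entry);
        while self.entries.len() > self.limits.max_entry_count_in_memory
            || self.bytes > self.limits.max_entry_size_in_memory
        {
            match self.entries.pop_front() {
                Some(old) => self.bytes -= old.data.len(),
                None => break,
            }
        }
    }

    /// Drop everything up to and including the applied index.
    fn advance_applied(&mut self, applied: u64) {
        while let Some(front) = self.entries.front() {
            if front.index > applied {
                break;
            }
            self.bytes -= front.data.len();
            self.entries.pop_front();
        }
    }

    /// Entries in [low, high) if they are all in memory; otherwise None,
    /// in which case the caller would read from the Storage instead.
    fn slice(&self, low: u64, high: u64) -> Option<Vec<Entry>> {
        let first = self.entries.front()?.index;
        let last = self.entries.back()?.index;
        if low < first || high > last + 1 || low >= high {
            return None;
        }
        let start = (low - first) as usize;
        let count = (high - low) as usize;
        Some(self.entries.iter().skip(start).take(count).cloned().collect())
    }
}
```

The trade-off in this variant is that a slow applier eventually falls back to Storage reads again once the limit is hit, which is the case the leader throttle tries to avoid.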

/cc @Hoverbear

@BusyJay
Member

BusyJay commented Mar 18, 2019

This can be implemented in storage instead of the raft library. It's better to keep the library simple.

@Hoverbear
Contributor

@Fullstop000 Maybe you and I could make a new, more sophisticated storage module crate for Raft that does this? :)

@Fullstop000
Member Author

@Hoverbear if you have any ideas for improving this, I'd like to help with it.

@Hoverbear
Contributor

@Fullstop000 Do you think we could come up with a simple file based storage for demonstration purposes? Or maybe use something like SLED?

@Fullstop000
Member Author

@Hoverbear I'd prefer that we start with a file-based storage written from scratch, so that we can later easily add some best practices on how to use the raft lib. For now, the two existing examples might not be enough for some 'real-world' situations.

@Hoverbear
Contributor

That sounds great, @Fullstop000. :) Do you think you could take care of an initial PR, and I will make myself available to help however I can?

@Fullstop000
Member Author

Fullstop000 commented May 13, 2019

@Hoverbear I've been somewhat busy recently 😢. Maybe I could submit an initial PR with some ideas about the Storage in mid-June.
