Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(v2): metastore API refactoring #3679

Merged
merged 9 commits into from
Nov 15, 2024

Conversation

kolesnikovae
Copy link
Collaborator

@kolesnikovae kolesnikovae commented Nov 13, 2024

I decided to split #3652 into multiple PRs to avoid bad merge conflicts. In this PR, I refactor the FSM module and gRPC services. The core idea: strict raft command handler interface with storage/FSM injection (*bbolt.Tx for now). This required quite a few adjustments, but now we always perform all the operations in a single transaction and the state is explicitly provided to the command handler.

I tried to preserve as much as possible but decided to not re-write compactor tests as the implementation is changed significantly in #3652.

I'm going to test the PR more extensively (there almost no functional changes), but it is ready for review.

@kolesnikovae kolesnikovae force-pushed the chore/metastore-api-refactoring branch from e745e20 to 346e8c0 Compare November 13, 2024 12:50
@kolesnikovae kolesnikovae force-pushed the chore/metastore-api-refactoring branch from 4bcf224 to 264a970 Compare November 13, 2024 13:54
@kolesnikovae kolesnikovae marked this pull request as ready for review November 13, 2024 14:02
@kolesnikovae kolesnikovae requested review from korniltsev and a team as code owners November 13, 2024 14:02
@kolesnikovae kolesnikovae requested a review from aleks-p November 13, 2024 14:02
Copy link
Contributor

@aleks-p aleks-p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me, thank you for taking the time to do the refactoring.

The only part I am not sure about is how clear some of the new concepts are. In particular, we have the raft_node.Leader and raft_node.Follower types. Their main purpose is ensuring read consistency but their names can be misleading because they are active on both leader and follower nodes. There is also a strong connection between the two, with the follower being the only consumer of what the leader provides (ReadIndex). I wonder if merging them under a StateReader (or StateProvider) is possible and could make their purpose more clear.

Comment on lines +227 to +229
m.observer.OnLeader(m.cleanerService)
m.observer.OnLeader(m.dlqRecovery)
m.observer.OnLeader(m.placement)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, I think here we are registering the 3 services so perhaps the method should reflect that (e.g., Register or RegisterService, ...)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about RunOnLeader? We already have quite a few "registrations", and adding some more might not be helpful for the reader

Comment on lines +133 to +137
proposer *RaftProposer
client metastorev1.RaftNodeServiceClient
observer *raft_node.Observer
follower *raft_node.Follower
leader *raft_node.Leader
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, perhaps prefix all of these with raft to make their usage more clear (e.g., it is not obvious that m.observer refers to the raft state)

Copy link
Collaborator Author

@kolesnikovae kolesnikovae Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking about refactoring raft-related stuff to a separate module: now the metastore type has too many raft-smth members. Will do that in the PR I'm currently working on

return o
}

func (o *Observer) RegisterHandler(h StateHandler) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can probably be private

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RegisterHandler is actually the primary method. Additionally, I plan to extend the interface to handle notifications for other types of events, such as peer additions or removals, leader loss, and so on

@kolesnikovae
Copy link
Collaborator Author

kolesnikovae commented Nov 15, 2024

Thank you so much for the review, Aleks!

The only part I am not sure about is how clear some of the new concepts are. In particular, we have the raft_node.Leader and raft_node.Follower types. Their main purpose is ensuring read consistency, but their names can be misleading because they are active on both leader and follower nodes.

Interestingly, I actually find these types intuitive and quite like them: the types encapsulate behavior rather than role and satisfy the corresponding interfaces:

type RaftLeader interface {
	ReadIndex() (uint64, error)
}

type RaftFollower interface {
	WaitLeaderCommitIndexAppliedLocally(ctx context.Context) (applied uint64, err error)
	ConsistentRead(ctx context.Context, read func(*bbolt.Tx)) error
}

For example, I really like that follower.ConsistentRead might be called on the leader (keep in mind that followers may be ahead of the leader in terms of applied changes, so the leader must also perform this check). This makes the Follower Reads pattern quite obvious.

Similarly, leader.ReadIndex can be called on any node: if the node is not elected by the quorum, the method fails; otherwise, it succeeds.

There is also a strong connection between the two, with the follower being the only consumer of what the leader provides (ReadIndex). I wonder if merging them under a StateReader (or StateProvider) is possible and could make their purpose more clear.

You've clearly described the two components, how they interact, and the principle behind their connection: they indeed have a strong leader-follower connection and distinct responsibilities. I wouldn't want to obscure this from the user; in the current implementation, each component has a single responsibility.

But I hear you – I'll try to find a better approach. I'm not entirely sure about stateReader.ReadIndex (vs. leader.ReadIndex) or stateProvider.ConsistentRead (vs. follower.ConsistentRead), but I'll give it a try.

@kolesnikovae kolesnikovae force-pushed the chore/metastore-api-refactoring branch from cb0f511 to d14fd65 Compare November 15, 2024 05:27
@kolesnikovae kolesnikovae merged commit 7ca3dcf into main Nov 15, 2024
29 checks passed
@kolesnikovae kolesnikovae deleted the chore/metastore-api-refactoring branch November 15, 2024 07:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants