Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Cluster and Catalog mods #235

Closed
6 tasks done
waynexia opened this issue Sep 6, 2022 · 2 comments · Fixed by #323
Closed
6 tasks done

Refactor Cluster and Catalog mods #235

waynexia opened this issue Sep 6, 2022 · 2 comments · Fixed by #323
Assignees
Labels
feature New feature or request

Comments

@waynexia
Copy link
Member

waynexia commented Sep 6, 2022

Describe This Problem

A rough dependency graph of related structures:

cluster

Cluster and CatalogManager are public, they may be used by Client or CeresMeta to modify catalog information (create, open or delete tables). This graph shows two paths of creating/droping a table. The green one is triggered by server itself and another orange one is triggered by CeresMeta

  • Green path
    CatalogManager can create/drop tables when it has write access (the leader role). 3-1 in this path is notifying CeresMeta that a new table has been created, and 3-2 is an operation that happens simultaneously that create/drop that table in the leader itself.
  • Orange path
    This path can be treated as counter part of the green one. CeresMeta may notify non-leader node that something has changed. It also goes through EventHandler to notify Cluster and let it take action.

Proposal

From my first impression, the dependency and call graph is complicated. I'd like to change relative parts in the following ways:

  • Combine VolatileCatalog and Cluster Combine VolatileCatalog and Cluster #245
    Cluster is responsible to manage Tables and Shards while Catalog is responsible to manage Schemas which is a group of tables in another perspective different from Shard. Their functionalities have a big part that overlaps. It's natural to implement both Cluster and Catalog traits over one struct, so they can use one memory state and maybe one operation logic.
  • Remove SchemaIdAlloc and TableIdAlloc refactor: remove SchemaIdAlloc and TableIdAlloc #238
    Id should be allocated by CeresMeta, not the server. And furthermore, the server does need not to allocate ID then create table. The entire procedure is accomplished by CeresMeta.
  • Split MetaClient
    Current MetaClient provides two functionalities: poll update from CeresMeta and invoke CeresMeta's method. These correspond to the two ways above: green arrows are from server to CeresMeta and orange are from CeresMeta to server. We can split these two aspects, by making MetaClient an actual client that invokes RPC to CeresMeta and an Eventloop that keeps fetching updates from CeresMeta. This can also resolve the cyclic dependency between Cluster and MetaClient.

Due to #235 (comment), here are the new tracking tasks:

Additional Context

No response

@waynexia waynexia added the feature New feature or request label Sep 6, 2022
@ShiKaiWi
Copy link
Member

ShiKaiWi commented Sep 6, 2022

@waynexia 👍 I can't agree more with this proposal.

@ShiKaiWi
Copy link
Member

ShiKaiWi commented Oct 12, 2022

During the development, I find it is hard to combine the VolatileCatalog and Cluster because create/drop/open/close procedures are too complex to be processed by the combined module in cluster mode, e.g. in the create table procedure, the combined one should be able to handle two different cases: create by user (the request will be forwarded to ceresmeta) and create by ceresmeta (do the real creation), and that is ugly.

Here is the new proposal, which will still keep the VolatileCatalog and Cluster separate, but the event handler will be removed, and the TableManager in the Cluster will be exposed and will be held by VolatileCatalog:

 ┌───────────┐                                             
 │  Cluster  │───────Create/Drop/Open/Close table          
 └───────────┘                │                            
       │                      ▼                            
       │             ┌─────────────────┐                   
       │             │TableManipulator │───────────┐       
       │             └─────────────────┘           │       
       │                                           ▼       
       │                                   ┌──────────────┐
       │                                   │CatalogManager│
       │                                   └──────────────┘
       ▼                                           │       
┌────────────┐                                     │       
│TableManager│◀────────────────────────────────────┘       
└────────────┘     

As for the create/drop table procedure, a new proxy called TableCreator will be added to handle different cases:

         ┌─────────────┐           
      ┌──│TableCreator │────┐      
      │  └─────────────┘    │      
      │                     │      
      ▼                     ▼      
┌───────────┐         ┌───────────┐
│ From user │         │ From meta │
└───────────┘         └───────────┘
      │                     │      
      ▼                     ▼      
┌───────────┐         ┌───────────┐
│Meta client│         │  Cluster  │
└───────────┘         └───────────┘
                            │      
                            ▼      
                      ┌───────────┐
                      │CatalogMana│
                      └───────────┘

Here is the tracking task:

  • Remove the current catalog based on cluster, and add table manager to volatile catalog;
  • Add TableCreator & TableDropper, and refactor the create/drop table procedure;
  • Implement the open/close shards of cluster;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants