RFC: VSchema based routing and resharding #4790
Comments
While drilling down on the design, I came across an additional use case: users that treat vitess as a multi-schema server connect to specific keyspaces. We have to make vertical splits work in such cases also.

In order to accommodate this, we'll extend the routing rule key to be

So, if a table t is migrating from ks1 to ks2, we can start with

Additionally, the list of tables as target will change how the vtgate optimizer will work. Currently, the keyspace for a route gets decided when it gets created. Now that a table can be in different keyspaces, we'll create routes with multiple routing options. As the plan evolves we'll eliminate the ones that are not suitable. At the end, we'll choose the first one if more than one is left.
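The elimination process described above can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the vtgate planner: the `routeOption` and `route` types, the `prune` and `choose` methods, and the predicate passed to `prune` are all hypothetical names introduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// routeOption is a hypothetical stand-in for one candidate target of a route.
type routeOption struct {
	keyspace string
	table    string
}

// route holds every target option that is still viable as the plan evolves.
type route struct {
	options []routeOption
}

// prune keeps only the options accepted by canAccommodate.
// If no option survives, it returns an error and leaves the list unchanged.
func (r *route) prune(canAccommodate func(routeOption) bool) error {
	kept := make([]routeOption, 0, len(r.options))
	for _, opt := range r.options {
		if canAccommodate(opt) {
			kept = append(kept, opt)
		}
	}
	if len(kept) == 0 {
		return errors.New("no routing option can accommodate the construct")
	}
	r.options = kept
	return nil
}

// choose returns the first remaining option, which is the tie-break
// described above when more than one option is left.
func (r *route) choose() (routeOption, error) {
	if len(r.options) == 0 {
		return routeOption{}, errors.New("route has no options")
	}
	return r.options[0], nil
}

func main() {
	// Table t is migrating from ks1 to ks2, so both are candidates.
	r := &route{options: []routeOption{{"ks1", "t"}, {"ks2", "t"}}}

	// Hypothetical constraint: only ks2 can serve the pushed-down construct.
	err := r.prune(func(o routeOption) bool { return o.keyspace == "ks2" })
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	chosen, _ := r.choose()
	fmt.Printf("%s.%s\n", chosen.keyspace, chosen.table)
}
```

Keeping the candidates in an ordered slice means the default preference falls out of the ordering itself, so no separate priority field is needed in this sketch.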
This is the first part of the changes to implement vitessio#4790. This part implements all the management functionality for routing rules. Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
In this change the query routing accounts for the possibility that there could be multiple target options for a given table. The design for this is explained in vitessio#4790.

At a high level:

* VSchema.FindTableOrVindex function can return a list of tables instead of a single one.
* The route planbuilder creates multiple routeOptions, one for each table returned.
* All actions that affected the plan of a route are changed to update all routeOptions.
* If a particular routeOption cannot accommodate a pushed down construct, it's removed from the list. Previously, this was an error case. But if no options are left, then we return an error.
* If two routeOptions qualify for a merge of routes, then all other combinations that don't qualify are discarded. This is the case for joins, subqueries and unions.

More details:

vindexTable was renamed to the more appropriate vschemaTable.

In order to achieve this, a new routeOption data type was introduced, and route was changed to contain a list of routeOptions.

In symtab, tables used to point at the vschema table that was used to build them. Since a table can now represent multiple target tables, this field has been moved into routeOption.

In symtab, columns used to contain a vindex member. Since this can change depending on the target table, the routeOption now contains a map of column to vindexes instead.

The routeOption also contains the vschemaTable. DMLs use this information. Since DMLs have to be more deterministic about the table they write to, they always choose the first option.

At the beginning of the Wireup phase, we evaluate all existing options and decide on the best available.

To be done:

When a table has multiple targets, the targets can have different names than the original table. If so, the queries have to be rewritten to address the new target tables. In order to do this, each routeOption will contain a list of substitutions that will be made during the Wireup phase.

Tests have to be written for the new flows.
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Current design
VTGate performs routing using different methods:
Problem statement
The ServedFrom scheme is meant to only allow for one-time vertical splitting of a keyspace into two. This does not meet other growing requirements like:
This ServedFrom approach also does not allow us to reverse replication after a master migration because the model is not symmetric.
The Shard Map approach does meet existing and future needs for now.
Requirements
The new design should not only address the above problems, but it should also accommodate the following new use cases:
Proposed design
The high level proposal is to deprecate the ServedFrom approach in favor of implementing a more versatile functionality at the VSchema level.
The current VSchema design works at the per-keyspace level. But the above requirements define interactions that go across keyspaces. Although it’s possible to find a way to express these within the scope of individual keyspaces, it will be better to extend the structure of a VSchema.
We will introduce the concept of `RoutingRules`. These rules will be global instead of being keyspace-specific. However, they will become part of the SrvVSchema when all the vschemas are combined for serving.

Studying the above requirements, we can see two orthogonal concepts emerging:
This can be represented as a map from (table,tablet_type) pair into a list of keyspace_qualified tables. This mapping will be resolved to specific table pointers after all the vschema is combined for all the keyspaces.
For example, a map for vertical resharding where rdonly has migrated will look like this:
t@* is matched last.
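The lookup order can be sketched in Go as below. This is a minimal sketch under stated assumptions: the `routingRules` type, the `resolve` method, the `table@tablet_type` key spelling for non-wildcard entries, and the sample rule set are all hypothetical and are not taken from the proposal itself. It only shows that a tablet-type-specific key is tried first and that `t@*` is matched last.

```go
package main

import "fmt"

// routingRules maps a "table@tablet_type" key to an ordered list of
// keyspace-qualified tables. The key spelling is an assumption made
// for illustration.
type routingRules map[string][]string

// resolve tries the tablet-type-specific rule first and falls back to
// the table@* wildcard rule last. The bool reports whether any rule matched.
func (rr routingRules) resolve(table, tabletType string) ([]string, bool) {
	if targets, ok := rr[table+"@"+tabletType]; ok {
		return targets, true
	}
	targets, ok := rr[table+"@*"]
	return targets, ok
}

func main() {
	// Hypothetical rule set: rdonly traffic for t goes to ks2,
	// every other tablet type still goes to ks1.
	rules := routingRules{
		"t@rdonly": {"ks2.t"},
		"t@*":      {"ks1.t"},
	}
	for _, tabletType := range []string{"rdonly", "replica", "master"} {
		targets, _ := rules.resolve("t", tabletType)
		fmt.Println(tabletType, targets)
	}
}
```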
By default, every unique table will be `t: [ks.t]`.

In the case of a vreplicated table from ks1 to ks2, the rule will be `t@*: [ks1.t, ks2.t]`. This rule will mean that a reference to `t` can resolve to ks1.t or ks2.t, whichever is favorable. ks1 will be preferred by default.

Since the map and list are orthogonal, it’s possible to combine them like this:
Reference tables
The case of reference tables is different. This will need to be stored within the keyspace as metadata for the table (like sequences).
Transitioning state
We have to work from the principle that the lockserver data cannot be relied upon for timely delivery. This means that workflows should use an alternate mechanism for situations where timeliness is required.
For example, while migrating masters (or writes), we have to first force readonly on the source. This is currently achieved by pushing tablet control records or blacklisted tables into the topo. Instead, we’ll reimplement this by directly writing this metadata into the relevant vttablets where the action will be taken.
The topo changes will be used only to transmit the rest of the transitions. In the case of write transitions, there is a period of exposure where we would have marked the source as readonly and the vtgates have not received the updated vschema. This is unavoidable. However, we can have the assurance that no spurious writes will go to the source. We’ll only be serving some transient errors.
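The ordering of a write transition can be illustrated with a small Go simulation. This is only a sketch under stated assumptions: the `tablet` and `gate` types and the `errReadOnly` value are hypothetical stand-ins for a vttablet, a vtgate with a possibly stale vschema, and the transient error a client would see. It does not use any Vitess API.

```go
package main

import (
	"errors"
	"fmt"
)

// errReadOnly models the transient error served during the exposure window.
var errReadOnly = errors.New("transient: source is read-only")

// tablet is a hypothetical stand-in for a vttablet serving one keyspace.
type tablet struct {
	keyspace string
	readOnly bool
	writes   int
}

// write rejects the request when the tablet has been forced read-only.
func (t *tablet) write() error {
	if t.readOnly {
		return errReadOnly
	}
	t.writes++
	return nil
}

// gate is a hypothetical stand-in for a vtgate whose routing target
// changes only when it receives an updated vschema.
type gate struct {
	target *tablet
}

func (g *gate) write() error { return g.target.write() }

func main() {
	source := &tablet{keyspace: "ks1"}
	dest := &tablet{keyspace: "ks2"}
	g := &gate{target: source}

	fmt.Println("before migration:", g.write())

	// Step 1: force read-only by writing directly to the source tablet,
	// without waiting for the topo to propagate anything.
	source.readOnly = true

	// Exposure window: the gate still routes to the source, so the write
	// fails with a transient error instead of landing on the source.
	fmt.Println("during window:", g.write())

	// Step 2: the updated vschema reaches the gate through the topo.
	g.target = dest
	fmt.Println("after update:", g.write())

	fmt.Println("source writes:", source.writes, "dest writes:", dest.writes)
}
```

In this model the source accepts no writes after step 1, regardless of how long the gate takes to see the new vschema, which is the assurance the paragraph above describes.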