
make TabletExternallyReparented more robust #5151

Merged
8 commits merged on Sep 13, 2019

Conversation

@deepthi (Member) commented Aug 31, 2019

This is an attempt to fix some problems found by @sougou and @enisoc.
TabletExternallyReparented should set the type to REPLICA not only on the current "oldMaster" but also on any lingering old masters from previous unsuccessful reparents.

Changes in this PR (edited from the list written by @enisoc below)

  • Remove the short circuit on the shard record being up-to-date. This ensures future TER calls will try to do cleanup even if the shard record has already been updated.
  • Update the shard record as soon as the new and old master tablet records have been updated. This ensures that everyone knows who the new master is even if we fail to update some of the impostors, which in the context of TER is only a best-effort cleanup.
  • In parallel with updating the shard record, we should immediately start trying to refresh the old master, since ensuring it knows it's not master is the next most important thing.
  • Only after all that do we start trying to do best-effort cleanup of impostors: listing tablets for the shard, updating tablet records, and refreshing tablets.
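To make the ordering concrete, here is a minimal Go sketch of this flow. The helper functions (updateMasterTabletRecords, updateShardRecord, refreshOldMaster, cleanupImpostors) are hypothetical placeholders standing in for the real steps in finalizeTabletExternallyReparented, not the actual code in this PR.

package main

import (
	"context"
	"fmt"
	"log"
	"sync"
)

// Hypothetical placeholders for the real steps; they exist only to show ordering.
func updateMasterTabletRecords(ctx context.Context) error { return nil }
func updateShardRecord(ctx context.Context) error         { return nil }
func refreshOldMaster(ctx context.Context) error          { return nil }
func cleanupImpostors(ctx context.Context) error          { return nil }

func finalizeSketch(ctx context.Context) error {
	// 1. Update the new and old master tablet records first.
	if err := updateMasterTabletRecords(ctx); err != nil {
		return err // re-running TER is safe and retries from here
	}

	// 2. Publish the new master in the shard record and, in parallel,
	//    ask the old master to refresh its state.
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		errs []error
	)
	wg.Add(2)
	go func() {
		defer wg.Done()
		if err := updateShardRecord(ctx); err != nil {
			mu.Lock()
			errs = append(errs, err)
			mu.Unlock()
		}
	}()
	go func() {
		defer wg.Done()
		// A refresh failure is only logged; the old master may simply be unreachable.
		if err := refreshOldMaster(ctx); err != nil {
			log.Printf("old master did not refresh: %v", err)
		}
	}()
	wg.Wait()

	// 3. Only then: best-effort cleanup of any other tablets still claiming MASTER.
	if err := cleanupImpostors(ctx); err != nil {
		errs = append(errs, err)
	}
	if len(errs) > 0 {
		return fmt.Errorf("TER finished with errors: %v", errs)
	}
	return nil
}

func main() {
	if err := finalizeSketch(context.Background()); err != nil {
		log.Fatal(err)
	}
	fmt.Println("finalize sketch completed")
}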

When is TER marked as finished vs. failed?

  • If the global shard record was updated correctly AND the old master and new master tablet records were updated correctly AND any lingering old masters have had their tablet records updated correctly, then TER is marked as finished. If the topo is healthy, there is no reason for any of these to fail.
  • If any of these fails, then TER is marked as failed. Re-running TER is safe and might make more progress than a previous attempt.
  • If the old master(s) have not refreshed their state correctly, we only log a warning (along with the error from the call to RefreshState); we don't mark TER as failed. This is only expected to happen if that particular tablet is unreachable. However, you do need to watch for if and when the old tablets come back online. An automated fix for this situation is planned for a future PR (issue #5173: Handle two-master scenarios).

Signed-off-by: deepthi <deepthi@planetscale.com>

@deepthi deepthi requested a review from sougou as a code owner August 31, 2019 01:01
@deepthi deepthi requested a review from enisoc August 31, 2019 01:01
}()
}

// update the shard record first
Member

Why update the shard record first? Traditionally, updating the shard record means the reparent is done. That's how we ensure that subsequent TER calls will fix the tablet records if this attempt fails.

Member Author

take a look at the latest and let me know if you still think this is a problem

go/vt/wrangler/reparent.go (outdated)
@@ -498,22 +498,21 @@ func (wr *Wrangler) plannedReparentShardLocked(ctx context.Context, ev *events.R
wgSlaves.Wait()
return fmt.Errorf("failed to PopulateReparentJournal on master: %v", masterErr)
}
// Wait for the slaves to complete.
@deepthi (Member Author) Sep 3, 2019

Making this change breaks one of the integration tests:
https://github.com/vitessio/vitess/blob/master/test/reparent.py#L572

We end up with this error:
TopologyServer has inconsistent state for shard master
because we have updated the masterElect to be master, but not updated the shard record.

Hence I have reverted it.

cc: @enisoc

Member

Yeah that test case raises a good point. We can't require slaves to complete in order to consider the reparent finished, because we need to be able to make progress even if some slaves are down.

// timestamp to the current time.
agent.setExternallyReparentedTime(startTime)

if topoproto.TabletAliasEqual(si.MasterAlias, tablet.Alias) {
Member

This short circuit was important because YouTube's equivalent of Orchestrator would call TabletExternallyReparented every 10 seconds on the master, even if nothing has changed. We don't do that yet in the Orchestrator integration, but it was always intended and the integration is incomplete without that constantly repeating signal.

Contributor

!! Why did it do that? Was the intent to have it basically be a self healing loop?

// This is where updateState will block for gracePeriod, while it gives
// vtgate a chance to stop sending replica queries.
agent.updateState(ctx, newTablet, "fastTabletExternallyReparented")
// always run finalize even if the master tablet has the correct state
Member

This is starting to be a significant departure from the original design of TER - in particular the split into a fast stage with the short-circuit based on the shard record and a slow "finalize" stage that only runs when something has actually changed.

By recommending TER as a fix for cases when PlannedReparentShard fails, we're also mixing code paths that were never intended to be mixed. TER was supposed to only be used when Vitess-native reparents are disabled.

I'm starting to think what we really need is to make PlannedReparentShard itself more robust. If PRS fails, the solution should be to run PRS again pointing at the same master. It's been a longstanding problem with Vitess reparents that there are no commands to recover from partial operations. I don't think co-opting TER for a purpose outside its intended scope is the right answer.

Member Author

Is this a good summary of the original design?
https://vitess.io/docs/user-guides/reparenting/#external-reparenting
Code in master as of right now doesn't actually perform part 2 of step 5 (change tablet type to spare).

Contributor

It's been a longstanding problem with Vitess reparents that there are no commands to recover from partial operations.

This is a bit out of scope for this PR, but 👍 I agree completely. Partial Vitess reparents have been tricky for us to manually recover from in the past, and it'd be great to make it safe to retry without having to manually inspect the cluster.

Contributor

TER was supposed to only be used when Vitess-native reparents are disabled.

This is really interesting history that I didn't know!

Contributor

Part 2 of step 5 (changing the tablet type of the unresponsive old master to spare) seems like it would actually solve a lot of problems we've seen when a master is unavailable for a bit, a TER happens, and then the master comes back online. I would prefer the spare (or other non-serving tablet type) to setting it to replica.


@enisoc (Member) left a comment

Overall, I think we need to step back and redefine our goals. We shouldn't try to make TER better suited to fix when PRS experiences partial success. TER was never meant to be used on the same cluster as PRS at all.

Instead, our end goal should be that the fix for any partial failure of PRS is simply to run PRS again with the same arguments. We need to make PRS smart enough to skip things that are already consistent with the desired end state and fix up anything that isn't consistent.

@sougou (Contributor) commented Sep 3, 2019

I agree in principle with TER not being the remedy for PRS. This is mainly because PRS also needs to fix replication for the tablets that got left out.

One thing missing in PRS is the -cancel option. If the point of no return has not been reached (and the in-command cancel failed), one should be able to revert to the old master. I'm actually implementing this same capability in the resharding MigrateWrites command.

Finally, PRS should include a sanity check that replication is not too far behind; if it is, don't even start the reparent.

@deepthi (Member Author) commented Sep 3, 2019

Overall, I think we need to step back and redefine our goals. We shouldn't try to make TER better suited to fix when PRS experiences partial success. TER was never meant to be used on the same cluster as PRS at all.

Instead, our end goal should be that the fix for any partial failure of PRS is simply to run PRS again with the same arguments. We need to make PRS smart enough to skip things that are already consistent with the desired end state and fix up anything that isn't consistent.

What is the strategy if TER fails (supposing you never ran a PRS)?

@enisoc (Member) commented Sep 3, 2019

One thing missing in PRS is the -cancel option. If the point of no return has not been reached (and the in-command cancel failed), one should be able to revert to the old master.

If we make PRS mean "fix up anything that's inconsistent with the idea of X being the master", then I would expect the "rollback" process to be "run PRS pointing at the old master".

Or do you feel it's important for usability or automation that the command doesn't require you to specify the old master? In that case, I still would prefer something like -current_master rather than -cancel so it works for fixing any kind of inconsistency, not just a prior PRS that failed before the point of no return.

What is the strategy if TER fails (supposing you never ran a PRS)?

In the world for which TER was designed (when all Vitess-native reparents are disabled), running TER again should always be the answer. It may not perfectly handle all possible cases, but that's the intended strategy.

@sougou (Contributor) commented Sep 3, 2019

If we make PRS mean "fix up anything that's inconsistent with the idea of X being the master", then I would expect the "rollback" process to be "run PRS pointing at the old master".

It comes down to whether we have enough metadata to revert the last attempted reparent. We should support -cancel only if we have that metadata, or decide to create it somewhere. Requiring the user to specify -current_master will be problematic.

I'm actually planning to use the new set @@vitess_metadata. construct for MigrateWrites.

@zmagg (Contributor) commented Sep 4, 2019

At a high level, I'd love more context about how this PR fits in Vitess's design intentions around master failure. There's a lot of interesting historical nuggets of information around TER and PRS in this PR, as is, and I'm curious if @enisoc or @sougou have any more knowledge to share.

My understanding today about Vitess's design principles about master failure (which could be wrong), was that the intention is that on hard crashes of master tablets, that the masters should be deprovisioned and never restarted, as it's unsafe to have them serve again as masters. I'm curious if this was actually an explicit design principle of Vitess, or if this was just a side effect due to running on Borg.

With how we run Vitess at Slack, it is definitely unsafe for a master to come back again, even as a replica, after a hard crash. We run with innodb_flush_log_at_trx_commit=2 && sync_binlog=1 and this can result in transactions that are only on the hard-crashed master's local disk. If that master comes back (even as a replica), and is at some point promoted to be a master again, it can result in errant transactions.

In this PR, it seems like if a master hard crashes and restarts, it'll be made a replica again and available for later promotion to master, which would cause errant transactions and replication breakage. For us, we'd prefer to have any lingering tablets who think they're still the master be marked as spare or non serving types instead of replicas.

@deepthi (Member Author) commented Sep 5, 2019

@zmagg this is very good feedback. It prompted me to go look at the history of TER changing old master to spare vs replica. I found the commit but not an explanation 😞
1161251

It seems like there was a decision to reduce/remove usage of spare, but I think we'll need to have @enisoc or @sougou provide context.

@sougou (Contributor) commented Sep 5, 2019

At a high level, I'd love more context about how this PR fits in Vitess's design intentions around master failure. There's a lot of interesting historical nuggets of information around TER and PRS in this PR, as is, and I'm curious if @enisoc or @sougou have any more knowledge to share.

This PR itself doesn't change anything about how we handle master failures. It just fixes some problems we found where certain failures led to states that were hard to recover from. For clarity, we've now split this into three objectives, which will become separate PRs.

  1. Make PRS more robust. Provide ways to heal the system if it failed. @deepthi will remove the PRS changes (if she hasn't done so already).
  2. Make TER idempotent. If it fails, a retry should complete whatever was missed.
  3. Run a watchdog thread (likely in vtctld), that will continuously monitor for the two-master scenario and demote the impostor.

But the questions below are good ones, and answers are below:

My understanding today about Vitess's design principles about master failure (which could be wrong), was that the intention is that on hard crashes of master tablets, that the masters should be deprovisioned and never restarted, as it's unsafe to have them serve again as masters. I'm curious if this was actually an explicit design principle of Vitess, or if this was just a side effect due to running on Borg.

This was an explicit design principle while at YouTube. However, people have started using mounted drives, especially while running on Kubernetes. In such cases, they rely on the durability guarantees provided by the mounted drives and prefer to recover instead of using a restore. This is because Kubernetes local storage is too ephemeral, and they want better assurances that all their data will not be lost if there's a catastrophic failure.

With how we run Vitess at Slack, it is definitely unsafe for a master to come back again, even as a replica, after a hard crash. We run with innodb_flush_log_at_trx_commit=2 && sync_binlog=1 and this can result in transactions that are only on the hard-crashed master's local disk. If that master comes back (even as a replica), and is at some point promoted to be a master again, it can result in errant transactions.

Right. But this failure mode is easy to remedy: the old master will not be able to replicate from the new master because of the errant GTIDs. If this happens, we just have to restart the vttablet in "restore mode" (empty datadir).

In this PR, it seems like if a master hard crashes and restarts, it'll be made a replica again and available for later promotion to master, which would cause errant transactions and replication breakage. For us, we'd prefer to have any lingering tablets who think they're still the master be marked as spare or non serving types instead of replicas.

This would be true if TER was used only for catastrophic failures. But Vitess allows an external agent to reparent mysqls even if there was no crash, and TER could be used in such situations. If so, the old master is eligible to become a replica. YouTube ran it this way, and there may be others who do the same.

@derekperkins (Member)

TER was never meant to be used on the same cluster as PRS at all.

Make PRS more robust. Provide ways to heal the system if it failed. @deepthi will remove the PRS changes (if she hasn't done so already).

I'm very excited about PRS improvements and eliminating TER from our management. We end up with impostor double masters fairly often.

@deepthi changed the title from "make reparents more robust" to "make TabletExternallyReparented more robust" on Sep 7, 2019
tablet := agent.Tablet()
tabletMap, err := agent.TopoServer.GetTabletMapForShard(ctx, tablet.Keyspace, tablet.Shard)
// make the channel buffer big enough that it doesn't block senders
tablets := make(chan topodatapb.Tablet, len(tabletMap))
Member

Maybe call this tabletsToRefresh?

Member

Since we're going to ignore errors from GetTabletMapForShard, len(tabletMap) could be 0 which would cause a deadlock below. Might as well add 1 to the buffer size.
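A small stand-alone Go illustration of the concern, assuming the channel is only drained after wg.Wait() as in the surrounding code: with an empty tabletMap, a buffer of len(tabletMap) would be 0 and the send for the old master would block forever, while adding 1 keeps it non-blocking. The map contents and alias string are made-up stand-ins.

package main

import (
	"fmt"
	"sync"
)

func main() {
	// Hypothetical stand-in for the result of GetTabletMapForShard when it
	// errors out and returns nothing.
	tabletMap := map[string]string{}

	// A buffer of len(tabletMap) would be 0 here; the send below would then
	// block forever, because the channel is only drained after wg.Wait().
	tablets := make(chan string, len(tabletMap)+1) // +1 leaves room for the old master

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		tablets <- "old-master-alias" // non-blocking thanks to the +1
	}()
	wg.Wait()
	close(tablets)

	for alias := range tablets {
		fmt.Println("would refresh", alias)
	}
}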

Member

Actually, we might not technically need a channel at all. Since we bail out if any errors occur, we should get the same result if we append tablets to a slice before launching each per-tablet goroutine (before we know whether the tablet record update will succeed, but after we decided to try to update it), which removes the need for synchronization. However, I'm fine keeping this as a channel if you prefer, since the proof that it isn't needed is a bit obscure.
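A sketch of the slice-based alternative being described, under the stated assumption that we bail out on any error, so the later refresh pass never needs values produced inside the goroutines. The tablet map contents are made-up stand-ins.

package main

import (
	"fmt"
	"sync"
)

func main() {
	// Hypothetical shard contents; stand-ins for the real tablet records.
	tabletMap := map[string]string{
		"cell1-0000000101": "MASTER",
		"cell1-0000000102": "MASTER",
	}

	// Append to a plain slice before launching each per-tablet goroutine:
	// no channel (and no synchronization) is needed for the refresh pass,
	// because the slice is fully built on this goroutine.
	var tabletsToRefresh []string
	var wg sync.WaitGroup
	for alias := range tabletMap {
		tabletsToRefresh = append(tabletsToRefresh, alias)
		wg.Add(1)
		go func(alias string) {
			defer wg.Done()
			fmt.Println("updating tablet record for", alias)
		}(alias)
	}
	wg.Wait()

	for _, alias := range tabletsToRefresh {
		fmt.Println("refreshing", alias)
	}
}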

Member Author

I was being cautious here, because UpdateTabletFields reads the latest tablet info from topo, and updates and returns it. This ensures that we are using the latest tabletInfo, but maybe it doesn't matter?

Member

Ah I see what you mean. I don't think it matters in this case since I believe tmc.RefreshState only looks at the tablet host/port, and we aren't changing those fields. There's a slight possibility the tablet might update its own host/port in between when we first read it and when we call RefreshState, but that window will always be non-zero and it doesn't get much smaller by taking the result from UpdateTabletFields, since we call those concurrently right after we fetched the latest values from GetTabletMapForShard.

That said, the above is yet another non-obvious nuance in the proof that we don't need synchronization, so it's probably better to just keep the synchronization to be defensive against future code changes.

tablet := agent.Tablet()
// update any other tablets claiming to be MASTER also to REPLICA
for alias, tabletInfo := range tabletMap {
if alias != topoproto.TabletAliasString(agent.TabletAlias) && alias != topoproto.TabletAliasString(oldMasterAlias) && tabletInfo.Tablet.Type == topodatapb.TabletType_MASTER {
Member

You can compare directly without converting to string: topoproto.TabletAliasEqual(tabletInfo.Alias, x)

@@ -160,9 +161,14 @@ func (agent *ActionAgent) finalizeTabletExternallyReparented(ctx context.Context
}
}()

tablet := agent.Tablet()
tabletMap, err := agent.TopoServer.GetTabletMapForShard(ctx, tablet.Keyspace, tablet.Shard)
Member

We should log this error and add a comment saying we intentionally do not bail out because we still want to process whatever we found if there is a "partial result" error (some cells inaccessible), and even if this completely fails we still want to process the old master.

Member Author

do you mean print to the log or collect it in errs?

Member

I think it needs to be printed to log and not collected in errs. If we collect it in errs it will cause us to bail out later. However, that will mean TER gets stuck/blocked if any topo cells are unavailable.

This is a common concern throughout Vitess. Any time we try to access topo cross-cell (as we're doing here, since GetTabletMapForShard has to read tablets from all cells where the shard has tablets), we need to be careful to not get blocked in the case when one or more cells is unavailable. Otherwise, every cell becomes a single point of failure for that operation.
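A hedged sketch of the log-and-continue pattern being suggested here, with a made-up stand-in function in place of GetTabletMapForShard (which may return a partial map plus an error when some cells are unreachable):

package main

import (
	"context"
	"errors"
	"log"
)

// Hypothetical stand-in for agent.TopoServer.GetTabletMapForShard: it may
// return a partial result together with an error when a cell is unreachable.
func getTabletMapForShard(ctx context.Context) (map[string]string, error) {
	return map[string]string{"cell1-0000000101": "REPLICA"},
		errors.New("partial result: cell2 unreachable")
}

func main() {
	ctx := context.Background()
	tabletMap, err := getTabletMapForShard(ctx)
	if err != nil {
		// Log the error instead of recording it in errs: bailing out here
		// would make every topo cell a single point of failure for TER,
		// and we still want to process whatever we did find (plus the old
		// master, which we already know about).
		log.Printf("GetTabletMapForShard failed, continuing with %d tablets: %v",
			len(tabletMap), err)
	}
	for alias := range tabletMap {
		log.Printf("would inspect tablet %v", alias)
	}
}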

Member Author

I suppose this is one of the reasons we call finalize with a relatively short timeout (30s), because any of the various things we are doing in finalize could block if the target component is unavailable.

if !topoproto.TabletAliasIsZero(oldMasterAlias) {
wg.Add(1)
go func() {
log.Infof("finalizeTabletExternallyReparented: updating tablet record for old master: %v", oldMasterAlias)
Member

By convention, defer wg.Done() should remain the first line so it's immediately clear that it always gets called.
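For reference, a minimal stand-alone example of the convention being asked for:

package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		// Keep defer wg.Done() as the first statement so it is immediately
		// clear that the WaitGroup is always released, even if later code
		// returns early or panics.
		defer wg.Done()
		fmt.Println("updating tablet record for old master")
	}()
	wg.Wait()
}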

if alias != topoproto.TabletAliasString(agent.TabletAlias) && alias != topoproto.TabletAliasString(oldMasterAlias) && tabletInfo.Tablet.Type == topodatapb.TabletType_MASTER {
log.Infof("finalizeTabletExternallyReparented: updating tablet record for another old master: %v", alias)
wg.Add(1)
go func() {
Member

Need to pass in a copy of any loop variables needed inside the goroutine, as you do below.
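A small self-contained illustration of the pitfall and the fix: before Go 1.22, the range variables are reused across iterations, so a goroutine that captures them directly may observe a later iteration's values; passing copies as arguments avoids that. The map contents are made-up stand-ins.

package main

import (
	"fmt"
	"sync"
)

func main() {
	tabletMap := map[string]string{
		"cell1-0000000101": "MASTER",
		"cell1-0000000102": "MASTER",
	}

	var wg sync.WaitGroup
	for alias, tabletInfo := range tabletMap {
		wg.Add(1)
		// Pass copies of the loop variables into the goroutine. Capturing
		// alias/tabletInfo directly would (pre-Go 1.22) race with the next
		// loop iteration overwriting them.
		go func(alias, tabletType string) {
			defer wg.Done()
			fmt.Printf("demoting %s (currently %s)\n", alias, tabletType)
		}(alias, tabletInfo)
	}
	wg.Wait()
}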

@@ -221,22 +250,24 @@ func (agent *ActionAgent) finalizeTabletExternallyReparented(ctx context.Context
errs.RecordError(err)
}
}()
if !topoproto.TabletAliasIsZero(oldMasterAlias) {
wg.Wait()
Member

Do we need a new Wait here? These steps were previously done concurrently on purpose, since they aren't interdependent.

…master to REPLICA

Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
@deepthi (Member Author) commented Sep 10, 2019

@enisoc thanks for the detailed feedback. I have addressed the changes you requested.
Is there something further we can do to handle this?
"Any time we try to access topo cross-cell (as we're doing here, since GetTabletMapForShard has to read tablets from all cells where the shard has tablets), we need to be careful to not get blocked in the case when one or more cells is unavailable. Otherwise, every cell becomes a single point of failure for that operation."

@enisoc (Member) left a comment

LGTM overall. Had one more potential defensive suggestion.

Is there something further we can do to handle this?

It looks ok to me, afaict. I only mentioned that as justification for not bailing out on a GetTabletMapForShard error, which we don't do.

var err error
tab, err := agent.TopoServer.UpdateTabletFields(ctx, alias,
func(tablet *topodatapb.Tablet) error {
tablet.Type = topodatapb.TabletType_REPLICA
@enisoc (Member) Sep 10, 2019

Maybe this is overly paranoid, but perhaps we should add both here and in the original code a check that the tablet type is still MASTER before we force it to REPLICA? There's a small chance we might race with a human trying to do something like force it SPARE, and then we overwrite it to REPLICA.
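A minimal sketch of the suggested guard, using plain stand-in types and a hypothetical sentinel error rather than the real topodata/topo packages; the actual change would live inside the UpdateTabletFields callback shown above.

package main

import (
	"errors"
	"fmt"
)

// Stand-ins for the topodatapb.TabletType values used in the PR.
type tabletType string

const (
	typeMaster  tabletType = "MASTER"
	typeReplica tabletType = "REPLICA"
	typeSpare   tabletType = "SPARE"
)

// errNoUpdateNeeded is a hypothetical sentinel; the real code would use the
// topo package's "no update needed" signal instead.
var errNoUpdateNeeded = errors.New("no update needed")

// demoteIfStillMaster mirrors the suggested check: only force the type to
// REPLICA if the record still says MASTER, so we don't clobber an operator
// who just set the tablet to SPARE.
func demoteIfStillMaster(t *tabletType) error {
	if *t != typeMaster {
		return errNoUpdateNeeded
	}
	*t = typeReplica
	return nil
}

func main() {
	current := typeSpare // someone raced us and forced the tablet to SPARE
	if err := demoteIfStillMaster(&current); errors.Is(err, errNoUpdateNeeded) {
		fmt.Println("leaving tablet as", current)
	}
}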

Member Author

done.

Signed-off-by: deepthi <deepthi@planetscale.com>
@enisoc (Member) left a comment

LGTM

Signed-off-by: deepthi <deepthi@planetscale.com>
@sougou (Contributor) left a comment

I was thinking of one minor improvement in the flow: update the MasterAlias as soon as we have successfully updated the old master to be a replica. There's no use remembering the old value after this.

This way, if the next step of chasing down the rest of the tablets fails, then the updated MasterAlias is still usable by others.

@enisoc (Member) commented Sep 11, 2019

I thought one of the goals was that, whenever feasible, rerunning TER should fix up anything that was missed due to partial failure. If we update the shard record but updating the impostor master fails, then we will not retry next time since we assume the shard record being updated means everything is done.

@sougou (Contributor) commented Sep 11, 2019

Right. I was only recommending that the MasterAlias be changed after we've successfully downgraded the old master, but before checking the rest of the tablets (// update any other tablets claiming to be MASTER also to REPLICA).

@sougou (Contributor) commented Sep 11, 2019

Specifically, the use case I have in mind is the one where we update the old master successfully, but the part that checks if there are other impostors chronically fails. If so, we'll never update the MasterAlias, which will cause all workflows that rely on this info to fail.

@enisoc (Member) commented Sep 11, 2019

but the part that checks if there are other impostors chronically fails

If we fail to read topo to even check for impostors, we just skip the check. It will only block if we actually do find an impostor master. In that case, do you really prefer that we leave what we know is an impostor master behind and mark the shard as done so we never retry fixing the impostor master?

I'm not asking because I disagree. I just want to be clear on the trade-off we're making.

@enisoc (Member) commented Sep 11, 2019

@deepthi Sugu and I talked and worked out his concerns. Here's what we're thinking now:

  • Remove the short circuit on the shard record being up-to-date, like you had done originally. This ensures future TER calls will try to do cleanup even if the shard record had already been updated. I previously objected to removing the short circuit based on the idea that we might emulate Decider by calling TER at a fast frequency. However, Sugu convinced me there's a better solution (outside the scope of this PR) that will mean we won't have to call TER that often, so it's fine to drop the short circuit.
  • Update the shard record as soon as the new and old master tablet records have been updated. This ensures that everyone knows who the new master is even if we fail to update some of the impostors, which in the context of TER is only a best-effort cleanup.
  • After updating the shard record, we should immediately start trying to refresh the old master, since ensuring it knows it's not master is the next most important thing.
  • Only after all that do we start trying to do best-effort cleanup of impostors: listing tablets for the shard, updating tablet records, and refreshing tablets.

How does that sound to you?

@deepthi (Member Author) commented Sep 11, 2019

Sounds good. I will make the changes.
This reminded me of why I had added the extra wg.Wait() in the first place. It was to ensure RefreshState on the old master completed before we started trying to update any impostors.

…does not complete all steps

Signed-off-by: deepthi <deepthi@planetscale.com>
@deepthi (Member Author) commented Sep 11, 2019

@sougou @enisoc this PR is ready for final review.

@sougou (Contributor) commented Sep 11, 2019

LGTM
@enisoc: we can merge once you eyeball it.

Signed-off-by: deepthi <deepthi@planetscale.com>
…oops into 1 and remove use of channel that is no longer needed

Signed-off-by: deepthi <deepthi@planetscale.com>
… use one tmc for all calls to RefreshState

Signed-off-by: deepthi <deepthi@planetscale.com>
@enisoc (Member) left a comment

lgtm
