dm validation binlog position is not accurate and not updated frequently enough #8463

Open
hihihuhu opened this issue Mar 7, 2023 · 13 comments
Labels
type/feature Issues about a new feature

Comments

@hihihuhu
Contributor

hihihuhu commented Mar 7, 2023

Is your feature request related to a problem?

  1. The current flushed binlog position for validation is recorded in the validator after it dispatches the event (https://github.com/pingcap/tiflow/blob/master/dm/syncer/data_validator.go#L989). However, that doesn't mean the worker has picked the event up and updated the pending count, since that happens in another thread. So "binlog position in the validator >= a specific binlog position and pending count = 0" doesn't guarantee the downstream is fully caught up with the upstream (see the sketch below this list).

  2. Also, the frequency at which the position is updated is currently tied to the meta flush interval, which is too infrequent. To use the validation information to determine whether the downstream has caught up with the upstream during a failover, the binlog position needs to be updated in near real time to avoid extended downtime. This should be entirely possible, since it is just in-memory metadata.
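
To make item 1 concrete, here is a minimal, self-contained Go sketch of the dispatch pattern described there (hypothetical names and a simplified model, not the actual DM code): the dispatcher records the position as soon as it hands the event off, so a "position reached and pending == 0" check can pass before the worker has even counted the event.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// Simplified, hypothetical model of the validator/worker split described
// above; not the actual DM code.
type validator struct {
	recordedPos int64 // position recorded by the dispatching goroutine
	pending     int64 // counter maintained by the worker goroutine
	jobs        chan int64
}

func (v *validator) dispatch(pos int64) {
	v.jobs <- pos                          // hand the event to the worker
	atomic.StoreInt64(&v.recordedPos, pos) // position moves forward immediately
}

func (v *validator) worker() {
	for pos := range v.jobs {
		atomic.AddInt64(&v.pending, 1) // counted only once the worker picks it up
		validate(pos)
		atomic.AddInt64(&v.pending, -1)
	}
}

// validate stands in for checking the row change against the downstream.
func validate(int64) { time.Sleep(10 * time.Millisecond) }

func main() {
	v := &validator{jobs: make(chan int64, 1024)}
	go v.worker()

	v.dispatch(100)
	// This check can pass while the event for pos=100 still sits in the
	// channel, i.e. before the worker has counted it as pending, let alone
	// validated it against the downstream.
	if atomic.LoadInt64(&v.recordedPos) >= 100 && atomic.LoadInt64(&v.pending) == 0 {
		fmt.Println("looks caught up, but pos=100 may not be validated yet")
	}
}
```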

Describe the feature you'd like

For 1, increment the pending event counter in the data validator instead of in the worker thread.
For 2, rename flushedLoc to currentLoc and update it every time an event is dispatched.
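
A rough sketch of what this could look like, reusing the same hypothetical model as above (currentLoc and the other names are illustrative, not the actual DM fields):

```go
package main

import "sync/atomic"

// Hypothetical sketch of the suggested change (not the actual DM code): count
// the event as pending and advance an in-memory currentLoc at dispatch time,
// so "currentLoc >= X && pending == 0" only holds once every dispatched event
// has also been validated. flushedLoc would keep its checkpoint meaning.
type validator struct {
	currentLoc int64 // advanced on every dispatched event
	pending    int64 // incremented by the dispatcher, decremented by workers
	jobs       chan int64
}

func (v *validator) dispatch(pos int64) {
	atomic.AddInt64(&v.pending, 1)        // point 1: count in the dispatcher
	atomic.StoreInt64(&v.currentLoc, pos) // point 2: real-time in-memory position
	v.jobs <- pos
}

func (v *validator) worker() {
	for range v.jobs {
		// ... validate the row change against the downstream ...
		atomic.AddInt64(&v.pending, -1)
	}
}

func main() {
	v := &validator{jobs: make(chan int64, 1024)}
	go v.worker()
	v.dispatch(100)
}
```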

Describe alternatives you've considered

No response

Teachability, Documentation, Adoption, Migration Strategy

No response

@hihihuhu hihihuhu added the type/feature Issues about a new feature label Mar 7, 2023
@lance6716
Contributor

/cc @D3Hunter @buchuitoudegou

@D3Hunter
Copy link
Contributor

D3Hunter commented Mar 8, 2023

The binlog position printed in the validator status is the position the validator has processed up to, so if it is >= a specific binlog position A and the pending count = 0, it DOES guarantee the downstream is fully caught up with the upstream at position A.

The validator checkpoint includes not only the position but also pending data that may need to be retried. Since the pending data can be quite large, depending on how data changes on the upstream, flushing the checkpoint too frequently may slow the validator down and put more pressure on the downstream, so the interval is set to a fairly large value (5 minutes).
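
For readers following along, a conceptual sketch of what such a checkpoint carries (illustrative only; the real DM schema and structs differ), which is why flushing it often can be expensive:

```go
package main

import "fmt"

// pendingRow is an unvalidated or failed row that must be retried after a
// restart; the pending set can grow large under heavy upstream writes.
type pendingRow struct {
	Schema, Table string
	Key           string // primary-key value identifying the row
	Failures      int
}

// validatorCheckpoint is a conceptual sketch, not the actual DM structure.
type validatorCheckpoint struct {
	BinlogFile  string // position the validator has processed up to
	BinlogPos   uint32
	PendingRows []pendingRow
}

func main() {
	cp := validatorCheckpoint{
		BinlogFile:  "mysql-bin.000042",
		BinlogPos:   19420,
		PendingRows: []pendingRow{{Schema: "shop", Table: "orders", Key: "1001", Failures: 2}},
	}
	fmt.Printf("%+v\n", cp)
}
```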

@hihihuhu
Contributor Author

hihihuhu commented Mar 8, 2023

Thanks for the reply! Could you help me understand this better? My understanding is that processRowsEvent (https://github.com/pingcap/tiflow/blob/master/dm/syncer/data_validator.go#L694) dispatches the event, then the worker thread consumes the mutation asynchronously (https://github.com/pingcap/tiflow/blob/master/dm/syncer/validate_worker.go#L118) and increases the pending event count. So isn't it possible that the data validator goes ahead and records the binlog position while the validation worker is very slow and hasn't picked up the mutation yet, which would break the assumption you mentioned?

I agree the checkpoint should not be flushed frequently. However, the frequency of updating the in-memory binlog position doesn't have to be the same as that of flushing the checkpoint, does it? All other fields in the status are updated in real time, like the pending count, but not the binlog position.

@D3Hunter
Contributor

D3Hunter commented Mar 9, 2023

validator.flushedLoc is the location we saved to the checkpoint last time; all events before it have either been validated or are in the checkpoint's pending data waiting to be retried.

```go
flushedLoc *binlog.Location
```

> while the validation worker is very slow and hasn't picked up the mutation yet

Before flushing, we dispatch a flush job to all workers and wait until every worker has processed all jobs before that location.

> data validator goes ahead and record the binlog position

This position is stored in the validator loop. currEndLoc moves forward every time we receive an event; locationForFlush changes only when we hit a transaction boundary. Those two variables reflect the location we have received or dispatched, not the location the workers have processed.

```go
currEndLoc := v.streamerController.GetCurEndLocation()
locationForFlush = v.streamerController.GetTxnEndLocation()
```
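
A minimal sketch of the flush-barrier idea described above, with hypothetical names (the real logic lives in data_validator.go and validate_worker.go): a marker job is sent to every worker, and the location is persisted as flushedLoc only after all of them have drained everything dispatched before the marker.

```go
package main

import (
	"fmt"
	"sync"
)

// job is either a row change to validate or a flush marker.
type job struct {
	flush bool
	pos   int64
	done  *sync.WaitGroup // signalled by the worker when the flush marker is reached
}

func worker(jobs <-chan job) {
	for j := range jobs {
		if j.flush {
			// Everything dispatched to this worker before the marker has
			// already been handled here, because the channel is FIFO.
			j.done.Done()
			continue
		}
		// ... validate the row change at j.pos against the downstream ...
	}
}

func main() {
	const workers = 4
	chans := make([]chan job, workers)
	for i := range chans {
		chans[i] = make(chan job, 1024)
		go worker(chans[i])
	}

	// Dispatch some row-change jobs (round-robin for the sketch).
	for pos := int64(1); pos <= 100; pos++ {
		chans[pos%workers] <- job{pos: pos}
	}

	// Flush barrier: send a marker to every worker and wait for all of them.
	var wg sync.WaitGroup
	wg.Add(workers)
	for _, c := range chans {
		c <- job{flush: true, done: &wg}
	}
	wg.Wait()

	// Only now is it safe to persist flushedLoc = last dispatched location.
	fmt.Println("all workers drained; persist flushedLoc here")
}
```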

@hihihuhu
Contributor Author

hihihuhu commented Mar 9, 2023

Oh, I see, thanks, that makes sense!

Then I think the question is whether it is possible to update the binlog position closer to real time. This is critical for us because during the Aurora -> TiDB migration we will cut all Aurora connections, wait for TiDB to catch up (binlog position + pending event check in validation), and then route traffic to TiDB. So if updating the binlog position relies on flushing the checkpoint, that means very long downtime during the failover process.
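
To make that concrete, this is the kind of wait loop our migration tooling would run during cutover; getValidatorStatus is a placeholder for however the status is fetched (query-status / OpenAPI), not an existing DM call, and the loop only gives short downtime if the reported position is near real time.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Hypothetical status shape; in reality the values would come from
// query-status or the DM OpenAPI, which this sketch does not model.
type validatorStatus struct {
	BinlogFile string
	BinlogPos  uint32
	Pending    int
}

// getValidatorStatus is a placeholder for however the tool fetches status.
var getValidatorStatus func(ctx context.Context) (validatorStatus, error)

// reached reports whether status s has passed the target (file, pos).
// Simplified: assumes binlog file names share one base name and sort lexically.
func reached(s validatorStatus, file string, pos uint32) bool {
	if s.BinlogFile != file {
		return s.BinlogFile > file
	}
	return s.BinlogPos >= pos
}

// waitCaughtUp polls until the validator has passed the target position with
// no pending rows, or the context deadline is hit.
func waitCaughtUp(ctx context.Context, file string, pos uint32) error {
	tick := time.NewTicker(200 * time.Millisecond)
	defer tick.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-tick.C:
			s, err := getValidatorStatus(ctx)
			if err != nil {
				return err
			}
			if reached(s, file, pos) && s.Pending == 0 {
				return nil
			}
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	getValidatorStatus = func(context.Context) (validatorStatus, error) {
		return validatorStatus{}, errors.New("wire this up to query-status / OpenAPI")
	}
	fmt.Println(waitCaughtUp(ctx, "mysql-bin.000042", 4))
}
```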

do you think we could improve this? thanks!

@D3Hunter
Contributor

Does the switch process look like this:

  • Stop writes to Aurora and get the binlog position
  • Wait for DM to sync data to that position (about 30s with the default config, to flush the syncer checkpoint). Forcibly killing connections won't work since the syncer relies on a transaction boundary or heartbeat event to trigger the checkpoint flush
    • and since the DM validator never validates ahead of the syncer's progress, it cannot reach that position before the syncer does
  • Wait until the DM validator has reached the position and there is no pending data
  • Switch to TiDB

and what's the max allowed downtime in your case?

It can be done; the question is just how to do it in a good way.

@hihihuhu
Contributor Author

hihihuhu commented Mar 10, 2023

Yes, that is pretty much the flow at a high level. One more thing in our system is that we have a proxy in front, so we can kill connections and switch backends in it at runtime. Currently our Aurora -> Aurora failover has 2-3 seconds of downtime, so it would be hard for us if the downtime of Aurora -> TiDB were longer than perhaps 10 seconds. So right now we are shooting for less than 10 seconds on the happy path.

In our testing we set checkpoint-flush-interval to 1s, but we cannot reduce meta-flush-interval to less than 1 hour, especially when the job has just started, because there would be many mutations to flush out for validation.

@D3Hunter
Contributor

One way to achieve this might be (a rough sketch follows the list):

  • After the upstream stops writes, send a cutover command to the DM job
  • Then DM gets the current master binlog position, and the syncer (since the validator never validates ahead of the syncer's progress, we need it to reach the master position too) and the validator compare their current progress against the master position at some small interval (maybe in another goroutine)
  • When the syncer/validator progress >= the master position, flush the checkpoint and exit this mode.
  • Then query-status will show that the validator progress has reached the master position
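
A rough sketch of such a cutover mode, where every name (position, runCutover, flushCheckpoint) is a placeholder rather than existing DM code:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// position is a simplified (file, offset) binlog position for this sketch.
type position struct {
	file string
	pos  uint32
}

func (p position) less(o position) bool {
	if p.file != o.file {
		return p.file < o.file // same base name assumed; simplified comparison
	}
	return p.pos < o.pos
}

// runCutover polls the component's progress at a small interval and, once it
// has reached the captured master position, flushes the checkpoint and exits
// cutover mode so that query-status reflects the final position.
func runCutover(ctx context.Context, masterPos position,
	progress func() position, flushCheckpoint func(context.Context) error) error {
	tick := time.NewTicker(100 * time.Millisecond)
	defer tick.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-tick.C:
			if !progress().less(masterPos) {
				return flushCheckpoint(ctx)
			}
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	master := position{file: "mysql-bin.000042", pos: 19420}
	cur := position{file: "mysql-bin.000042", pos: 0}
	err := runCutover(ctx, master,
		func() position { cur.pos += 10000; return cur }, // stand-in for real progress
		func(context.Context) error { fmt.Println("flush checkpoint"); return nil })
	if err != nil && !errors.Is(err, context.DeadlineExceeded) {
		fmt.Println("cutover error:", err)
	}
}
```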

@hihihuhu
Contributor Author

hihihuhu commented Apr 14, 2023

Yes, I think that could work as well, thanks! Also, it would be great if the cutover API were part of https://github.com/pingcap/tiflow/blob/master/dm/pb/dmmaster.pb.go#L3756, which we use programmatically in our migration tools.

@okJiang
Member

okJiang commented Sep 26, 2023

If there are still problems, you can reopen this issue.

@okJiang okJiang closed this as completed Sep 26, 2023
@hihihuhu
Contributor Author

@okJiang could you help link the PR implementing this request, and perhaps add examples of how to use it? Thanks!

@okJiang
Member

okJiang commented Sep 27, 2023

I carefully reviewed the previous Q&A and found that the answers already address the original questions, but you still want to be able to use the API directly. I think this is still a feature request that has yet to be implemented, so I am reopening this issue. Sorry for accidentally closing it @hihihuhu

@okJiang okJiang reopened this Sep 27, 2023
@Frank945946

@hihihuhu It has been released in v7.5; see the user doc.
