Migration stuck before cutover due to binlog streamer error #1603

@grodowski

Description

Result of troubleshooting a stuck migration with the following symptoms:

  • running, copy at 100%
  • heartbeat lag growing without bound
  • no errors or crashes

The gh-ost logs showed the following between the last "healthy" progress log and the one where heartbeat lag started to grow:

[gh-ost] : 2025-10-31 21:00:37 INFO rotate to next log from binlog.1000000:0 to binlog.1000000
[gh-ost] : 2025-10-31 21:00:37 INFO rotate to next log from binlog.1000000:104865545 to binlog.1000000
[gh-ost] : 2025-10-31 20:59:52 INFO rotate to next log from binlog.999999:0 to binlog.999999
[gh-ost] : 2025-10-31 20:59:52 INFO rotate to next log from binlog.999999:0 to binlog.999999
...
[gh-ost] : 2025-10-31 20:59:52 INFO rotate to next log from binlog.999999:0 to binlog.999999
[gh-ost] : 2025-10-31 20:59:52 INFO rotate to next log from binlog.999999:104997866 to binlog.999999
[gh-ost] : 2025-10-31 20:59:52 INFO rotate to next log from binlog.999999:0 to binlog.999999
[gh-ost] : [2025/10/31 20:59:52] [info] binlogsyncer.go:868 rotate to (binlog.999999, 4)

This confirmed my hypothesis that the streamer erroneously started dropping all new events. The current SmallerThan implementation compares binlog file names lexicographically, so it treats binlog.999999 as greater than binlog.1000000, and every event in the new file is considered already handled by this check:

if this.currentCoordinates.SmallerThanOrEquals(&this.LastAppliedRowsEventHint) {
	this.migrationContext.Log.Debugf("Skipping handled query at %+v", this.currentCoordinates)
	return nil
}
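
To illustrate the comparison problem described above, here is a minimal, self-contained sketch. It is not the gh-ost code and not necessarily the fix that will land in the PR: binlogSequence, lexSmaller and numSmaller are hypothetical helpers, and the sketch assumes file names of the form <basename>.<number> that share the same basename.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// binlogSequence extracts the numeric suffix of a binlog file name,
// e.g. "binlog.999999" -> 999999. Hypothetical helper for illustration.
func binlogSequence(name string) (uint64, error) {
	idx := strings.LastIndex(name, ".")
	if idx < 0 || idx == len(name)-1 {
		return 0, fmt.Errorf("no numeric suffix in %q", name)
	}
	return strconv.ParseUint(name[idx+1:], 10, 64)
}

// lexSmaller is a plain string comparison of file names, which is the
// behavior the issue describes as problematic.
func lexSmaller(a, b string) bool {
	return a < b
}

// numSmaller compares file names by their numeric suffix. If either
// suffix cannot be parsed, it falls back to the string comparison.
// It assumes both names share the same basename.
func numSmaller(a, b string) bool {
	seqA, errA := binlogSequence(a)
	seqB, errB := binlogSequence(b)
	if errA != nil || errB != nil {
		return a < b
	}
	return seqA < seqB
}

func main() {
	prev, next := "binlog.999999", "binlog.1000000"

	// '1' sorts before '9', so the newer file looks "smaller".
	fmt.Println(lexSmaller(next, prev)) // true

	// Comparing the parsed sequence numbers gives the intended order.
	fmt.Println(numSmaller(next, prev)) // false
}
```

With the string comparison, coordinates in binlog.1000000 compare as smaller than the last applied coordinates in binlog.999999, which matches the symptom of every new event being skipped as already handled.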

Please note that BinlogFile has been changed recently (005043d#diff-0b91aa3798ba83a920a77a09b6adf3bfdffbf3cf5f22e323b66753f7affb8ebd), but I think it still makes sense to apply the proposed fix.

Opening a PR shortly ⌛
