Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rare race condition: Copying and flushing backup files at the same time #17065

Open
frouioui opened this issue Oct 25, 2024 · 0 comments
Open

Comments

@frouioui
Copy link
Member

Overview of the Issue

While working on #17062 I have discovered a rare race condition that happens when we fail to write to the destination file. Which is what the test introduced in #17062 does.

In be.backupFile we have a createAndCopy function that will copy one file (reader) to another (writer). In this function we call io.Copy(writer, reader) which in turns call pargzip.Writer.init, this init method starts a goroutine that writes chunks. While this goroutine is non-blocking, we go up the stack and go back to createAndCopy, which returns and ends up calling its defer function, in this defer we flush the same writer that is called in the goroutine instanced by pargzip.Writer.init leading to a data race.

Full stack trace:

=== RUN   TestExecuteBackupWithFailureOnLastFile
==================
WARNING: DATA RACE
Write at 0x00c0004c21e8 by goroutine 48:
  bufio.(*Writer).Write()
      /Users/florentpoinsard/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.2.darwin-arm64/src/bufio/bufio.go:692 +0x2c8
  vitess.io/vitess/go/vt/mysqlctl.(*backupPipe).Write()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:760 +0x54
  github.com/planetscale/pargzip.(*Writer).writeCompressedChunk()
      /Users/florentpoinsard/go/pkg/mod/github.com/planetscale/pargzip@v0.0.0-20201116224723-90c7fc03ea8a/pargzip.go:172 +0x15c
  github.com/planetscale/pargzip.(*Writer).init.func1()
      /Users/florentpoinsard/go/pkg/mod/github.com/planetscale/pargzip@v0.0.0-20201116224723-90c7fc03ea8a/pargzip.go:130 +0xd8

Previous read at 0x00c0004c21e8 by goroutine 38:
  bufio.(*Writer).Flush()
      /Users/florentpoinsard/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.2.darwin-arm64/src/bufio/bufio.go:636 +0x48
  vitess.io/vitess/go/vt/mysqlctl.(*backupPipe).Close()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:770 +0x84
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFile.func4.1()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:875 +0x54
  runtime.deferreturn()
      /Users/florentpoinsard/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.2.darwin-arm64/src/runtime/panic.go:605 +0x5c
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFile()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:925 +0xddc
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFiles.func1()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:646 +0x58c
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFiles.gowrap1()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:651 +0x44

Goroutine 48 (running) created at:
  github.com/planetscale/pargzip.(*Writer).init()
      /Users/florentpoinsard/go/pkg/mod/github.com/planetscale/pargzip@v0.0.0-20201116224723-90c7fc03ea8a/pargzip.go:127 +0x2a0
  github.com/planetscale/pargzip.(*Writer).Write()
      /Users/florentpoinsard/go/pkg/mod/github.com/planetscale/pargzip@v0.0.0-20201116224723-90c7fc03ea8a/pargzip.go:178 +0x50
  io.Writer.Write-fm()
      <autogenerated>:1 +0x60
  vitess.io/vitess/go/ioutil.(*meter).measure()
      /Users/florentpoinsard/Code/vitess/go/ioutil/meter.go:47 +0x5c
  vitess.io/vitess/go/ioutil.(*meteredWriter).Write()
      /Users/florentpoinsard/Code/vitess/go/ioutil/writer.go:89 +0xb0
  io.copyBuffer()
      /Users/florentpoinsard/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.2.darwin-arm64/src/io/io.go:431 +0x25c
  io.Copy()
      /Users/florentpoinsard/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.2.darwin-arm64/src/io/io.go:388 +0x97c
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFile.func4()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:918 +0x118
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFile()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:925 +0xddc
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFiles.func1()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:646 +0x58c
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFiles.gowrap1()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:651 +0x44

Goroutine 38 (finished) created at:
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFiles()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:611 +0x534
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).executeFullBackup()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:496 +0xfcc
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).ExecuteBackup()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:220 +0x3bc
  vitess.io/vitess/go/vt/mysqlctl_test.TestExecuteBackupWithFailureOnLastFile()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/backup_blackbox_race_test.go:126 +0x117c
  testing.tRunner()
      /Users/florentpoinsard/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.2.darwin-arm64/src/testing/testing.go:1690 +0x184
  testing.(*T).Run.gowrap1()
      /Users/florentpoinsard/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.2.darwin-arm64/src/testing/testing.go:1743 +0x40
==================
==================
WARNING: DATA RACE
Write at 0x00c000592b90 by goroutine 48:
  hash/crc32.(*digest).Write()
      /Users/florentpoinsard/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.2.darwin-arm64/src/hash/crc32/crc32.go:219 +0x7c
  vitess.io/vitess/go/vt/mysqlctl.(*backupPipe).Write()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:761 +0x9c
  github.com/planetscale/pargzip.(*Writer).writeCompressedChunk()
      /Users/florentpoinsard/go/pkg/mod/github.com/planetscale/pargzip@v0.0.0-20201116224723-90c7fc03ea8a/pargzip.go:172 +0x15c
  github.com/planetscale/pargzip.(*Writer).init.func1()
      /Users/florentpoinsard/go/pkg/mod/github.com/planetscale/pargzip@v0.0.0-20201116224723-90c7fc03ea8a/pargzip.go:130 +0xd8

Previous read at 0x00c000592b90 by goroutine 38:
  hash/crc32.(*digest).Sum32()
      /Users/florentpoinsard/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.2.darwin-arm64/src/hash/crc32/crc32.go:223 +0x48
  hash/crc32.(*digest).Sum()
      /Users/florentpoinsard/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.2.darwin-arm64/src/hash/crc32/crc32.go:226 +0x44
  vitess.io/vitess/go/vt/mysqlctl.(*backupPipe).HashString()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:779 +0x54
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFile()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:930 +0xe50
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFiles.func1()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:646 +0x58c
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFiles.gowrap1()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:651 +0x44

Goroutine 48 (running) created at:
  github.com/planetscale/pargzip.(*Writer).init()
      /Users/florentpoinsard/go/pkg/mod/github.com/planetscale/pargzip@v0.0.0-20201116224723-90c7fc03ea8a/pargzip.go:127 +0x2a0
  github.com/planetscale/pargzip.(*Writer).Write()
      /Users/florentpoinsard/go/pkg/mod/github.com/planetscale/pargzip@v0.0.0-20201116224723-90c7fc03ea8a/pargzip.go:178 +0x50
  io.Writer.Write-fm()
      <autogenerated>:1 +0x60
  vitess.io/vitess/go/ioutil.(*meter).measure()
      /Users/florentpoinsard/Code/vitess/go/ioutil/meter.go:47 +0x5c
  vitess.io/vitess/go/ioutil.(*meteredWriter).Write()
      /Users/florentpoinsard/Code/vitess/go/ioutil/writer.go:89 +0xb0
  io.copyBuffer()
      /Users/florentpoinsard/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.2.darwin-arm64/src/io/io.go:431 +0x25c
  io.Copy()
      /Users/florentpoinsard/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.2.darwin-arm64/src/io/io.go:388 +0x97c
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFile.func4()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:918 +0x118
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFile()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:925 +0xddc
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFiles.func1()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:646 +0x58c
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFiles.gowrap1()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:651 +0x44

Goroutine 38 (finished) created at:
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).backupFiles()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:611 +0x534
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).executeFullBackup()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:496 +0xfcc
  vitess.io/vitess/go/vt/mysqlctl.(*BuiltinBackupEngine).ExecuteBackup()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/builtinbackupengine.go:220 +0x3bc
  vitess.io/vitess/go/vt/mysqlctl_test.TestExecuteBackupWithFailureOnLastFile()
      /Users/florentpoinsard/Code/vitess/go/vt/mysqlctl/backup_blackbox_race_test.go:126 +0x117c
  testing.tRunner()
      /Users/florentpoinsard/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.2.darwin-arm64/src/testing/testing.go:1690 +0x184
  testing.(*T).Run.gowrap1()
      /Users/florentpoinsard/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.2.darwin-arm64/src/testing/testing.go:1743 +0x40
==================
    testing.go:1399: race detected during execution of test
--- FAIL: TestExecuteBackupWithFailureOnLastFile (0.07s)

Reproduction Steps

Run the test in d04aec6 (#17062) with -race

Binary Version

n/a

Operating System and Environment details

n/a

Log Fragments

No response

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant