Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix azcopy on architectures that are strict about alignment (ARM) #1567

Closed
wants to merge 1 commit into from
Closed

Fix azcopy on architectures that are strict about alignment (ARM) #1567

wants to merge 1 commit into from

Conversation

jepio
Copy link
Member

@jepio jepio commented Sep 23, 2021

This PR fixes azcopy on ARM architectures (tested on ARM64) by ensuring that JobPartPlanTransfer is properly 8 byte aligned inside the plan file. JobPartPlanTransfer is located after a variable size string and so its alignment was dependent on what was being copied.

Tested using go 1.16.2 and GOARCH=arm64 go build.

Fixes #972
Fixes #882
Fixes #427

The current version of azcopy dies on ARM systems with a SIGBUS and the following stacktrace:

  INFO: Scanning...
  INFO: Any empty folders will not be processed, because source and/or destination doesn't have full folder support
  unexpected fault address 0xffff72640a5f
  fatal error: fault
  [signal SIGBUS: bus error code=0x1 addr=0xffff72640a5f pc=0x733d8c]

  goroutine 25 [running]:
  runtime.throw(0x9de596, 0x5)
          /usr/lib/go-1.16/src/runtime/panic.go:1117 +0x54 fp=0x4000d982f0 sp=0x4000d982c0 pc=0x43914
  runtime.sigpanic()
          /usr/lib/go-1.16/src/runtime/signal_unix.go:731 +0x284 fp=0x4000d98330 sp=0x4000d982f0 pc=0x5b214
  github.com/Azure/azure-storage-azcopy/v10/common.(*TransferStatus).AtomicLoad(...)
          /home/jeremi/github/azure-storage-azcopy-2/common/fe-ste-models.go:690
  github.com/Azure/azure-storage-azcopy/v10/ste.(*JobPartPlanTransfer).TransferStatus(...)
          /home/jeremi/github/azure-storage-azcopy-2/ste/JobPartPlan.go:377
  github.com/Azure/azure-storage-azcopy/v10/ste.(*jobPartMgr).ScheduleTransfers(0x4000adc000, 0xb54930, 0x4000ad2000)
          /home/jeremi/github/azure-storage-azcopy-2/ste/mgr-JobPartMgr.go:396 +0x42c fp=0x4000d99f20 sp=0x4000d98340 pc=0x733d8c
  github.com/Azure/azure-storage-azcopy/v10/ste.(*jobsAdmin).scheduleJobParts(0x4000496000)
          /home/jeremi/github/azure-storage-azcopy-2/ste/JobsAdmin.go:287 +0x44 fp=0x4000d99fd0 sp=0x4000d99f20 pc=0x724ee4
  runtime.goexit()
          /usr/lib/go-1.16/src/runtime/asm_arm64.s:1130 +0x4 fp=0x4000d99fd0 sp=0x4000d99fd0 pc=0x78a44
  created by github.com/Azure/azure-storage-azcopy/v10/ste.initJobsAdmin
          /home/jeremi/github/azure-storage-azcopy-2/ste/JobsAdmin.go:212 +0x5ec

What happens is that a JobPartPlanTransfer struct is created inside a memory
mapped 'plan' file and there are several members of that struct that are
accessed atomically. This requires that those members have the natural
alignment for their type, which is guaranteed at the struct level, but the
struct itself is not properly aligned inside the plan file. JobPartPlanTransfer
is located after the fixed size JobPartPlanHeader struct and a variable size
'CommandString'. It is that variable sized string that breaks the whole
alignment (most of the time). x86 doesn't care but ARM does not like it
one bit and SIGBUS.

To remedy this, ensure that the start of the first JobPartPlanTransfer is 8
byte aligned both when writing the file and when accessing it. This makes
azcopy work on ARM64 systems that I have tested.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
@mohsha-msft
Copy link
Contributor

Running tests to check for regression.

@mohsha-msft
Copy link
Contributor

mohsha-msft commented Sep 27, 2021

Hey @jepio ,

Thanks for your contribution.

If you wished to compile AzCopy for Linux with 64-bit ARM, consider adding GOARCH=arm64 GOOS=linux go build -o "$(Build.ArtifactStagingDirectory)/azcopy_linux_arm64.exe" here -


and here -

@jepio
Copy link
Member Author

jepio commented Sep 27, 2021

Hi @mohsha-msft,
I don't use Windows. If you want me to add the lines let me know.

@mohsha-msft
Copy link
Contributor

I can add the lines on your behalf. 👍

@mohsha-msft
Copy link
Contributor

mohsha-msft commented Sep 27, 2021

Not causing any regressions. Need to check why E2E tests are failing on ARM.
BuildLogs

Branch with changes in azure-pipeline.yml : mohsha-msft/branch-from-jepio-arm

@jepio
Copy link
Member Author

jepio commented Sep 27, 2021

I don't think your CI has arm64 nodes, it might be able to build the executables but not test them.

For legacy arm I have another PR that I plan to submit, but it requires this jiacfan/keyctl#4 PR in a dependency first.

@mohsha-msft mohsha-msft added the pr-on-hold Use on PRs to indicate ones that shouldn't be reviewed right now because they're blocked on somethin label Sep 27, 2021
@mohsha-msft
Copy link
Contributor

Hey @jepio ,

We've forked this repo here
I've picked your changes and merged them here
Please let me know if there is any other dependency you wish to merge in order to make ARM work.

@mohsha-msft mohsha-msft added testing-required and removed pr-on-hold Use on PRs to indicate ones that shouldn't be reviewed right now because they're blocked on somethin labels Sep 30, 2021
@hss83
Copy link

hss83 commented Sep 30, 2021

Thanks, this fixes the issue for me #882

@jepio
Copy link
Member Author

jepio commented Oct 6, 2021

Replaced by #1577

@jepio jepio closed this Oct 6, 2021
@jepio jepio deleted the fix-azcopy-on-arm64 branch October 13, 2021 08:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SIGBUS error in TransferStatus on aarch64 AzCopy 10.3.4 segfault on ARM AzCopy on ARM/Synology crashes
3 participants