criu dump will block at arm64? #150

Closed
datawolf opened this issue Apr 22, 2016 · 14 comments

@datawolf
Contributor

Hi all,

I built criu on arm64. The version is:

ubuntu@ubuntu:~/criu$ git describe
v2.1-115-g55fa530

and tested criu with the following steps:

root@ubuntu:/home/ubuntu/criu-test# setsid ./test.sh  < /dev/null &> test.log &
[1] 15916
root@ubuntu:/home/ubuntu/criu-test# ps -C test.sh
  PID TTY          TIME CMD
15917 ?        00:00:00 test.sh
[1]+  Done                    setsid ./test.sh < /dev/null &> test.log
root@ubuntu:/home/ubuntu/criu-test# criu dump -t 15917 -vvv -o dump.log && echo OK

**It blocks here.**

The tail of dump.log is:

(00.007494) 0x7f860b3000-0x7f860b4000 (4K) prot 0x5 flags 0x22 st 0x209 off 0 reg vdso ap  shmid: 0
(00.007499) 0x7f860b4000-0x7f860b5000 (4K) prot 0x5 flags 0x2 st 0x41 off 0x1c000 reg fp  shmid: 0
(00.007504) 0x7f860b5000-0x7f860b7000 (8K) prot 0x7 flags 0x2 st 0x41 off 0x1d000 reg fp  shmid: 0
(00.007508) 0x7fe20f4000-0x7fe2116000 (136K) prot 0x7 flags 0x122 st 0x201 off 0 reg ap  shmid: 0
(00.007513) ----------------------------------------
(00.007521)
(00.007524) Collecting fds (pid: 15917)
(00.007528) ----------------------------------------
(00.007557) Found 4 file descriptors
(00.007563) ----------------------------------------
(00.007715) Set up parasite blob using memfd
(00.007723) Putting parasite blob into 0x7f9254f000->0x7f85f34000
(00.007745) Dumping GP/FPU registers for 15917
(00.007757) Putting tsock into pid 15917

The tail of the strace output for criu dump -t 15917 -vvv -o dump.log && echo OK is:

write(1023, "(00.027384) Set up parasite blob"..., 45) = 45
write(1023, "(00.027428) Putting parasite blo"..., 66) = 66
write(1023, "(00.027496) Dumping GP/FPU regis"..., 47) = 47
ptrace(PTRACE_GETREGSET, 15917, NT_PRSTATUS, [{0x7fda97e680, 272}]) = 0
ptrace(PTRACE_GETREGSET, 15917, NT_FPREGSET, [{0x7fda97e420, 528}]) = 0
write(1023, "(00.027620) Putting tsock into p"..., 41) = 41
bind(6, {sa_family=AF_LOCAL, sun_path=@"/crtools-pr-16912"}, 20) = 0
listen(6, 1)                            = 0
rt_sigaction(SIGCHLD, {0x46041c, [CHLD], SA_RESTART|SA_SIGINFO}, NULL, 8) = 0
ptrace(0x420b /* PTRACE_??? */, 15917, 0x8, 0x7fda97e4a0) = 0
ptrace(PTRACE_SETREGSET, 15917, NT_PRSTATUS, [{0x7fda97e528, 272}]) = 0
ptrace(PTRACE_CONT, 15917, 0, SIG_0)    = 0
accept(6,

It blocks in the accept system call.

Does anyone have ideas on how this problem can be solved?
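
One way to read the trace above: criu binds and listens on a Unix socket, resumes the task with PTRACE_CONT, and then waits in accept for a connection. If nothing ever connects to that socket (why that would happen is not visible in the log, so this is only an assumption), accept never returns. Below is a minimal Python sketch of that blocking behaviour, with a timeout added so that it terminates. The socket path and the function name are illustrative and are not taken from criu.

```python
import os
import socket
import tempfile


def wait_for_peer(timeout_s: float) -> str:
    """Listen on a Unix socket and wait up to timeout_s for a peer to connect."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")  # hypothetical socket path
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as srv:
            srv.bind(path)
            srv.listen(1)
            # Without a timeout, accept() would block indefinitely when no
            # peer connects, which matches the last line of the strace output.
            srv.settimeout(timeout_s)
            try:
                conn, _ = srv.accept()
            except socket.timeout:
                return "no peer connected"
            conn.close()
            return "peer connected"


if __name__ == "__main__":
    print(wait_for_peer(0.2))
```

Nothing connects to the socket in this sketch, so the call gives up once the timeout expires instead of hanging.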

@datawolf
Contributor Author

The same issue occurs on Raspberry Pi 2. Please refer to: https://lists.openvz.org/pipermail/criu/2016-April/027494.html

@xemul
Member

xemul commented Apr 22, 2016

OK, I've put Christopher Covington in Cc. He's now the one who ... kinda maintains the arm/aarch64 port.

xemul added the bug label Apr 22, 2016
@0x7f454c46
Member

Hey, @datawolf, could you please test whether it works with the following patch applied on top of the criu-dev branch?

https://lists.openvz.org/pipermail/criu/2016-April/027743.html

@datawolf
Contributor Author

Sorry, it does not work.

Here is the make dry-run output after applying the patch:

ubuntu@ubuntu:~/criu$ make --dry-run
make -r -R -f /home/ubuntu/criu/scripts/nmk/scripts/main.mk makefile=Makefile obj=images all
make[1]: Entering directory '/home/ubuntu/criu'
true
make[1]: Leaving directory '/home/ubuntu/criu'
make -r -R -f /home/ubuntu/criu/scripts/nmk/scripts/main.mk makefile=Makefile obj=images images/built-in.o
make[1]: Entering directory '/home/ubuntu/criu'
make[1]: 'images/built-in.o' is up to date.
make[1]: Leaving directory '/home/ubuntu/criu'
make -C criu all
make[1]: Entering directory '/home/ubuntu/criu/criu'
sh: 1: pkg-config: not found
make -r -R -f /home/ubuntu/criu/scripts/nmk/scripts/main.mk makefile=Makefile.syscalls obj=arch/aarch64 all
make[2]: Entering directory '/home/ubuntu/criu/criu'
true
make[2]: Leaving directory '/home/ubuntu/criu/criu'
make -r -R -f /home/ubuntu/criu/scripts/nmk/scripts/main.mk makefile=Makefile obj=arch/aarch64 all
make[2]: Entering directory '/home/ubuntu/criu/criu'
true
make[2]: Leaving directory '/home/ubuntu/criu/criu'
make -r -R -f /home/ubuntu/criu/scripts/nmk/scripts/main.mk makefile=Makefile.library obj=pie all
make[2]: Entering directory '/home/ubuntu/criu/criu'
true
make[2]: Leaving directory '/home/ubuntu/criu/criu'
make -r -R -f /home/ubuntu/criu/scripts/nmk/scripts/main.mk makefile=Makefile obj=pie all
make[2]: Entering directory '/home/ubuntu/criu/criu'
echo "  CC      "  arch/aarch64/restorer.o
gcc -c -O2 -g -Wall -Werror -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -iquote /home/ubuntu/criu/compel/include -iquote arch/aarch64/include -iquote /home/ubuntu/criu -iquote /home/ubuntu/criu/criu/include -fno-strict-aliasing -iquote /home/ubuntu/criu/criu/include -iquote /home/ubuntu/criu/images -iquote /home/ubuntu/criu/criu/pie -iquote /home/ubuntu/criu/criu/arch/aarch64 -iquote /home/ubuntu/criu/criu/arch/aarch64/include -iquote /home/ubuntu/criu/ -I/usr/include/libnl3 -DCR_NOGLIBC -fpie -Wa,--noexecstack -fno-stack-protector arch/aarch64/restorer.c -o arch/aarch64/restorer.o
echo "  CC      "  pie/restorer.o
gcc -c -O2 -g -Wall -Werror -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -iquote /home/ubuntu/criu/compel/include -iquote arch/aarch64/include -iquote /home/ubuntu/criu -iquote /home/ubuntu/criu/criu/include -fno-strict-aliasing -iquote /home/ubuntu/criu/criu/include -iquote /home/ubuntu/criu/images -iquote /home/ubuntu/criu/criu/pie -iquote /home/ubuntu/criu/criu/arch/aarch64 -iquote /home/ubuntu/criu/criu/arch/aarch64/include -iquote /home/ubuntu/criu/ -I/usr/include/libnl3 -DCR_NOGLIBC -fpie -Wa,--noexecstack -fno-stack-protector pie/restorer.c -o pie/restorer.o
echo "  LINK    "  pie/restorer.built-in.o
ld    -r -o pie/restorer.built-in.o  ./arch/aarch64/restorer.o pie/restorer.o ./arch/aarch64/syscalls.built-in.o
echo "  AR      "  pie/native.lib.a
ar -rcs   pie/native.lib.a   
echo "  GEN     "  pie/restorer.built-in.bin.o
ld -r -T pie/pie.lds-native.S -o pie/restorer.built-in.bin.o pie/restorer.built-in.o pie/native.lib.a
echo "  GEN     "  pie/restorer.built-in.bin
objcopy -O binary pie/restorer.built-in.bin.o pie/restorer.built-in.bin
echo "  GEN     "  pie/restorer-blob.h
/bin/bash pie/../../scripts/gen-offsets.sh pie/restorer restorer  > pie/restorer-blob.h
echo "  CC      "  arch/aarch64/parasite-head.o
gcc -c -O2 -g -Wall -Werror -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -iquote /home/ubuntu/criu/compel/include -iquote arch/aarch64/include -iquote /home/ubuntu/criu -iquote /home/ubuntu/criu/criu/include -D__ASSEMBLY__ arch/aarch64/parasite-head.S -o arch/aarch64/parasite-head.o
echo "  CC      "  pie/parasite.o
gcc -c -O2 -g -Wall -Werror -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -iquote /home/ubuntu/criu/compel/include -iquote arch/aarch64/include -iquote /home/ubuntu/criu -iquote /home/ubuntu/criu/criu/include -fno-strict-aliasing -iquote /home/ubuntu/criu/criu/include -iquote /home/ubuntu/criu/images -iquote /home/ubuntu/criu/criu/pie -iquote /home/ubuntu/criu/criu/arch/aarch64 -iquote /home/ubuntu/criu/criu/arch/aarch64/include -iquote /home/ubuntu/criu/ -I/usr/include/libnl3 -DCR_NOGLIBC -fpie -Wa,--noexecstack -fno-stack-protector pie/parasite.c -o pie/parasite.o
echo "  LINK    "  pie/native.built-in.o
ld    -r -o pie/native.built-in.o  ./arch/aarch64/parasite-head.o pie/parasite.o ./arch/aarch64/syscalls.built-in.o
echo "  GEN     "  pie/parasite-native.built-in.bin.o
ld -r -T pie/pie.lds-native.S -o pie/parasite-native.built-in.bin.o pie/native.built-in.o pie/native.lib.a
echo "  GEN     "  pie/parasite-native.built-in.bin
objcopy -O binary pie/parasite-native.built-in.bin.o pie/parasite-native.built-in.bin
echo "  GEN     "  pie/parasite-native-blob.h
/bin/bash pie/../../scripts/gen-offsets.sh pie/parasite-native parasite_native  > pie/parasite-native-blob.h
true
make[2]: Leaving directory '/home/ubuntu/criu/criu'
make -r -R -f /home/ubuntu/criu/scripts/nmk/scripts/main.mk makefile=Makefile.crtools obj=. all
make[2]: Entering directory '/home/ubuntu/criu/criu'
true
make[2]: Leaving directory '/home/ubuntu/criu/criu'
true
make[1]: Leaving directory '/home/ubuntu/criu/criu'
make -C lib all
make[1]: Entering directory '/home/ubuntu/criu/lib'
echo "  GEN     "  lib-py
make -C py all
make[2]: Entering directory '/home/ubuntu/criu/lib/py'
make -C images all
make[3]: Entering directory '/home/ubuntu/criu/lib/py/images'
protoc -I=/home/ubuntu/criu/images -I=/usr/include/ --python_out=./ /home/ubuntu/criu/images/autofs.proto /home/ubuntu/criu/images/binfmt-misc.proto /home/ubuntu/criu/images/cgroup.proto /home/ubuntu/criu/images/core-aarch64.proto /home/ubuntu/criu/images/core-arm.proto /home/ubuntu/criu/images/core-ppc64.proto /home/ubuntu/criu/images/core-x86.proto /home/ubuntu/criu/images/core.proto /home/ubuntu/criu/images/cpuinfo.proto /home/ubuntu/criu/images/creds.proto /home/ubuntu/criu/images/eventfd.proto /home/ubuntu/criu/images/eventpoll.proto /home/ubuntu/criu/images/ext-file.proto /home/ubuntu/criu/images/fdinfo.proto /home/ubuntu/criu/images/fh.proto /home/ubuntu/criu/images/fifo.proto /home/ubuntu/criu/images/file-lock.proto /home/ubuntu/criu/images/fown.proto /home/ubuntu/criu/images/fs.proto /home/ubuntu/criu/images/fsnotify.proto /home/ubuntu/criu/images/ghost-file.proto /home/ubuntu/criu/images/inventory.proto /home/ubuntu/criu/images/ipc-desc.proto /home/ubuntu/criu/images/ipc-msg.proto /home/ubuntu/criu/images/ipc-sem.proto /home/ubuntu/criu/images/ipc-shm.proto /home/ubuntu/criu/images/ipc-var.proto /home/ubuntu/criu/images/mm.proto /home/ubuntu/criu/images/mnt.proto /home/ubuntu/criu/images/netdev.proto /home/ubuntu/criu/images/ns.proto /home/ubuntu/criu/images/opts.proto /home/ubuntu/criu/images/packet-sock.proto /home/ubuntu/criu/images/pagemap.proto /home/ubuntu/criu/images/pipe-data.proto /home/ubuntu/criu/images/pipe.proto /home/ubuntu/criu/images/pstree.proto /home/ubuntu/criu/images/regfile.proto /home/ubuntu/criu/images/remap-file-path.proto /home/ubuntu/criu/images/rlimit.proto /home/ubuntu/criu/images/sa.proto /home/ubuntu/criu/images/seccomp.proto /home/ubuntu/criu/images/siginfo.proto /home/ubuntu/criu/images/signalfd.proto /home/ubuntu/criu/images/sk-inet.proto /home/ubuntu/criu/images/sk-netlink.proto /home/ubuntu/criu/images/sk-opts.proto /home/ubuntu/criu/images/sk-packet.proto /home/ubuntu/criu/images/sk-unix.proto 
/home/ubuntu/criu/images/stats.proto /home/ubuntu/criu/images/tcp-stream.proto /home/ubuntu/criu/images/time.proto /home/ubuntu/criu/images/timer.proto /home/ubuntu/criu/images/timerfd.proto /home/ubuntu/criu/images/tty.proto /home/ubuntu/criu/images/tun.proto /home/ubuntu/criu/images/userns.proto /home/ubuntu/criu/images/utsns.proto /home/ubuntu/criu/images/vma.proto
echo "# Autogenerated. Do not edit!" > pb.py
for m in autofs_pb2 binfmt_misc_pb2 cgroup_pb2 core_aarch64_pb2 core_arm_pb2 core_ppc64_pb2 core_x86_pb2 core_pb2 cpuinfo_pb2 creds_pb2 eventfd_pb2 eventpoll_pb2 ext_file_pb2 fdinfo_pb2 fh_pb2 fifo_pb2 file_lock_pb2 fown_pb2 fs_pb2 fsnotify_pb2 ghost_file_pb2 inventory_pb2 ipc_desc_pb2 ipc_msg_pb2 ipc_sem_pb2 ipc_shm_pb2 ipc_var_pb2 mm_pb2 mnt_pb2 netdev_pb2 ns_pb2 opts_pb2 packet_sock_pb2 pagemap_pb2 pipe_data_pb2 pipe_pb2 pstree_pb2 regfile_pb2 remap_file_path_pb2 rlimit_pb2 sa_pb2 seccomp_pb2 siginfo_pb2 signalfd_pb2 sk_inet_pb2 sk_netlink_pb2 sk_opts_pb2 sk_packet_pb2 sk_unix_pb2 stats_pb2 tcp_stream_pb2 time_pb2 timer_pb2 timerfd_pb2 tty_pb2 tun_pb2 userns_pb2 utsns_pb2 vma_pb2; do \
    echo "from $m import *" >> pb.py ;\
done
make[3]: Leaving directory '/home/ubuntu/criu/lib/py/images'
make[2]: Leaving directory '/home/ubuntu/criu/lib/py'
true
make[1]: Leaving directory '/home/ubuntu/criu/lib'

@datawolf
Contributor Author

Maybe something is wrong with the patch. When I change

+ifeq ($(filter arm arm64,$(ARCH)),)

to

+ifneq ($(filter arm arm64,$(ARCH)),)

the command criu dump -t 11208 -vvv -o dump.log && echo OK runs successfully, but criu restore -d -vvvv -o restore.log && echo OK fails:

pie: vdso: DT_STRTAB: 0x1f8
pie: vdso: DT_SYMTAB: 0x150
pie: Remap 0x7fa7316000->0x7f926ec000 len 0x4000
pie: vdso: DT_STRSZ: 0x77
pie: vdso: DT_SYMENT: 0x18
pie: Remap 0x7fa731a000->0x7f926f0000 len 0x2000
pie: vdso: nbucket 0x3 nchain 0x7 bucket 0x7f89815128 chain 0x7f898151>
pie: 34
pie: Remap 0x7fa731c000->0x7f926f2000 len 0x4000
pie: Remap 0x7fa7320000->0x7f926f6000 len 0x1c000
pie: Remap 0x7fa733c000->0x7f92712000 len 0x3000
pie: Remap 0x7fa733f000->0x7f9271e000 len 0x2000
pie: Remap 0x7fa7341000->0x7f92720000 len 0x1000
pie: Remap 0x7fa7342000->0x7f92721000 len 0x1000
pie: Remap 0x7fa7343000->0x7f92722000 len 0x1000
pie: Remap 0x7fa7344000->0x7f92723000 len 0x2000
pie: Remap 0x7fa7347000->0x7fd0777000 len 0x22000
pie: vdso: Parsing at 0x7f92721000 0x7f92722000
pie: vdso: PT_LOAD p_vaddr: 0x0
pie: vdso: DT_HASH: 0x120
pie: vdso: DT_STRTAB: 0x1f8
pie: vdso: DT_SYMTAB: 0x150
pie: vdso: DT_STRSZ: 0x77
pie: vdso: DT_SYMENT: 0x18
pie: vdso: nbucket 0x3 nchain 0x7 bucket 0x7f92721128 chain 0x7f927211>
pie: 34
(00.299961) Error (cr-restore.c:1407): 11208 killed by signal 11: Segmentation fault
(00.299992) Error (cr-restore.c:2248): Restoring FAILED.

@0x7f454c46
Member

@datawolf, thanks, I will check with qemu on Monday. It seems that for arm64 the problem was the missing link against lib.a (on master) or native.lib.a (on criu-dev),
whereas for arm it's also about the reloc flag. So this
+ifeq ($(filter arm arm64,$(ARCH)),)
should be:
+ifeq ($(ARCH),arm)
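
To make the conditionals discussed here easier to follow, the sketch below mimics how GNU make evaluates `ifeq ($(filter arm arm64,$(ARCH)),)`: `$(filter ...)` keeps only the words of its second argument that match one of the patterns, and `ifeq (...,)` takes the branch when the result is empty. The Python helpers are a simplified model (no `%` wildcards) and their names are illustrative. The dry-run output earlier in the thread builds `obj=arch/aarch64`, which suggests that ARCH is aarch64 on that machine; `x86` is included only for comparison.

```python
def make_filter(patterns: str, text: str) -> str:
    """Simplified $(filter patterns,text): keep words equal to a pattern word."""
    wanted = set(patterns.split())
    return " ".join(word for word in text.split() if word in wanted)


def ifeq_empty(value: str) -> bool:
    """Model of `ifeq (value,)`: true when the value expands to nothing."""
    return value == ""


if __name__ == "__main__":
    for arch in ("arm", "arm64", "aarch64", "x86"):
        taken = ifeq_empty(make_filter("arm arm64", arch))
        state = "taken" if taken else "skipped"
        print(f"ARCH={arch}: ifeq ($(filter arm arm64,$(ARCH)),) branch is {state}")
```

In this model, an ARCH of aarch64 matches neither `arm` nor `arm64`, so the filter result is empty and the `ifeq` branch is taken exactly as it would be for an unrelated architecture.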

@datawolf
Contributor Author

@0x7f454c46 ping, any progress on this bug?

@0x7f454c46
Member

@datawolf I'm on it, will give a reply/patch today.

@0x7f454c46
Member

OK, so the patch I made was fine, except that the ARCH filter should match aarch64, not arm64.
I will resend it now that it's corrected and tested.
It took some time to set up an aarch64 environment, so the restore problem isn't fixed yet.
I also hit another thing that prevents dumping, so I'll take a closer look at all of this tomorrow:

(00.263830) Fetched ack: 13 13 0
(00.264320) 18020 fdinfo 0: pos: 0x               0 flags:           402002/0
(00.265040) Error (files-ext.c:91): Can't dump file 0 of that type [20600] (chr 204:64)
(00.265261) ----------------------------------------
(00.265476) Error (cr-dump.c:1312): Dump files (pid: 18020) failed with -1
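
As a reading aid for the `[20600]` value in the error above, here is a small Python sketch that decodes it with the standard `stat` module. It assumes the bracketed number is the file's `st_mode` printed in octal; the log does not say so explicitly, but the `chr 204:64` part of the same message points to a character device, which fits this reading. The function name is illustrative.

```python
import stat


def describe_mode(octal_text: str) -> str:
    """Decode an st_mode value given as an octal string (assumed format)."""
    mode = int(octal_text, 8)
    kind = "character device" if stat.S_ISCHR(mode) else "not a character device"
    # stat.filemode renders the mode the way `ls -l` does.
    return f"{kind}, {stat.filemode(mode)}"


if __name__ == "__main__":
    print(describe_mode("20600"))
```

Under that assumption, fd 0 of the dumped task is a character device that only its owner can read and write.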

@0x7f454c46
Member

So the problem is that gcc v5.3 creates relative relocs for the vdso_symbols array.
That seems to be an already known problem: https://lists.openvz.org/pipermail/criu/2015-October/022453.html
I'm not really sure how to avoid relocation of this array.
It seems the best solution would be porting compel to aarch64.
Maybe Christopher Covington could help with a workaround before that happens?

@datawolf
Contributor Author

datawolf commented May 4, 2016

@0x7f454c46 I use gcc v4.9.2, and with it the dump works.

ubuntu@ubuntu:~/criu$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/aarch64-linux-gnu/4.9/lto-wrapper
Target: aarch64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.9.2-10ubuntu13' --with-bugurl=file:///usr/share/doc/gcc-4.9/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.9 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.9 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-libsanitizer --disable-libquadmath --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.9-arm64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.9-arm64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.9-arm64 --with-arch-directory=arm64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-multiarch --disable-werror --enable-checking=release --build=aarch64-linux-gnu --host=aarch64-linux-gnu --target=aarch64-linux-gnu
Thread model: posix
gcc version 4.9.2 (Ubuntu/Linaro 4.9.2-10ubuntu13) 

But the restore fails. Could you please help solve the restore failure?

The tail of restore.log is:

pie: Remap 0x7fb7cef000->0x7fb7fb9000 len 0xf000
pie: Remap 0x7fb7b1e000->0x7fb7fb9000 len 0xf000
pie: Remap 0x7fb7bbf000->0x7fb7e89000 len 0x130000
pie: Remap 0x7fb79ee000->0x7fb7e89000 len 0x130000
pie: Remap 0x7fb7865000->0x7fb7d00000 len 0x189000
pie: vdso: Parsing at 0x7fb7ffd000 0x7fb7ffe000
pie: vdso: PT_LOAD p_vaddr: 0x0
pie: vdso: DT_HASH: 0x120
pie: vdso: Parsing at 0x7fb7ffd000 0x7fb7ffe000
pie: vdso: DT_STRTAB: 0x1f8
pie: vdso: PT_LOAD p_vaddr: 0x0
pie: vdso: DT_SYMTAB: 0x150
pie: vdso: DT_HASH: 0x120
pie: vdso: DT_STRSZ: 0x77
pie: vdso: DT_STRTAB: 0x1f8
pie: vdso: DT_SYMENT: 0x18
pie: vdso: DT_SYMTAB: 0x150
pie: vdso: nbucket 0x3 nchain 0x7 bucket 0x7fb7ffd128 chain 0x7fb7ffd1>
pie: vdso: DT_STRSZ: 0x77
pie: 34
pie: vdso: DT_SYMENT: 0x18
pie: vdso: nbucket 0x3 nchain 0x7 bucket 0x7fb7ffd128 chain 0x7fb7ffd1>
pie: 34
(00.300043) Error (cr-restore.c:1407): 19509 killed by signal 11: Segmentation fault
(00.300593) Error (cr-restore.c:2248): Restoring FAILED.

@0x7f454c46
Member

@datawolf, could you test it with the following patch: https://lists.openvz.org/pipermail/criu/2016-May/028247.html

criupatchwork pushed a commit to criupatchwork/criu that referenced this issue May 11, 2016
This seems to be a known problem in util-vdso.c on aarch64 [1].

Now restorer segfaults with the following log:
[ 8107.384817] criu[5135]: unhandled level 3 translation fault (11) at 0x00005b98, esr 0x90000007
[ 8107.385538] pgd = ffffffc038dbc000
[ 8107.386046] [00005b98] *pgd=0000000078d6c003, *pud=0000000078d6c003, *pmd=0000000073c31003, *pte=0000000000000000
[ 8107.391920]
[ 8107.392521] CPU: 0 PID: 5135 Comm: criu Not tainted 4.5.0 #29
[ 8107.392805] Hardware name: linux,dummy-virt (DT)
[ 8107.393140] task: ffffffc039a2a400 ti: ffffffc033c34000 task.ti: ffffffc033c34000
[ 8107.393782] PC is at 0x13514
[ 8107.406332] LR is at 0x1342c
[ 8107.406550] pc : [<0000000000013514>] lr : [<000000000001342c>] pstate: a0000000

This is because gcc for aarch64 adds the vdso_symbols array to the symbol
table, and for that reason it needs run-time relocations to be in place.

How it goes (uninteresting assembly cut):
  0x14104:	adrp	x7, 0x14000	; adrp+add loading of
  0x14114:	add	x0, x7, #0x928	; symbol table's address,
  0x14134:	ldp	x2, x3, [x0]	; loading address of symbol from
					; symbol table
  0x1414c:	stp	x2, x3, [x29,#112] ; saving it on stack on
					; function's begin
  0x14188:	ldr	x2, [x29,#112]	; using symbol's address in code

The symbol for this in the symbol table is:
  [root@alarm cr]# readelf -s criu/pie/restorer.built-in.bin.o | grep 5b98
    248: 0000000000005b98     0 NOTYPE  LOCAL  DEFAULT    1 $d
It can also be seen this way:
  objdump -dS criu/pie/restorer.built-in.bin.o | less
  ...
  0000000000004924 <cur_loglevel>:
      4924:       00000002 00005b98 00000000 00005ed0     .....[.......^..
      4934:       00000000 00005ee8 00000000 00005f00     .....^......._..

So the symbol table holds the unrelocated address of the symbol.
The real address can be seen by adding a print of vdso_symbols[0]:
pie: vdso: vdso_symbols[0] 0x15b98
(in this case gcc for some reason accesses the symbol through
local adrp+add calculations rather than through the stack-saved
pointer to the symbol from the symbol table).

Until we handle relocs properly here, I suggest this
ugly workaround.

Temporary fix for: checkpoint-restore#150

[1]: https://lists.openvz.org/pipermail/criu/2015-October/022453.html

Cc: Wang Long <long.wanglong@huawei.com>
Cc: Christopher Covington <cov@codeaurora.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
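
The numbers in the commit message above can be tied together with a little arithmetic. The sketch below is not the workaround from the patch; it only illustrates what applying a relative relocation to the table would mean. It assumes that the objdump listing shows 32-bit words, that the first word at 0x4924 is cur_loglevel, and that the following word pairs form 64-bit table entries starting at 0x4928. All values are copied from the commit message; the variable and function names are illustrative.

```python
# Link-time offset of the first table entry (0x4924 + 4, see assumptions above).
TABLE_LINK_ADDR = 0x4928
# Run-time address computed by the adrp + add pair in the disassembly.
TABLE_RUNTIME_ADDR = 0x14000 + 0x928
# Unrelocated entries as they appear in the objdump listing.
ENTRIES = [0x5B98, 0x5ED0, 0x5EE8, 0x5F00]


def relocate(entries, base):
    """Apply a relative relocation: add the load base to every stored address."""
    return [base + entry for entry in entries]


# The blob is loaded at run-time address minus link-time offset.
load_base = TABLE_RUNTIME_ADDR - TABLE_LINK_ADDR
relocated = relocate(ENTRIES, load_base)

if __name__ == "__main__":
    print("load base:", hex(load_base))
    for raw, fixed in zip(ENTRIES, relocated):
        print(hex(raw), "->", hex(fixed))
```

With these inputs the first entry moves from 0x5b98, the faulting address in the kernel log, to 0x15b98, the value printed for vdso_symbols[0].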
@datawolf
Contributor Author

@0x7f454c46 It works. Thank you.

@0x7f454c46
Member

@datawolf thanks for testing :)

xemul pushed a commit that referenced this issue May 11, 2016
Tested-by: long.wanglong <long.wanglong@huawei.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
xemul closed this as completed May 11, 2016
xemul pushed a commit that referenced this issue May 16, 2016
tkhai pushed a commit to tkhai/criu that referenced this issue May 17, 2016
xemul pushed a commit that referenced this issue May 30, 2016