Error (sk-inet.c:610): Can't bind inet socket: Cannot assign requested address #29

Closed
avagin opened this issue Sep 16, 2015 · 3 comments
avagin commented Sep 16, 2015

(00.206948)    198:     Restoring fd 11 (state -> create)
(00.206952)    198:     Restore: family 10 type 1 proto 6 port 7676 state 1 src_addr 2001:db8:1::feee:cfa2:163e
(00.206958)    198: Restoring TCP connection
(00.206961)    198: Restoring TCP connection id 18 ino 1b683d
(00.206972)    198:     Setting 1 queue seq to 2466973862
(00.206975)    198:     Setting 2 queue seq to 1315864320
(00.206999)    198: Error (sk-inet.c:610): Can't bind inet socket: Cannot assign requested address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
1168: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
    link/ether fe:ee:cf:a2:16:3e brd ff:ff:ff:ff:ff:ff
    inet 172.21.1.167/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 2001:db8:1::feee:cfa2:163e/64 scope global tentative 
       valid_lft forever preferred_lft forever
    inet6 fe80::fcee:cfff:fea2:163e/64 scope link tentative 
       valid_lft forever preferred_lft forever

https://gist.github.com/boucher/7693ed611225ba24f927

Reported-by: @boucher

avagin commented Sep 16, 2015

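2001:db8:1::feee:cfa2:163e/64 is marked as tentative.

New addresses remain in the tentative state while duplicate address detection (DAD) is performed, and bind() to such an address fails with EADDRNOTAVAIL ("Cannot assign requested address").

DAD can be disabled by setting the dad_transmits sysctl (net.ipv6.conf.<interface>.dad_transmits) to 0.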
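
A rough way to check for the tentative state from code, rather than by reading `ip addr` output, is to look at the flags column of /proc/net/if_inet6. The sketch below is only an illustration: the function names are made up for this example and are not CRIU code, and the flag value is copied from <linux/if_addr.h> so the snippet builds without kernel headers.

```c
#include <stdio.h>
#include <string.h>

/* IFA_F_TENTATIVE from <linux/if_addr.h>, redefined locally for the sketch. */
#define SKETCH_IFA_F_TENTATIVE 0x40u

/*
 * Parse one line of /proc/net/if_inet6:
 *   <32 hex digits of address> <ifindex> <prefix len> <scope> <flags> <name>
 * All numeric fields are hexadecimal.
 * Returns 1 if the address is tentative, 0 if not, -1 on a parse error.
 */
int if_inet6_line_is_tentative(const char *line)
{
	char addr[33], name[32];
	unsigned int ifindex, plen, scope, flags;

	if (sscanf(line, "%32s %x %x %x %x %31s",
		   addr, &ifindex, &plen, &scope, &flags, name) != 6)
		return -1;
	if (strlen(addr) != 32)
		return -1;
	return (flags & SKETCH_IFA_F_TENTATIVE) ? 1 : 0;
}

/*
 * Look up an address (32 lowercase hex digits, no colons) in
 * /proc/net/if_inet6. Returns 1 if it is tentative, 0 if it is present and
 * not tentative, -1 if it is not listed or the file cannot be read.
 */
int addr_is_tentative(const char *hexaddr)
{
	char line[128];
	int ret = -1;
	FILE *f = fopen("/proc/net/if_inet6", "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		if (strncmp(line, hexaddr, 32) == 0) {
			ret = if_inet6_line_is_tentative(line);
			break;
		}
	}
	fclose(f);
	return ret;
}
```

For the address in this report the lookup key would be 20010db8000100000000feeecfa2163e. A restore tool could poll this until the flag clears, at the cost of delaying the restore for as long as DAD takes.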

xemul added the bug label Sep 16, 2015
avagin commented Oct 28, 2015

We can set the IP_FREEBIND option on IPv6 sockets.
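
A minimal sketch of that idea, assuming a plain TCP socket and Linux. ip(7) describes IP_FREEBIND as allowing a bind to an address that is non-local or does not exist yet, so setting it before bind() should avoid the "Cannot assign requested address" error for a tentative address. The helper name is hypothetical and this is not the code from the eventual fix.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IP_FREEBIND
#define IP_FREEBIND 15	/* value from <linux/in.h> */
#endif

/*
 * Create a TCP/IPv6 socket, set IP_FREEBIND on it and bind it to addr:port.
 * Returns the bound descriptor, or -1 on failure. An unparsable address is
 * rejected before any socket is created.
 */
int bind_v6_freebind(const char *addr, uint16_t port)
{
	struct sockaddr_in6 sa;
	int yes = 1;
	int fd;

	memset(&sa, 0, sizeof(sa));
	sa.sin6_family = AF_INET6;
	sa.sin6_port = htons(port);
	if (inet_pton(AF_INET6, addr, &sa.sin6_addr) != 1)
		return -1;

	fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP);
	if (fd < 0)
		return -1;

	/* Ask the kernel not to require that the address is usable locally. */
	if (setsockopt(fd, IPPROTO_IP, IP_FREEBIND, &yes, sizeof(yes)) < 0 ||
	    bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

With the values from the log above, the call would be bind_v6_freebind("2001:db8:1::feee:cfa2:163e", 7676). The option is set at the IP level even though the socket is AF_INET6, which is what the comment proposes.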

xemul commented Feb 8, 2016

73a739b

xemul closed this as completed Feb 8, 2016
0x7f454c46 added a commit to 0x7f454c46/criu that referenced this issue Apr 28, 2016
The array `const char *vdso_symbols[VDSO_SYMBOL_MAX]`, local to the function
`parse_elf_symbols`, is dereferenced.
On the first access the address of the array's first element is 0x15c68
and the value read is correct. But on the second and third dereferences
the same first element has the address 0x5c68.

It crashes with a segmentation fault:
pie: vdso: parse_elf_symbols: array @0x1f180 array[0] @0x15c68 is __ke>
pie: rnel_clock_getres
pie: vdso: parse_elf_symbols: array[0] is 0x5c68
[162792.339164] criu[32308]: unhandled level 3 translation fault (11) at 0x00005c68, esr 0x90000007
[162792.339555] pgd = ffffffc038f5e000
[162792.339667] [00005c68] *pgd=0000000075827003, *pud=0000000075827003, *pmd=0000000078fb6003, *pte=0000000000000000
[162792.340056]
[162792.340219] CPU: 0 PID: 32308 Comm: criu Not tainted 4.5.0 #29
[162792.340328] Hardware name: linux,dummy-virt (DT)
[162792.340448] task: ffffffc039a28000 ti: ffffffc0392e8000 task.ti: ffffffc0392e8000
[162792.340649] PC is at 0x13504
[162792.340708] LR is at 0x1341c
[162792.340765] pc : [<0000000000013504>] lr : [<000000000001341c>] pstate: a0000000
[162792.340876] sp : 000000000001eed0
[162792.340944] x29: 000000000001eed0 x28: 0000000000005c68
[162792.341068] x27: 0000007f9bfba040 x26: 000000000001ef40
[162792.341161] x25: 6666666666666667 x24: 00000000000159dc
[162792.341251] x23: 000000000001ef48 x22: 00000000fffffff0
[162792.341342] x21: 0000000000015ca5 x20: 000000000001f0d0
[162792.341435] x19: 000000000001efc8 x18: 00000000004f4838
[162792.341521] x17: 0000007f931ad920 x16: 0000000000000000
[162792.341613] x15: 0011a8fe65dd29e4 x14: 0000000000000000
[162792.341709] x13: 00000001f4000000 x12: 0000000000000017
[162792.341798] x11: 0000000000090abf x10: 00000000572217f0
[162792.341951] x9 : 000000000001eec3 x8 : 0000000000000000
[162792.342039] x7 : 000000000001eeb8 x6 : 000000000001eebc
[162792.342127] x5 : 0000000000000030 x4 : 0000000000000040
[162792.342214] x3 : 000000000001f008 x2 : 0000000000000030
[162792.342300] x1 : 00000000fffffff0 x0 : 000000000001f0b8

The assembly that loads the vdso_symbols[0] argument for the first print (into x5):
    4150:       b0000005        adrp    x5, 5000 <cur_loglevel+0x63c>
    4154:       b0000014        adrp    x20, 5000 <cur_loglevel+0x63c>
    4158:       b0000016        adrp    x22, 5000 <cur_loglevel+0x63c>
    415c:       9131a0a5        add     x5, x5, #0xc68

For the second print it is:
    418c:       f9405ba3        ldr     x3, [x29,#176]

And for the third:
    41a8:       f9405ba5        ldr     x5, [x29,#176]

How to reproduce:
0. make
1. sleep 10000
2. ./criu/criu dump -vvvv --shell-job -t <pid-of-sleep-10000>
3. ./criu/criu restore -vvvv --shell-job

Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
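
The two addresses in the log above, 0x15c68 and 0x5c68, differ by exactly 0x10000. That is consistent with the adrp+add path computing the run-time address while the stack copy still holds the link-time value. Treating 0x10000 as the blob's load offset is an inference from those numbers, not something the commit states. A relative relocation would close the gap by adding the load base to each stored pointer, roughly like this (the function name is made up for the illustration):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Add the load base to every entry of a table of link-time addresses,
 * which is what a relative relocation does for each pointer slot.
 */
void relocate_table(uintptr_t *table, size_t n, uintptr_t load_base)
{
	size_t i;

	for (i = 0; i < n; i++)
		table[i] += load_base;
}
```

Applied to a slot holding 0x5c68 with a base of 0x10000, this yields 0x15c68, the address seen on the first (working) access.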
criupatchwork pushed a commit to criupatchwork/criu that referenced this issue May 11, 2016
This seems to be a known problem in util-vdso.c on aarch64 [1].

The restorer now segfaults with the following log:
[ 8107.384817] criu[5135]: unhandled level 3 translation fault (11) at 0x00005b98, esr 0x90000007
[ 8107.385538] pgd = ffffffc038dbc000
[ 8107.386046] [00005b98] *pgd=0000000078d6c003, *pud=0000000078d6c003, *pmd=0000000073c31003, *pte=0000000000000000
[ 8107.391920]
[ 8107.392521] CPU: 0 PID: 5135 Comm: criu Not tainted 4.5.0 #29
[ 8107.392805] Hardware name: linux,dummy-virt (DT)
[ 8107.393140] task: ffffffc039a2a400 ti: ffffffc033c34000 task.ti: ffffffc033c34000
[ 8107.393782] PC is at 0x13514
[ 8107.406332] LR is at 0x1342c
[ 8107.406550] pc : [<0000000000013514>] lr : [<000000000001342c>] pstate: a0000000

This is because gcc for aarch64 adds the vdso_symbols array to the symbol
table, and for that reason it needs run-time relocations to be applied.

How it goes (uninteresting assembly omitted):
  0x14104:	adrp	x7, 0x14000	; adrp+add loading of
  0x14114:	add	x0, x7, #0x928	; symbol table's address,
  0x14134:	ldp	x2, x3, [x0]	; loading address of symbol from
					; symbol table
  0x1414c:	stp	x2, x3, [x29,#112] ; saving it on stack at
					; function's beginning
  0x14188:	ldr	x2, [x29,#112]	; using symbol's address in code

The symbol for this in the symbol table is:
  [root@alarm cr]# readelf -s criu/pie/restorer.built-in.bin.o | grep 5b98
    248: 0000000000005b98     0 NOTYPE  LOCAL  DEFAULT    1 $d
It can also be seen this way:
  objdump -dS criu/pie/restorer.built-in.bin.o | less
  ...
  0000000000004924 <cur_loglevel>:
      4924:       00000002 00005b98 00000000 00005ed0     .....[.......^..
      4934:       00000000 00005ee8 00000000 00005f00     .....^......._..

So the symbol table holds the unrelocated address of the symbol.
The real address can be seen by adding a print of vdso_symbols[0]:
pie: vdso: vdso_symbols[0] 0x15b98
(in this case gcc for some reason accesses the symbol through
local adrp+add calculations rather than through the stack-saved
pointer taken from the symbol table).

Until relocations are handled properly here, I suggest this
ugly workaround.

Temporary fix for: #150

[1]: https://lists.openvz.org/pipermail/criu/2015-October/022453.html

Cc: Wang Long <long.wanglong@huawei.com>
Cc: Christopher Covington <cov@codeaurora.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
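
One general way to sidestep this class of problem is to keep pointers out of the initializer altogether. The sketch below is an illustration of that technique, not the workaround this commit applies. It stores the names in a two-dimensional char array, so the data contains only characters and nothing that needs relocating. `__kernel_clock_getres` appears in the log earlier in this thread and the other names are the usual aarch64 vDSO symbols. The table, its size and the lookup helper are hypothetical, and `strcmp` stands in for whatever local helper a libc-free blob would use.

```c
#include <string.h>

#define SKETCH_NAME_LEN 32

/*
 * A two-dimensional char array stores the characters themselves rather than
 * pointers to them, so the initializer contains no addresses that would
 * need a relocation.
 */
static const char sketch_vdso_names[][SKETCH_NAME_LEN] = {
	"__kernel_clock_getres",
	"__kernel_clock_gettime",
	"__kernel_gettimeofday",
	"__kernel_rt_sigreturn",
};

#define SKETCH_NR_NAMES \
	(sizeof(sketch_vdso_names) / sizeof(sketch_vdso_names[0]))

/* Return the index of name in the table, or -1 if it is not listed. */
int sketch_vdso_name_index(const char *name)
{
	size_t i;

	for (i = 0; i < SKETCH_NR_NAMES; i++)
		if (strcmp(sketch_vdso_names[i], name) == 0)
			return (int)i;
	return -1;
}
```

The trade-off is wasted space, since every row is padded to SKETCH_NAME_LEN bytes, in exchange for data that reads the same at any load address.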
xemul pushed a commit that referenced this issue May 11, 2016
Tested-by: long.wanglong <long.wanglong@huawei.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
xemul pushed a commit that referenced this issue May 16, 2016
tkhai pushed a commit to tkhai/criu that referenced this issue May 17, 2016
xemul pushed a commit that referenced this issue May 30, 2016