docker checkpoint failed #6

Closed
kimh opened this issue Jul 15, 2015 · 53 comments

kimh commented Jul 15, 2015

I'm trying to get docker checkpoint to work, but I get the following error.

Here is the docker client log.

vagrant@vagrant-ubuntu-trusty:~/docker$ export CID=$(docker run -d busybox:latest /bin/sh -c 'i=0; while true; do echo $i >> /foo; i=$(expr $i + 1); sleep 3; done')

vagrant@vagrant-ubuntu-trusty:~/docker$ docker ps
CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS              PORTS               NAMES
806b0fa36744        busybox:latest      "/bin/sh -c 'i=0; wh   3 seconds ago       Up 1 seconds                            determined_visvesvaraya

vagrant@vagrant-ubuntu-trusty:~/docker$ docker checkpoint $CID
Error response from daemon: Cannot checkpoint container 806b0fa36744339b60e41727c1c7f4aa65e30cd3ab86a63206cae9c4854b6fbb: criu failed: type NOTIFY errno 0

Here is the docker daemon log.

INFO[0840] POST /v1.20/containers/806b0fa36744339b60e41727c1c7f4aa65e30cd3ab86a63206cae9c4854b6fbb/checkpoint
ERRO[0840] Handler for POST /containers/{name:.*}/checkpoint returned error: Cannot checkpoint container 806b0fa36744339b60e41727c1c7f4aa65e30cd3ab86a63206cae9c4854b6fbb: criu failed: type NOTIFY errno 0
ERRO[0840] HTTP Error                                    err=Cannot checkpoint container 806b0fa36744339b60e41727c1c7f4aa65e30cd3ab86a63206cae9c4854b6fbb: criu failed: type NOTIFY errno 0 statusCode=500

Here is my environment.

Ubuntu: 14.04 (vagrant)

criu

vagrant@vagrant-ubuntu-trusty:~/docker$ criu --version
Version: 1.6
GitID: v1.6-95-gde70936

docker boucher/docker@dd06ea0

vagrant@vagrant-ubuntu-trusty:~/docker$ docker -v
Docker version 1.8.0-dev, build dd06ea0

Please let me know if you need more information.

kimh changed the title from "Native C/R failed with docker checkpoint" to "docker checkpoint failed" on Jul 15, 2015

xemul (Member) commented Jul 15, 2015

Can you find the dump logs?
Also adding @boucher to the discussion :)

boucher commented Jul 15, 2015

Hey @kimh, there's currently a regression in Docker that's causing the problem. Reverting this commit should work as a temporary fix: f18fb5b3efd59d54c00d4e1b1d4b88c4b21e96be

kimh commented Jul 15, 2015

@boucher OK, reverted and now compiling... It's a totally different discussion, but is there a good way to compile docker fast? Using make creates the docker image from scratch every time :(

boucher commented Jul 15, 2015

When you build with make, docker should be using its cache for everything
but the final step. My builds usually take less than a minute.

kimh commented Jul 15, 2015

@boucher It looks like docker checkpoint is no longer available.

vagrant@vagrant-ubuntu-trusty:~/docker$ docker -v
Docker version 1.8.0-dev, build f18fb5b
vagrant@vagrant-ubuntu-trusty:~/docker$ docker checkpoint
docker: 'checkpoint' is not a docker command.
See 'docker --help'.

Do I have to use different versions of the docker binaries (daemon from the reverted version and client from the latest)?

boucher commented Jul 15, 2015

Are you sure you applied the revert to the right branch?

kimh commented Jul 15, 2015

Doesn't it look that way?

vagrant@vagrant-ubuntu-trusty:~/docker$ docker -v
Docker version 1.8.0-dev, build f18fb5b

kimh commented Jul 15, 2015

Oh, I think I misunderstood what you said. So, to reiterate just in case: I have to revert commit f18fb5b3efd59d54c00d4e1b1d4b88c4b21e96be?

boucher commented Jul 15, 2015

Yes, from the cr-combined branch.
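
For readers following along, the step confirmed here is a revert applied on top of cr-combined, as opposed to checking out or resetting to that commit. A minimal sketch, assuming a local clone of boucher/docker and that the commit is an ordinary (non-merge) commit; the function name and the GIT override variable are illustrative only and not part of any tool:

```shell
# Sketch: revert the regression commit on top of the cr-combined branch.
# Assumes $1 is a local clone of boucher/docker. GIT is a hypothetical
# override so the commands can be printed (GIT=echo) instead of executed.
revert_regression() {
    (
        cd "$1" || exit 1
        "${GIT:-git}" checkout cr-combined &&
        "${GIT:-git}" revert --no-edit f18fb5b3efd59d54c00d4e1b1d4b88c4b21e96be
    )
}

# Usage (not run here): revert_regression ~/docker, then rebuild with make.
```

After a revert, the build hash reported by docker -v would be that of the new revert commit rather than f18fb5b.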

kimh commented Jul 15, 2015

@boucher Now checkpoint works. However, restore doesn't.

docker restore db490e9a820014dca0c6e960b28381db7bc0dcf5778fcb6e68bc855fb96465c0
ERRO[0045] Error restoring container: criu failed: type NOTIFY errno 0, exitCode={-1 %!d(bool=false)}
ERRO[0045] Handler for POST /containers/{name:.*}/restore returned error: Cannot restore container db490e9a820014dca0c6e960b28381db7bc0dcf5778fcb6e68bc855fb96465c0: criu failed: type NOTIFY errno 0
ERRO[0045] HTTP Error                                    err=Cannot restore container db490e9a820014dca0c6e960b28381db7bc0dcf5778fcb6e68bc855fb96465c0: criu failed: type NOTIFY errno 0 statusCode=500
INFO[0050] GET /v1.20/containers/json

Any idea?

boucher commented Jul 15, 2015

Can you pass --work-dir=/tmp/something to the restore and then paste the
criu restore log somewhere?

kimh commented Jul 15, 2015

Is restore.log supposed to be created under /tmp/something?

docker restore $CID  --work-dir=/tmp/something
Error response from daemon: Cannot restore container e50472b26a0ce9e1369d0f76e5b25d11e68979709691b7dc66cf305cd18670df: a container has already joined the endpoint
Error response from daemon: no such id: --work-dir=/tmp/something
Error: failed to restore one or more containers

Looks like nothing is created.

vagrant@vagrant-ubuntu-trusty:/var$ ls /tmp/something
ls: cannot access /tmp/something: No such file or directory

Even if I first create the /tmp/something directory, nothing is created in it.

boucher commented Jul 15, 2015

You must pass --work-dir before the $CID
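
In the failed attempt, --work-dir came after the container ID, so the client treated the option as a second container name (hence "no such id: --work-dir=/tmp/something"). A minimal sketch of the ordering being asked for; the function name and the DOCKER override are illustrative, and creating the directory beforehand is a precaution rather than something the thread confirms is required:

```shell
# Sketch: call the experimental `docker restore` with the option placed
# before the container ID. DOCKER is a hypothetical override (DOCKER=echo
# prints the command instead of running it).
restore_with_workdir() {
    cid=$1
    workdir=$2
    mkdir -p "$workdir" || return 1
    "${DOCKER:-docker}" restore "--work-dir=$workdir" "$cid"
}

# Usage (not run here): restore_with_workdir "$CID" /tmp/something
# The thread suggests CRIU's restore.log should then appear in that directory.
```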

kimh commented Jul 15, 2015

Here is restore.log

https://gist.github.com/kimh/1077ad2a8e7aa58f3eed

boucher commented Jul 15, 2015

Did you build criu from source? I believe there's a bug in the master branch that @xemul is working on (and may already have a patch for?)

kimh commented Jul 15, 2015

Yes, I built from source. My HEAD is xemul@de70936

I live in JST, so I'll go to bed now. @xemul can you let me know if there is anything that I can try?

Thank you for your quick help @boucher

boucher commented Jul 15, 2015

opencontainers/runc#130

I believe this is tracking the fix in runc, but it will then need to be merged into docker master as well.

@YaoZengzeng

Hi @boucher, what do you mean by "Reverting this commit should work as a temporary fix: f18fb5b3efd59d54c00d4e1b1d4b88c4b21e96be"?
I'm on the cr-combined branch and ran "git reset --hard f18fb5b3efd59d54c00d4e1b1d4b88c4b21e96be", then compiled, but the binary doesn't seem to support checkpoint and restore.
root@iZu1pbgsilbZ:~# docker version
Client:
Version: 1.8.0-dev
API version: 1.20
Go version: go1.4.2
Git commit: f18fb5b
Built: Sun Jul 19 03:16:51 UTC 2015
OS/Arch: linux/amd64

Server:
Version: 1.8.0-dev
API version: 1.20
Go version: go1.4.2
Git commit: f18fb5b
Built: Sun Jul 19 03:16:51 UTC 2015
OS/Arch: linux/amd64
My problem is the same as @kimh's.

boucher commented Jul 19, 2015

Sorry, as of yesterday it's now behind the experimental flag. You need to
build with DOCKER_EXPERIMENTAL=1

Also the fix has been incorporated so reverting that commit is no longer
necessary. However, there is a new blocker that has not yet been figured
out.
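
One way to confirm that a build picked up the experimental flag is to look for the checkpoint command in the client's help output. A minimal sketch; the make invocation is the one quoted in this thread, while the helper names and the way the binary is passed in are assumptions:

```shell
# Sketch: build with the experimental flag, then check the help output.
build_experimental() {
    # Run from the root of the docker source tree.
    make DOCKER_EXPERIMENTAL=1 binary
}

# $1 is the docker client to inspect (a path or a command name).
has_checkpoint_command() {
    "$1" --help 2>&1 | grep -q 'checkpoint'
}

# Usage (not run here):
#   build_experimental
#   has_checkpoint_command /path/to/built/docker && echo "checkpoint available"
```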

@YaoZengzeng

@boucher I compiled like this:
"root@iZu1pbgsilbZ:/home/monster/docker# make DOCKER_EXPERIMENTAL=1 binary",
and the resulting binary supports the checkpoint and restore commands. But checkpoint still doesn't work:

root@iZu1pbgsilbZ:/home/monster/docker# docker checkpoint $CID
Error response from daemon: Cannot checkpoint container be9da3d14c883491fee1d73a727b88d9d52e1db82996f9110080a544ade94b24: criu failed: type NOTIFY errno 0
Error: failed to checkpoint one or more containers

boucher commented Jul 19, 2015

Yes, things are currently broken. You can follow this thread on the mailing
list:
http://lists.openvz.org/pipermail/criu/2015-July/021277.html

@YaoZengzeng

Hi @boucher, I find that docker 1.8.0 does not create dockerinit-1.8.0-dev under /var/lib/docker/init/ or anywhere else. I don't know if it is a bug or something else... Because of this, I can't even do the external C/R test in docker 1.8.0; docker_cr.sh reports that dockerinit-1.8.0-dev is missing.

xemul (Member) commented Jul 22, 2015

@kimh the error you see in the logs should be fixed in the recent head. The fixing commit is 7b20f42

kimh commented Jul 22, 2015

@xemul thank you for the ping. I'll try later and let you know how it goes.

kimh commented Jul 26, 2015

@xemul I tried 7b20f42 and restore worked once. Then I updated docker to use boucher/docker@24e477b and now checkpoint doesn't work... (DOCKER_EXPERIMENTAL is set).

The head for CRIU is xemul@ace699d

Here is the log for the failed checkpoint, which is useless.

vagrant@vagrant-ubuntu-trusty:~/criu$ docker checkpoint $CID
Error response from daemon: Cannot checkpoint container 94dd8db31db80dc3cc61c44cab05b6018595c0b48144350b82e191897177bae4: criu failed: type NOTIFY errno 0
Error: failed to checkpoint one or more containers

How can I get the criu dump log?

xemul (Member) commented Jul 27, 2015

OK :) Now we need @boucher's help with the Docker part.

boucher commented Jul 27, 2015

The dump log should be at /var/run/docker/execdriver/native/<container_id>/criu.work/dump.log
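
A small helper can spell out that path for a given container and pull out the failing lines. This is a sketch only: the function names are made up for illustration, it assumes the full container ID is the one used in the path, and it assumes CRIU marks failures with the word "Error" in its log lines:

```shell
# Sketch: print the dump log location named above for a container ID.
criu_dump_log() {
    printf '/var/run/docker/execdriver/native/%s/criu.work/dump.log\n' "$1"
}

# List the lines of a CRIU log that contain "Error", with line numbers.
# Reading files under /var/run/docker usually requires root.
show_criu_errors() {
    grep -n 'Error' "$1"
}

# Usage (not run here): show_criu_errors "$(criu_dump_log "$CID")"
```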

kimh commented Jul 27, 2015

@boucher here is the dump.log

boucher commented Jul 27, 2015

This is the same issue as before, in Docker. The fix has not yet made its way from libcontainer to docker. You can update libcontainer yourself for now, or you can try reverting the same commit from the top of this thread.

xemul pushed a commit that referenced this issue Sep 10, 2015
If you call clone directly you are responsible for setting up the TLS area yourself.

$ abrt-cli ls  | grep different_creds | wc -l
39
$ gdb -c /var/spool/abrt/ccpp-2015-07-24-10\:21\:14-8014/coredump  different_creds
 Core was generated by `./different_creds --pidfile=different_creds.pid --outfile=different_creds.out'.
 Program terminated with signal SIGILL, Illegal instruction.
 #0  0x00007f86e2d8c7d9 in _dl_x86_64_restore_sse () from /lib64/ld-linux-x86-64.so.2
 Missing separate debuginfos, use: dnf debuginfo-install glibc-2.21-7.fc22.x86_64 libattr-2.4.47-9.fc22.x86_64 libcap-2.24-7.fc22.x86_64
 (gdb) bt
 #0  0x00007f86e2d8c7d9 in _dl_x86_64_restore_sse () from /lib64/ld-linux-x86-64.so.2
 #1  0x00007f86e2d84add in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
 #2  0x00007f86e2d8bbc0 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
 #3  0x0000000000402da3 in sys_futex (val3=0, uaddr2=0x0, timeout=0x0, val=0, op=0, uaddr=0x6063f0 <sig_received>) at lock.h:29
 #4  futex_wait_while (f=0x6063f0 <sig_received>, v=0) at lock.h:121
 #5  test_waitsig () at test.c:367
 #6  0x0000000000401c4b in main (argc=<optimized out>, argv=0x7ffce16432f8) at different_creds.c:82

Reported-by: Mr Jenkins
Cc: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
avagin referenced this issue in avagin/criu Dec 4, 2015
It's used to restore bind-mounts. For example, we cut the common
part of bind-mounts:

Core was generated by `criu restore -vvvv --file-locks --tcp-established --evasive-devices --manage-cg'.
Program terminated with signal 11, Segmentation fault.
741                     BUG_ON(target_root[tok] == '\0');
(gdb) bt
 #0  0x000000000045eef2 in cut_root_for_bind (target_root=0x1e00f20 "/", source_root=0x1e04910 "/vzt/del/vzctl-rm-me.X99UVU8/.criu.cgyard.D5Dfcv/zdtmtst/") at mount.c:741
 #1  0x000000000045f594 in do_bind_mount (mi=mi@entry=0x1e00dd0) at mount.c:2035
 #2  0x000000000045fd02 in do_mount_one (mi=0x1e00dd0) at mount.c:2191
 #3  0x000000000046241f in mnt_tree_for_each (fn=0x45fc80 <do_mount_one>, start=0x1e044d0) at mount.c:1759
 #4  populate_mnt_ns () at mount.c:2729
 #5  prepare_mnt_ns () at mount.c:2843
 #6  0x000000000045a3c3 in prepare_namespace (item=0x7fe10b9ce050, clone_flags=2080505856) at namespaces.c:1311
 #7  0x000000000043383e in restore_task_with_children (_arg=0x7ffd0f7faae0) at cr-restore.c:1535
 #8  0x00007fe10acb41ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

https://jira.sw.ru/browse/PSBM-41932

Reported-by: Virtuozzo QA Team
criupatchwork pushed a commit to criupatchwork/criu that referenced this issue Jun 2, 2016
It can be dead-locked:
 #0  0x00007fafbf49f6ac in __lll_lock_wait_private () from /lib64/libc.so.6
 #1  0x00007fafbf44af1c in _L_lock_2460 () from /lib64/libc.so.6
 #2  0x00007fafbf44ad57 in __tz_convert () from /lib64/libc.so.6
 #3  0x00000000004022e2 in test_msg (format=0x404508 "Receive signal %d\n") at msg.c:51
 #4  <signal handler called>
 #5  0x00007fafbf3f2483 in __GI__IO_vfscanf () from /lib64/libc.so.6
 #6  0x00007fafbf408f27 in vsscanf () from /lib64/libc.so.6
 #7  0x00007fafbf4032f7 in sscanf () from /lib64/libc.so.6
 #8  0x00007fafbf449ba6 in __tzset_parse_tz () from /lib64/libc.so.6
 #9  0x00007fafbf44c4cb in __tzfile_compute () from /lib64/libc.so.6
 #10 0x00007fafbf44ae17 in __tz_convert () from /lib64/libc.so.6
 #11 0x00000000004022e2 in test_msg (format=format@entry=0x40458c "PASS\n") at msg.c:51
 #12 0x0000000000401ceb in main (argc=<optimized out>, argv=<optimized out>) at ptrace_sig.c:172

https://jira.sw.ru/browse/PSBM-47772

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
xemul pushed a commit that referenced this issue Jun 7, 2016
It can be dead-locked:
 #0  0x00007fafbf49f6ac in __lll_lock_wait_private () from /lib64/libc.so.6
 #1  0x00007fafbf44af1c in _L_lock_2460 () from /lib64/libc.so.6
 #2  0x00007fafbf44ad57 in __tz_convert () from /lib64/libc.so.6
 #3  0x00000000004022e2 in test_msg (format=0x404508 "Receive signal %d\n") at msg.c:51
 #4  <signal handler called>
 #5  0x00007fafbf3f2483 in __GI__IO_vfscanf () from /lib64/libc.so.6
 #6  0x00007fafbf408f27 in vsscanf () from /lib64/libc.so.6
 #7  0x00007fafbf4032f7 in sscanf () from /lib64/libc.so.6
 #8  0x00007fafbf449ba6 in __tzset_parse_tz () from /lib64/libc.so.6
 #9  0x00007fafbf44c4cb in __tzfile_compute () from /lib64/libc.so.6
 #10 0x00007fafbf44ae17 in __tz_convert () from /lib64/libc.so.6
 #11 0x00000000004022e2 in test_msg (format=format@entry=0x40458c "PASS\n") at msg.c:51
 #12 0x0000000000401ceb in main (argc=<optimized out>, argv=<optimized out>) at ptrace_sig.c:172

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
@Vallari-Mehta

I am trying to build docker from the following source: https://github.com/boucher/docker.git, using the cr-combined branch, but I do not see the docker checkpoint option.

I am using the command make DOCKER_EXPERIMENTAL=1 build to build docker,
and then make DOCKER_EXPERIMENTAL=1 binary to compile.

After compiling, when I run the compiled binary with ./docker --help,
I do not see any checkpoint option.
I also tried ./docker checkpoint, but it says "docker: 'checkpoint' is not a docker command."

Please let me know if you need any information.

xemul (Member) commented Jun 9, 2016

Please, open another issue if you have problems. This one is closed already.

xemul pushed a commit that referenced this issue Jun 14, 2016
xemul pushed a commit that referenced this issue Jun 28, 2016
criupatchwork pushed a commit to criupatchwork/criu that referenced this issue Sep 7, 2016
phys_stat_resolve() calls mount_resolve_path(), which requires that mntinfo_tree
in the ns_id struct is initialized. This is a problem we observed with sockets
on btrfs volumes:

 Program received signal SIGSEGV, Segmentation fault.
 0x00005555555bb6dd in mount_resolve_path (mntinfo_tree=<optimized out>, path=0x555555875790 "/var/lib/lxd/unix.socket") at criu/mount.c:213
 213     criu/mount.c: No such file or directory.
 (gdb) bt
 #0  0x00005555555bb6dd in mount_resolve_path (mntinfo_tree=<optimized out>, path=0x555555875790 "/var/lib/lxd/unix.socket") at criu/mount.c:213
 #1  0x00005555555be240 in phys_stat_resolve_dev (ns=<optimized out>, st_dev=43, path=<optimized out>) at criu/mount.c:240
 #2  0x00005555555be2bb in phys_stat_dev_match (st_dev=<optimized out>, phys_dev=41, ns=ns@entry=0x5555558753a0,
     path=path@entry=0x555555875790 "/var/lib/lxd/unix.socket") at criu/mount.c:256
 #3  0x00005555555e75ed in unix_process_name (d=d@entry=0x5555558756e0, tb=tb@entry=0x7fffffffe0c0, m=<optimized out>) at criu/sk-unix.c:565
 #4  0x00005555555e9378 in unix_collect_one (tb=0x7fffffffe0c0, m=0x555555869f18 <buf+312>) at criu/sk-unix.c:620
 #5  unix_receive_one (h=0x555555869f08 <buf+296>, arg=<optimized out>) at criu/sk-unix.c:692
 #6  0x00005555555b85aa in nlmsg_receive (buf=<optimized out>, arg=<optimized out>, err_cb=<optimized out>, cb=<optimized out>, len=<optimized out>)
     at criu/libnetlink.c:45
 #7  do_rtnl_req (nl=nl@entry=5, req=req@entry=0x7fffffffe220, size=size@entry=72, receive_callback=0x5555555e9290 <unix_receive_one>,
     error_callback=0x5555555b83d0 <rtnl_return_err>, error_callback@entry=0x0, arg=arg@entry=0x0) at criu/libnetlink.c:119
 #8  0x00005555555e9cf7 in do_collect_req (nl=nl@entry=5, req=req@entry=0x7fffffffe220, receive_callback=<optimized out>, arg=arg@entry=0x0, size=72)
     at criu/sockets.c:610
 #9  0x00005555555eb1d0 in collect_sockets (ns=ns@entry=0x7fffffffe300) at criu/sockets.c:636
 #10 0x000055555559ddfc in check_sock_diag () at criu/cr-check.c:118
 #11 cr_check () at criu/cr-check.c:999
 #12 0x00005555555872d0 in main (argc=<optimized out>, argv=0x7fffffffe678, envp=<optimized out>) at criu/crtools.c:719

Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
criupatchwork pushed a commit to criupatchwork/criu that referenced this issue Oct 25, 2016
A root mount namespace list is used to resolve paths to
unix sockets if they are placed on btrfs.

This patch fixes a crash:
 #0 mount_resolve_path at criu/mount.c:213
 #1 phys_stat_resolve_dev at criu/mount.c:240
 #2 phys_stat_dev_match at criu/mount.c:256
 #3 unix_process_name at criu/sk-unix.c:565
 #4 unix_collect_one at criu/sk-unix.c:620
 #5 unix_receive_one at criu/sk-unix.c:692
 #6 nlmsg_receive at criu/libnetlink.c:45
 #7 do_rtnl_req at criu/libnetlink.c:119
 #8 do_collect_req at criu/sockets.c:610
 #9 collect_sockets at criu/sockets.c:636

https://bugzilla.redhat.com/show_bug.cgi?id=1381351
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
xemul pushed a commit that referenced this issue Oct 26, 2016
A root mount namespace list is used to resolve paths to
unix sockets if they are placed on btrfs.

This patch fixes a crash:
 #0 mount_resolve_path at criu/mount.c:213
 #1 phys_stat_resolve_dev at criu/mount.c:240
 #2 phys_stat_dev_match at criu/mount.c:256
 #3 unix_process_name at criu/sk-unix.c:565
 #4 unix_collect_one at criu/sk-unix.c:620
 #5 unix_receive_one at criu/sk-unix.c:692
 #6 nlmsg_receive at criu/libnetlink.c:45
 #7 do_rtnl_req at criu/libnetlink.c:119
 #8 do_collect_req at criu/sockets.c:610
 #9 collect_sockets at criu/sockets.c:636

travis-ci: success for cr-check: fill up a root task mount namespace
https://bugzilla.redhat.com/show_bug.cgi?id=1381351
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
xemul pushed a commit that referenced this issue Nov 2, 2016
0x7f454c46 pushed a commit to 0x7f454c46/criu that referenced this issue Jan 30, 2017
It can be dead-locked:

 #0  0x00007fafbf49f6ac in __lll_lock_wait_private () from /lib64/libc.so.6
 #1  0x00007fafbf44af1c in _L_lock_2460 () from /lib64/libc.so.6
 #2  0x00007fafbf44ad57 in __tz_convert () from /lib64/libc.so.6
 #3  0x00000000004022e2 in test_msg (format=0x404508 "Receive signal %d\n") at msg.c:51
 #4  <signal handler called>
 #5  0x00007fafbf3f2483 in __GI__IO_vfscanf () from /lib64/libc.so.6
 #6  0x00007fafbf408f27 in vsscanf () from /lib64/libc.so.6
 #7  0x00007fafbf4032f7 in sscanf () from /lib64/libc.so.6
 #8  0x00007fafbf449ba6 in __tzset_parse_tz () from /lib64/libc.so.6
 #9  0x00007fafbf44c4cb in __tzfile_compute () from /lib64/libc.so.6
 #10 0x00007fafbf44ae17 in __tz_convert () from /lib64/libc.so.6
 #11 0x00000000004022e2 in test_msg (format=format@entry=0x40458c "PASS\n") at msg.c:51
 #12 0x0000000000401ceb in main (argc=<optimized out>, argv=<optimized out>) at ptrace_sig.c:172

https://jira.sw.ru/browse/PSBM-47772

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
criupatchwork pushed a commit to criupatchwork/criu that referenced this issue Jan 30, 2017
'info' array is off-by-one, nla_parse_nested() requires destination
array (i.e. 'info') to have maxtype+1 (i.e. IFLA_INFO_MAX+1) elements:

	ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffef823e3f8
	WRITE of size 48 at 0x7ffef823e3f8 thread T0
	    #0 0x7f9ab7a3915b in __asan_memset (/usr/lib/gcc/x86_64-pc-linux-gnu/5.4.0/libasan.so.2+0x8d15b)
	    #1 0x7f9ab6d4e553 in nla_parse (/usr/lib64/libnl-3.so.200+0xa553)
	    #2 0x4acfb7 in dump_one_netdev criu/net.c:445
	    #3 0x4adb60 in dump_one_ethernet criu/net.c:594
	    #4 0x4adb60 in dump_one_link criu/net.c:665
	    #5 0x48af69 in nlmsg_receive criu/libnetlink.c:45
	    #6 0x48af69 in do_rtnl_req criu/libnetlink.c:119
	    #7 0x4b0e86 in dump_links criu/net.c:878
	    #8 0x4b0e86 in dump_net_ns criu/net.c:1651
	    #9 0x4a760d in do_dump_namespaces criu/namespaces.c:985
	    #10 0x4a760d in dump_namespaces criu/namespaces.c:1045
	    #11 0x451ef7 in cr_dump_tasks criu/cr-dump.c:1799
	    #12 0x424588 in main criu/crtools.c:736
	    #13 0x7f9ab67b171f in __libc_start_main (/lib64/libc.so.6+0x2071f)
	    #14 0x4253d8 in _start (/criu/criu/criu+0x4253d8)

	Address 0x7ffef823e3f8 is located in stack of thread T0 at offset 264 in frame
	    #0 0x4ac9ef in dump_one_netdev criu/net.c:364

	  This frame has 5 object(s):
	    [32, 168) 'netdev'
	    [224, 264) 'info' <== Memory access at offset 264 overflows this variable
	    [320, 1040) 'req'
	    [1088, 3368) 'path'
	    [3424, 3625) 'stable_secret'

Increase 'info' size to fix this.

Fixes: b705dcc ("net: pass the struct nlattrs to dump() functions")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
xemul pushed a commit that referenced this issue Jan 31, 2017
'info' array is off-by-one, nla_parse_nested() requires destination
array (i.e. 'info') to have maxtype+1 (i.e. IFLA_INFO_MAX+1) elements:

	ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffef823e3f8
	WRITE of size 48 at 0x7ffef823e3f8 thread T0
	    #0 0x7f9ab7a3915b in __asan_memset (/usr/lib/gcc/x86_64-pc-linux-gnu/5.4.0/libasan.so.2+0x8d15b)
	    #1 0x7f9ab6d4e553 in nla_parse (/usr/lib64/libnl-3.so.200+0xa553)
	    #2 0x4acfb7 in dump_one_netdev criu/net.c:445
	    #3 0x4adb60 in dump_one_ethernet criu/net.c:594
	    #4 0x4adb60 in dump_one_link criu/net.c:665
	    #5 0x48af69 in nlmsg_receive criu/libnetlink.c:45
	    #6 0x48af69 in do_rtnl_req criu/libnetlink.c:119
	    #7 0x4b0e86 in dump_links criu/net.c:878
	    #8 0x4b0e86 in dump_net_ns criu/net.c:1651
	    #9 0x4a760d in do_dump_namespaces criu/namespaces.c:985
	    #10 0x4a760d in dump_namespaces criu/namespaces.c:1045
	    #11 0x451ef7 in cr_dump_tasks criu/cr-dump.c:1799
	    #12 0x424588 in main criu/crtools.c:736
	    #13 0x7f9ab67b171f in __libc_start_main (/lib64/libc.so.6+0x2071f)
	    #14 0x4253d8 in _start (/criu/criu/criu+0x4253d8)

	Address 0x7ffef823e3f8 is located in stack of thread T0 at offset 264 in frame
	    #0 0x4ac9ef in dump_one_netdev criu/net.c:364

	  This frame has 5 object(s):
	    [32, 168) 'netdev'
	    [224, 264) 'info' <== Memory access at offset 264 overflows this variable
	    [320, 1040) 'req'
	    [1088, 3368) 'path'
	    [3424, 3625) 'stable_secret'

Increase 'info' size to fix this.

Fixes: b705dcc ("net: pass the struct nlattrs to dump() functions")
travis-ci: success for net: fix stack out-of-bounds access in dump_one_netdev()
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
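
The off-by-one described in the commit message above can be reproduced in isolation. The sketch below is not CRIU or libnl code: `demo_parse()` is a hypothetical stand-in that only mimics the table-clearing step of an `nla_parse()`-style parser, and `DEMO_INFO_MAX` is set to 5 on the assumption (inferred from the 40-byte 'info' object and the 48-byte write in the report, with 8-byte pointers) that this was the value of `IFLA_INFO_MAX` in that build.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for IFLA_INFO_MAX (assumed value, see the note above). */
#define DEMO_INFO_MAX 5

struct demo_attr {
	int type;
};

/*
 * Mimics how an nla_parse()-style parser prepares its table: attribute
 * types 0..maxtype are all valid indices, so maxtype + 1 slots are cleared.
 * Returns the number of bytes written to tb.
 */
static size_t demo_parse(struct demo_attr **tb, int maxtype)
{
	size_t bytes = sizeof(*tb) * ((size_t)maxtype + 1);

	memset(tb, 0, bytes);
	return bytes;
}

/* Correctly sized caller: returns 1 when the write fits the table exactly. */
static int demo_table_fits(void)
{
	struct demo_attr *info[DEMO_INFO_MAX + 1]; /* maxtype + 1 elements */

	return demo_parse(info, DEMO_INFO_MAX) == sizeof(info);
}
```

Declaring `info[DEMO_INFO_MAX]` instead would leave the table one pointer short of what `demo_parse()` writes, which is the shape of the overflow ASan reported.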
xemul pushed a commit that referenced this issue Feb 1, 2017
xemul pushed a commit that referenced this issue Feb 1, 2017
criupatchwork pushed a commit to criupatchwork/criu that referenced this issue Mar 29, 2017
==30==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60300000e3ca at pc 0x7f34144b6be1 bp 0x7ffee7b6bb20 sp 0x7ffee7b6b298
READ of size 26 at 0x60300000e3ca thread T0
    #0 0x7f34144b6be0  (/lib64/libasan.so.3+0x8dbe0)
    #1 0x7f34144b8e4d in __interceptor_vsnprintf (/lib64/libasan.so.3+0x8fe4d)
    #2 0x4966cb in vprint_on_level criu/log.c:228
    #3 0x496b64 in print_on_level criu/log.c:249
    #4 0x505c94 in collect_one_unixsk criu/sk-unix.c:1401
    #5 0x4e7ae3 in collect_image criu/protobuf.c:213
    #6 0x462c5c in root_prepare_shared criu/cr-restore.c:247
    #7 0x462c5c in restore_task_with_children criu/cr-restore.c:1420
    #8 0x7f34132d70ec in __clone (/lib64/libc.so.6+0x1030ec)

0x60300000e3ca is located 0 bytes to the right of 26-byte region [0x60300000e3b0,0x60300000e3ca)
allocated by thread T0 here:
    #0 0x7f34144efe70 in malloc (/lib64/libasan.so.3+0xc6e70)
    #1 0x7f3413bdb021  (/lib64/libprotobuf-c.so.1+0x6021)

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
criupatchwork pushed a commit to criupatchwork/criu that referenced this issue Mar 30, 2017
In this patch, we replace all zero characters with '@'.

==30==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60300000e3ca at pc 0x7f34144b6be1 bp 0x7ffee7b6bb20 sp 0x7ffee7b6b298
READ of size 26 at 0x60300000e3ca thread T0
    #0 0x7f34144b6be0  (/lib64/libasan.so.3+0x8dbe0)
    #1 0x7f34144b8e4d in __interceptor_vsnprintf (/lib64/libasan.so.3+0x8fe4d)
    #2 0x4966cb in vprint_on_level criu/log.c:228
    #3 0x496b64 in print_on_level criu/log.c:249
    #4 0x505c94 in collect_one_unixsk criu/sk-unix.c:1401
    #5 0x4e7ae3 in collect_image criu/protobuf.c:213
    #6 0x462c5c in root_prepare_shared criu/cr-restore.c:247
    #7 0x462c5c in restore_task_with_children criu/cr-restore.c:1420
    #8 0x7f34132d70ec in __clone (/lib64/libc.so.6+0x1030ec)

0x60300000e3ca is located 0 bytes to the right of 26-byte region [0x60300000e3b0,0x60300000e3ca)
allocated by thread T0 here:
    #0 0x7f34144efe70 in malloc (/lib64/libasan.so.3+0xc6e70)
    #1 0x7f3413bdb021  (/lib64/libprotobuf-c.so.1+0x6021)

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
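
The report above shows `vsnprintf` reading past a 26-byte heap region, which is what happens when a length-delimited name with no terminating NUL is handed to a `%s` format. A minimal sketch of the approach the commit message describes (replacing zero characters with '@') follows. `demo_printable_name()` and its parameters are hypothetical names for illustration and are not taken from the CRIU sources.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy a length-delimited name into out as a NUL-terminated string,
 * replacing embedded zero bytes with '@' so the result is safe to pass
 * to a "%s" format. Truncates if out is too small.
 * Returns the number of name bytes copied.
 */
static size_t demo_printable_name(const char *name, size_t len,
				  char *out, size_t out_size)
{
	size_t i, n;

	if (out_size == 0)
		return 0;

	n = len < out_size - 1 ? len : out_size - 1;
	for (i = 0; i < n; i++)
		out[i] = name[i] ? name[i] : '@';
	out[n] = '\0';
	return n;
}
```

Because the copy is bounded by the explicit length and always terminated, a name that starts with a zero byte (as abstract unix socket names do) prints with a leading '@' instead of as an empty string, and no byte past `len` is read.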
avagin added a commit that referenced this issue Apr 5, 2017
avagin added a commit that referenced this issue Apr 5, 2017
xemul pushed a commit that referenced this issue Apr 12, 2017
0x7f454c46 pushed a commit to 0x7f454c46/criu that referenced this issue Jul 10, 2017
avagin pushed a commit that referenced this issue Jan 21, 2025
Running the zdtm/static/unlink_regular00 test on Ubuntu 24.04 on aarch64
results in the following error:

    # ./zdtm.py run -t zdtm/static/unlink_regular00 -k always
    userns is supported
    === Run 1/1 ================ zdtm/static/unlink_regular00
    ==================== Run zdtm/static/unlink_regular00 in ns ====================
    Skipping rtc at root
    Start test
    Test is SUID
    ./unlink_regular00 --pidfile=unlink_regular00.pid --outfile=unlink_regular00.out --dirname=unlink_regular00.test
    Run criu dump
    *** buffer overflow detected ***: terminated
    ############# Test zdtm/static/unlink_regular00 FAIL at CRIU dump ##############
    Test output: ================================

     <<< ================================
    Send the 9 signal to  47
    Wait for zdtm/static/unlink_regular00(47) to die for 0.100000
    ##################################### FAIL #####################################

According to the backtrace:

    #0  __pthread_kill_implementation (threadid=281473158467616, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
    #1  0x0000ffff93477690 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
    #2  0x0000ffff9342cb3c in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
    #3  0x0000ffff93417e00 in __GI_abort () at ./stdlib/abort.c:79
    #4  0x0000ffff9346abf0 in __libc_message_impl (fmt=fmt@entry=0xffff93552a78 "*** %s ***: terminated\n") at ../sysdeps/posix/libc_fatal.c:132
    #5  0x0000ffff934e81a8 in __GI___fortify_fail (msg=msg@entry=0xffff93552a28 "buffer overflow detected") at ./debug/fortify_fail.c:24
    #6  0x0000ffff934e79e4 in __GI___chk_fail () at ./debug/chk_fail.c:28
    #7  0x0000ffff934e9070 in ___snprintf_chk (s=s@entry=0xffffc6ed04a3 "testfile", maxlen=maxlen@entry=4056, flag=flag@entry=2, slen=slen@entry=4053,
        format=format@entry=0xaaaacffe3888 "link_remap.%d") at ./debug/snprintf_chk.c:29
    #8  0x0000aaaacff4b8b8 in snprintf (__fmt=0xaaaacffe3888 "link_remap.%d", __n=4056, __s=0xffffc6ed04a3 "testfile")
        at /usr/include/aarch64-linux-gnu/bits/stdio2.h:54
    #9  create_link_remap (path=path@entry=0xffffc6ed2901 "/zdtm/static/unlink_regular00.test/subdir/testfile", len=len@entry=60, lfd=lfd@entry=20,
        idp=idp@entry=0xffffc6ed14ec, nsid=nsid@entry=0xaaaada2bac00, parms=parms@entry=0xffffc6ed2808, fallback=0xaaaacff4c6c0 <dump_linked_remap+96>,
        fallback@entry=0xffffc6ed2797) at criu/files-reg.c:1164
    #10 0x0000aaaacff4c6c0 in dump_linked_remap (path=path@entry=0xffffc6ed2901 "/zdtm/static/unlink_regular00.test/subdir/testfile", len=len@entry=60,
        parms=parms@entry=0xffffc6ed2808, lfd=lfd@entry=20, id=id@entry=12, nsid=nsid@entry=0xaaaada2bac00, fallback=fallback@entry=0xffffc6ed2797)
        at criu/files-reg.c:1198
    #11 0x0000aaaacff4d8b0 in check_path_remap (nsid=0xaaaada2bac00, id=12, lfd=20, parms=0xffffc6ed2808, link=<optimized out>) at criu/files-reg.c:1426
    #12 dump_one_reg_file (lfd=20, id=12, p=0xffffc6ed2808) at criu/files-reg.c:1827
    #13 0x0000aaaacff51078 in dump_one_file (pid=<optimized out>, fd=4, lfd=20, opts=opts@entry=0xaaaada2ba2c0, ctl=ctl@entry=0xaaaada2c4d50,
        e=e@entry=0xffffc6ed39c8, dfds=dfds@entry=0xaaaada2c3d40) at criu/files.c:581
    #14 0x0000aaaacff5176c in dump_task_files_seized (ctl=ctl@entry=0xaaaada2c4d50, item=item@entry=0xaaaada2b8f80, dfds=dfds@entry=0xaaaada2c3d40)
        at criu/files.c:657
    #15 0x0000aaaacff3d3c0 in dump_one_task (parent_ie=0x0, item=0xaaaada2b8f80) at criu/cr-dump.c:1679
    #16 cr_dump_tasks (pid=<optimized out>) at criu/cr-dump.c:2224
    #17 0x0000aaaacff163a0 in main (argc=<optimized out>, argv=0xffffc6ed40e8, envp=<optimized out>) at criu/crtools.c:293

This line is the problem:

    snprintf(tmp + 1, sizeof(link_name) - (size_t)(tmp - link_name - 1), "link_remap.%d", rfe.id);

The problem was that the `-1` was inside the parentheses instead of
outside them. As a result the destination size was increased by 1 instead
of being decreased by 1, which triggered the buffer overflow detection.

Signed-off-by: Adrian Reber <areber@redhat.com>
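
The size arithmetic from the commit message above can be checked on its own. In the sketch below, `demo_size_buggy()` reproduces the expression quoted in the commit message and `demo_size_fixed()` moves the `- 1` outside the cast, as the message describes. All function names are hypothetical, and `demo_remap_name()` is only an assumed shape for the caller, not the code in criu/files-reg.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Buggy form from the commit message: the "- 1" is inside the cast. */
static size_t demo_size_buggy(size_t buf_size, const char *buf, const char *tmp)
{
	return buf_size - (size_t)(tmp - buf - 1);
}

/* "- 1" moved outside: the bytes actually available starting at tmp + 1. */
static size_t demo_size_fixed(size_t buf_size, const char *buf, const char *tmp)
{
	return buf_size - (size_t)(tmp - buf) - 1;
}

/*
 * Replace the last path component in buf with "link_remap.<id>".
 * Returns snprintf()'s result, or -1 if buf contains no '/'.
 */
static int demo_remap_name(char *buf, size_t buf_size, int id)
{
	char *tmp = strrchr(buf, '/');

	if (!tmp)
		return -1;
	return snprintf(tmp + 1, demo_size_fixed(buf_size, buf, tmp),
			"link_remap.%d", id);
}
```

For a 32-byte buffer with the last '/' at offset 3, the fixed expression gives 28 and the buggy one gives 30, so the buggy size overstates the space after `tmp + 1` by two bytes. A fortified `snprintf` rejects exactly that kind of overstatement.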