ci: Vagrant Fedora no vdso gets permission denied when writing to ipc sysctl #1982

Closed
minhbq-99 opened this issue Oct 16, 2022 · 2 comments · Fixed by #1984

@minhbq-99

Description

The Vagrant Fedora no vdso CI job fails several tests in user namespace mode when writing to the IPC sysctls (/proc/sys/kernel/{sem_next_id,msg_next_id,shm_next_id}):

################### 5 TEST(S) FAILED (TOTAL 451/SKIPPED 35) ####################
 * zdtm/static/msgque(uns)
 * zdtm/static/shm(uns)
 * zdtm/static/shm-mp(uns)
 * zdtm/static/sem(uns)
 * zdtm/static/shm-unaligned(uns)
========================== Run zdtm/static/sem in uns ==========================
Start test
./sem --pidfile=sem.pid --outfile=sem.out
Run criu dump
Run criu restore
=[log]=> dump/zdtm/static/sem/166/1/restore.log
------------------------ grep Error ------------------------
b'(00.005610)      1: Restoring IPC semaphores sets'
b'(00.005623)      1: id: 0          key: 0x1d2015a6 uid: 18943      gid: 58467      cuid: 18943      cgid: 58467      mode: 777        (00.005627)      1: nsems: 10'
b'(00.005634)      1: uns: calling __userns_sysctl_op (13, 0)'
b'(00.005654) uns: daemon calls 0x4aa8b0 (192, 10, 0)'
b"(00.005908) Error (criu/sysctl.c:100): Can't write kernel/sem_next_id: Operation not permitted"
b'(00.006563) Error (criu/sysctl.c:313): worker failed: 256'
b'(00.006596)      1: Error (criu/ipc_ns.c:579): Failed to set desired IPC sem ID'
b'(00.006602)      1: Error (criu/ipc_ns.c:644): Failed to prepare semaphores set'
b'(00.007358) Error (criu/cr-restore.c:1503): 192 exited, status=1'
b'(00.007387) Warn  (criu/cr-restore.c:2524): Unable to wait 192: No child processes'
b'(00.007402) uns: calling exit_usernsd (-1, 1)'
b'(00.007422) uns: daemon calls 0x478fe0 (189, -1, 1)'
b'(00.007427) uns: `- daemon exits w/ 0'
b'(00.007977) uns: daemon stopped'
b'(00.007988) Error (criu/cr-restore.c:2537): Restoring FAILED.'
------------------------ ERROR OVER ------------------------
@minhbq-99

This kernel commit looks related:
torvalds/linux@1f5c135

Before that commit, CRIU's logic worked as described in this comment:

criu/criu/sysctl.c

Lines 195 to 206 in 4cd295b

/* For files in the IPC/UTS namespaces, restoring is more complicated
 * than for net. Unprivileged users cannot even open these files, so
 * they must be opened by usernsd. However, the value in the kernel is
 * changed for the IPC/UTS namespace that write()s to the open sysctl
 * file (not who opened it). So, we must set the value from inside the
 * usernsd caller's namespace. We:
 *
 * 1. unsd opens the sysctl files
 * 2. forks a task
 * 3. setns()es to the UTS/IPC namespace of the caller
 * 4. write()s to the files and exits
 */

Previously, the target of a write() to these files depended on the IPC namespace of the writing process; it was not bound to the file descriptor returned by open(). That commit binds the target to the file descriptor, so the target is now determined at open() time. As a result, the current logic in CRIU is broken.
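
The four steps from the quoted comment can be sketched roughly as below. This is only an illustration, not CRIU's actual code: the helper name write_from_forked_child and its parameters are hypothetical, and the setns() step needs CAP_SYS_ADMIN, so it is skipped when ns_fd is negative. On a kernel with the commit above, step 3 would no longer redirect the write, because the target namespace is fixed when the file is opened.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not CRIU code. The caller has already opened
 * sysctl_fd (step 1). A forked child (step 2) optionally joins the IPC
 * namespace behind ns_fd (step 3) and then write()s the value (step 4).
 * Pass ns_fd < 0 to skip setns(), e.g. to try the flow on a plain file.
 * Returns 0 on success, -1 on failure.
 */
static int write_from_forked_child(int sysctl_fd, int ns_fd, const char *val)
{
	pid_t pid;
	int status;

	assert(val != NULL);

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		size_t len = strlen(val);

		/* Step 3: enter the caller's IPC namespace (needs CAP_SYS_ADMIN). */
		if (ns_fd >= 0 && setns(ns_fd, CLONE_NEWIPC) < 0)
			_exit(1);
		/* Step 4: write through the descriptor the parent opened. */
		if (write(sysctl_fd, val, len) != (ssize_t)len)
			_exit(1);
		_exit(0);
	}

	/* The parent only collects the child's exit status. */
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The point of the split is that the task calling open() and the task calling write() are different, which is exactly the assumption the kernel change invalidates.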

For the uns tests, I'm not sure I have this right: is the restored process in a different IPC and user namespace from the CRIU process, which has CAP_SYS_ADMIN/CAP_CHECKPOINT_RESTORE in its own user namespace? If so, how can we write to the sysctl in this case? As far as I can see, we must open the file from within the restored process's IPC namespace, because the target of write() is now bound to the file descriptor, and the restored process does not have the capability. In that case we could only write to this sysctl if the restored process were in the same IPC namespace as the CRIU process but in a different user namespace (the uns test).

@minhbq-99

Ah, my mistake: the restored process does have capabilities in its user namespace, and they are dropped only at the end of restore. So writing to the sysctls is easy: just open them and write to them from the restored process.
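
A minimal sketch of that "open and write in the same task" approach, assuming it runs inside the restored task before its capabilities are dropped. The helper name write_sysctl_in_place is hypothetical and this is not the code from the fix; it only shows that open() and write() happen in one process, so the namespace bound at open() time is the one we want.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not CRIU code: open the sysctl file and write the
 * value from the same task. Returns 0 on success or a negative errno.
 */
static int write_sysctl_in_place(const char *path, int value)
{
	char buf[32];
	ssize_t n;
	int fd, len, ret = 0;

	assert(path != NULL);

	/* Sysctl integers are written as decimal text. */
	len = snprintf(buf, sizeof(buf), "%d\n", value);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, buf, len);
	if (n < 0)
		ret = -errno;
	else if (n != len)
		ret = -EIO; /* treat a short write as an I/O error */

	close(fd);
	return ret;
}
```

In the restored task this would be called with a path such as /proc/sys/kernel/sem_next_id; a failure like the one in the log above would surface as a negative errno from either open() or write().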
