Docker-based tasks output ownership #922

gaow · 2018-03-12T23:01:44Z

Currently, when a task runs in docker, its output files are owned by root:root. I am trying to figure out whether there is a way to make them owned by the current user instead.

gaow · 2018-03-12T23:03:38Z

Looks like adding f"-u {os.getuid()}" by default might help? Currently it has to be passed explicitly as an option.

BoPeng · 2018-03-15T20:38:24Z

Could this be image dependent? I mean, the user on the host system does not necessarily map to a user in the docker image, and the docker image might run as a non-root user (?), in which case it would write files as that user?

All these need to be clarified before we do anything.

gaow · 2018-03-15T20:58:11Z

Can this be image dependent?

I'm not sure, but a couple of docker images I recently used both write files as root.

the docker image might be run by a non-root user (?) so it will write file as that user?

Even so, f"-u {os.getuid()}" by default would not seem harmful. In fact, I'm not sure why we even make user a configuration option rather than just setting it to f"-u {os.getuid()}" by default, because that would be consistent with the file ownership that results from running other command-line programs.

BoPeng · 2018-03-16T00:22:03Z

Even so, f"-u {os.getuid()}" by default would not seem harmful.

The problem is that I am not sure the docker image would work with --user {os.getuid()}. For example, if the docker image is written to run as root and the whole file system inside it is owned by root, the specified user might not be able to find the program (path problem?), or the program might not be able to write inside the image (cannot create files, etc.). The same problem exists if the image is designed to run as another user.

So without -u we run the image as its designed user, which has no problem running the image but might not be able to write to the mounted directory.

With -u we might not be able to run the image, but should be able to write to the mounted directory.

Is this the correct summary of the situation?

BoPeng · 2018-03-16T00:48:55Z

On my mac, running

run: docker_image='ubuntu', user=0
  echo `pwd` > /Users/bpeng1/a.txt

will result in

[bpeng1@BCBMC07MX084DY3:~]$ ls -l a.txt
-rw-r--r--  1 bpeng1  895809667  18 Mar 15 19:46 a.txt

regardless of user settings. What is the case on your end?

gaow · 2018-03-16T01:36:08Z

Hmm, this is interesting ... here is mine:

-rw-r--r--  1 root root   29 Mar 15 20:32 a.txt

Note that I removed user=0. So the behavior is platform dependent?

I think I get your point and it is completely valid. It seems I had a wrong understanding of what -u does. But otherwise, how can I make sure the output files written to the host system are not owned by root?

BoPeng · 2018-03-16T01:49:46Z

I do not think there is a perfect solution here, so a good default is important. I believe that on Mac OS X the docker image runs in a VM that itself runs in user space, so the --user option does not matter outside the container but does matter inside. The current behavior is good there because it works with images that use either root or non-root users. On Linux, letting the default image user write to mounted drives is problematic because docker will either write root-owned files or not be able to write at all, so a default non-root user makes more sense.

The decision is then if we should use --user {getuid()} only for Linux or for both systems...
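
To make the two choices concrete, here is a minimal sketch of how such a default could be computed. The function name `default_user_args` and the `linux_only` switch are hypothetical and only illustrate the decision above; this is not the code that SoS uses.

```python
import os
import platform


def default_user_args(linux_only=True):
    """Return extra `docker run` arguments that set the default user.

    `linux_only` is a hypothetical switch for the two choices discussed:
    apply `--user <uid>` only on Linux, or on every platform.
    """
    if linux_only and platform.system() != "Linux":
        # Leave the image's own default user untouched on other systems.
        return []
    # Numeric uid of the user who launched the workflow on the host.
    return ["--user", str(os.getuid())]
```

With `linux_only=False` the same `--user` argument is produced on every platform that provides `os.getuid()`.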

gaow · 2018-03-16T01:56:09Z

I'm actually wondering what nextflow does for this. If it works on Mac, I guess it is good for most desktop uses. On Linux, configuring docker requires sudo permission anyway, so users are likely to be able to solve the problem on their own. More importantly, the more typical use of SoS on Linux is on HPC systems, many of which do not support docker, so we may want to look at singularity instead.

BoPeng · 2018-03-16T02:10:37Z

According to the nextflow doc, it allows engineOptions and fixOwnership, and the latter appears to pertain to your problem. Because we can only fix the ownership of known output files, and SoS allows scripts without declared output to run inside docker, it is not easy for us to fix ownership after docker execution.
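
The limitation can be seen in a small sketch: a post-hoc check can only look at paths it is told about. The helper `foreign_owned` below is hypothetical (it is neither SoS nor nextflow code) and only reports files that are not owned by the current user; actually changing ownership of root-owned files would additionally require root privileges.

```python
import os


def foreign_owned(paths):
    """Return the existing paths that are not owned by the current user.

    Only the paths passed in are inspected, so files written by a script
    that declares no output would never be checked.
    """
    uid = os.getuid()
    return [p for p in paths if os.path.exists(p) and os.stat(p).st_uid != uid]
```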

gaow · 2018-03-16T02:30:25Z

Agreed. I do not think I can come up with a good suggestion that works safely for all the scenarios we considered. I'm fine with closing the ticket and leaving it as is for now.

BoPeng · 2018-03-16T02:47:02Z

One argument that supports user={os.getuid()} would be that it is more portable. I mean, if we set os.getuid() as default, users can override it with user=0 or user='image_user' which is image dependent. If we do not set this as default, users might have to use user='username' etc, which is user dependent. The notebook is therefore less portable.
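
A minimal sketch of that override logic, assuming a hypothetical helper `user_option` (not the actual SoS implementation): the host uid is used only when no user is given, and an explicit value such as 0 or an image-specific user name is passed through unchanged.

```python
import os


def user_option(user=None):
    """Map a task's `user` setting to `docker run` arguments (hypothetical helper)."""
    # Compare against None explicitly: user=0 (root) is a valid override
    # and must not be mistaken for "not set".
    if user is None:
        user = os.getuid()
    return ["-u", str(user)]
```

For example, `user_option(0)` gives `["-u", "0"]` and `user_option("image_user")` gives `["-u", "image_user"]`, while `user_option()` falls back to the uid of the host user.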

gaow · 2018-03-16T02:56:46Z

But the difficulty is that we do not know beforehand whether the user ID on the host system also exists in the Docker image, though in my case both are 1001. That was a bad assumption I made when I proposed using os.getuid. So if we are to implement this, we may need to create such a user and group on the fly if the uid does not exist, before running anything, and then run under that user to ensure the outcome matches our system. Is that possible?

BoPeng · 2018-03-16T03:06:31Z

Docker allows an arbitrary user id and will simply treat it as a new normal user. The advantage is that this user id will be used to create files in mounted drives, which is what we need here.

BoPeng · 2018-03-16T03:07:50Z

According to docker doc

root (id = 0) is the default user within a container. The image developer can create additional users. Those users are accessible by name. When passing a numeric ID, the user does not have to exist in the container.

The developer can set a default user to run the first process with the Dockerfile USER instruction. When starting a container, the operator can override the USER instruction by passing the -u option.

gaow · 2018-03-16T04:08:46Z

When passing a numeric ID, the user does not have to exist in the container.

Okay, then I was still accidentally correct about the -u behavior :) So maybe this now seems a good thing to do on all platforms?

BoPeng · 2018-03-17T03:37:22Z

Let us see if this works reasonably well.

gaow · 2018-03-26T19:32:44Z

This new default -u works great on my end, though I still get group as root. Maybe there is a default gid option to set it to current gid?

BoPeng · 2018-03-26T19:41:17Z

Could you test if -u {os.getuid()}:{os.getgid()} works?

gaow · 2018-03-26T19:51:44Z

It does -- see patch above!
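
For reference, a minimal sketch of how the combined uid:gid option could be assembled into a `docker run` command line. The function `docker_run_command` and the way the working directory is mounted are assumptions made for illustration; they do not reproduce the patch referenced in this thread.

```python
import os


def docker_run_command(image, script, workdir=None):
    """Build a `docker run` argument list that writes files as the host user.

    Hypothetical helper: mounts `workdir` at the same path inside the
    container and runs `script` through `sh -c`.
    """
    workdir = workdir or os.getcwd()
    # Numeric ids do not have to exist inside the image.
    user = f"{os.getuid()}:{os.getgid()}"
    return [
        "docker", "run", "--rm",
        "-u", user,                      # user and group for created files
        "-v", f"{workdir}:{workdir}",    # bind-mount the host directory
        "-w", workdir,                   # run inside the mounted directory
        image, "sh", "-c", script,
    ]
```

The resulting list can be passed to `subprocess.run` on a machine where docker is installed.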

BoPeng pushed a commit that referenced this issue Mar 17, 2018

Set default docker user to os.getuid() #922

70fb65f

gaow added a commit that referenced this issue Mar 26, 2018

Add getgid to docker -u option #922

7380f7f

BoPeng closed this as completed Mar 26, 2018
