
Clarification of volume permissions? #26

Closed
mrhotio opened this issue Sep 18, 2020 · 16 comments
Labels: documentation, enhancement, waiting for response

Comments

@mrhotio

mrhotio commented Sep 18, 2020

What is the reason the /dev/disk volume needs write permissions, instead of being mounted read-only?

@AnalogJ
Owner

AnalogJ commented Sep 18, 2020

Hey @hotio

Honestly, it's because I haven't yet had time to verify that it works with read-only. I have a story tracking the use of --privileged and the volume mounts, replacing them with --cap-add and --device calls instead: #18

I've just been busy with other features (like notifications & better device detection) so I haven't had time to work on that yet.

@AnalogJ added the enhancement and documentation labels on Sep 18, 2020
@Derkades

It seems to work fine for me with :ro

@AnalogJ
Owner

AnalogJ commented Sep 21, 2020

Good news @Derkades @hotio @kinghat

I just did a quick test replacing --privileged and -v /dev/disk:/dev/disk with --device and --cap-add, and it seems like everything still works.
I need to do some more extensive testing, but it would be awesome if you could verify that the scrutiny dashboard is still working for you:

docker run --rm \
-p 8080:8080 \
--cap-add SYS_RAWIO \
--device=/dev/sda \
--device=/dev/sdb \
-v /run/udev:/run/udev:ro \
--name scrutiny \
 analogj/scrutiny:smartctl_scan

Note: you'll need to add a --device=/dev/sd* line for each of your hard disk devices. Also, please use the analogj/scrutiny:smartctl_scan image, as it's running the latest code (including the new device detection).
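
If you're running docker-compose instead, a rough equivalent of those flags would look something like this (just a sketch; adjust the device list to match your own disks):

version: "3"
services:
  scrutiny:
    image: analogj/scrutiny:smartctl_scan
    container_name: scrutiny
    ports:
      - "8080:8080"
    cap_add:
      - SYS_RAWIO
    devices:
      # one entry per physical disk
      - /dev/sda
      - /dev/sdb
    volumes:
      - /run/udev:/run/udev:ro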

@mrhotio
Author

mrhotio commented Sep 21, 2020

The last commit from master (same for the smartctl_scan tag) has completely broken NVMe output for me:
[screenshot of the broken NVMe output]

@mrhotio
Author

mrhotio commented Sep 21, 2020

--cap-add SYS_RAWIO doesn't provide sufficient permissions for NVMe drives:

{
  "json_format_version": [
    1,
    0
  ],
  "smartctl": {
    "version": [
      7,
      0
    ],
    "svn_revision": "4883",
    "platform_info": "x86_64-linux-5.4.0-47-generic",
    "build_info": "(local build)",
    "argv": [
      "smartctl",
      "-a",
      "-j",
      "-d",
      "nvme",
      "/dev/nvme1"
    ],
    "messages": [
      {
        "string": "Read NVMe Identify Controller failed: NVME_IOCTL_ADMIN_CMD: Permission denied",
        "severity": "error"
      }
    ],
    "exit_status": 2
  },
  "device": {
    "name": "/dev/nvme1",
    "info_name": "/dev/nvme1",
    "type": "nvme",
    "protocol": "NVMe"
  }
}

@mrhotio
Author

mrhotio commented Sep 21, 2020

--cap-add SYS_ADMIN does provide sufficient permissions, but from what I can gather it isn't nearly as restrictive.

@AnalogJ
Owner

AnalogJ commented Sep 22, 2020

Yeah, my understanding is that --cap-add SYS_ADMIN is basically the equivalent of --privileged: https://man7.org/linux/man-pages/man7/capabilities.7.html

I had a quick chat with Justin Cormack via the Docker community slack and he pointed out that:

  • if you read the source code for NVME_IOCTL_ADMIN_CMD, it requires CAP_SYS_ADMIN, so there is nothing you can do except grant that.
  • it calls nvme_user_cmd, which is where the check happens: https://elixir.bootlin.com/linux/v4.4/source/drivers/nvme/host/pci.c#L1882
  • the call is https://elixir.bootlin.com/linux/v4.4/source/drivers/nvme/host/pci.c#L1940

So it looks like --cap-add SYS_ADMIN is required for NVMe drives, but --cap-add SYS_RAWIO is enough for SATA/SCSI drives.

If you have a mix of SATA and NVMe drives, you must include both flags:

--cap-add SYS_ADMIN \
--cap-add SYS_RAWIO \
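
Putting it together, a full command for a mixed setup might look something like this (untested on my end; the device names are just examples):

docker run --rm \
-p 8080:8080 \
--cap-add SYS_ADMIN \
--cap-add SYS_RAWIO \
--device=/dev/sda \
--device=/dev/nvme0 \
-v /run/udev:/run/udev:ro \
--name scrutiny \
analogj/scrutiny:smartctl_scan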

Before that chat, I had tried some googling (and spelunking in the Linux kernel) but couldn't figure out which capability allows access to NVME_IOCTL_ADMIN_CMD. The only one that even looked vaguely related was CAP_SYS_RESOURCE, which Docker provides via --cap-add SYS_RESOURCE.

Unfortunately I don't have an NVMe drive myself, but if I were going to test this (and SYS_RESOURCE didn't work), I would do the following:

  • Run Scrutiny with all extended capabilities
    • excluding SYS_ADMIN, since we know that works
  • Verify that it works correctly
  • Start removing capabilities one by one
    • make sure we do not remove SYS_RAWIO
    • watch for the NVME_IOCTL_ADMIN_CMD error when running smartctl. The last capability we removed is the one we need.
  • Re-add the capability to confirm that the NVME_IOCTL_ADMIN_CMD error disappears.

It could either be super quick (if smartctl fails before removing any capabilities, that means NVMe drives really do need SYS_ADMIN), or it could be incredibly tedious. I totally understand if you're not up for it.
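
For reference, a rough shell sketch of that bisection loop. It assumes a single NVMe device at /dev/nvme0 and that smartctl is on the image's PATH, and the CAPS list below is only a placeholder to fill in from capabilities(7):

#!/bin/sh
# Placeholder capability list; fill it in from capabilities(7).
# SYS_ADMIN is deliberately excluded (we already know it works).
CAPS="SYS_RAWIO SYS_RESOURCE SYS_PTRACE DAC_OVERRIDE MKNOD"

for dropped in $CAPS; do
  [ "$dropped" = "SYS_RAWIO" ] && continue   # never drop SYS_RAWIO
  echo "=== running without $dropped ==="
  flags=""
  for c in $CAPS; do
    [ "$c" = "$dropped" ] || flags="$flags --cap-add $c"
  done
  # Run smartctl directly inside the image; if the NVME_IOCTL_ADMIN_CMD
  # error shows up, $dropped is the capability NVMe needs.
  docker run --rm $flags \
    --device=/dev/nvme0 \
    --entrypoint smartctl \
    analogj/scrutiny:smartctl_scan -a -j -d nvme /dev/nvme0 \
    | grep -q "NVME_IOCTL_ADMIN_CMD" && echo "--> fails without $dropped"
done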

@mrhotio
Author

mrhotio commented Sep 23, 2020

FYI, now that it works, these are the only flags I need:

--cap-add SYS_ADMIN
--device /dev/nvme0

So there's no need for -v /run/udev:/run/udev:ro, or is that there for some upcoming feature?

@AnalogJ
Owner

AnalogJ commented Sep 24, 2020

Yeah, -v /run/udev is used to retrieve metadata for filtering out virtual devices, and as a fallback method to look up WWN device identifiers.

If you don't mount it, Scrutiny may use the drive's serial number as its device identifier instead.

@derekcentrico

This workaround isn't working for me at all. Is this still valid? Any changes?

My docker:

  scrutinyanalogj:
    ports:
      - '86:80'
      - '8886:8080'
    volumes:
#      - '/var/run/docker.sock:/tmp/docker.sock:ro'
      - '/home/docker/scrutinyanalogj:/opt/scrutiny/config'
      - '/home/docker/influxdb2:/opt/scrutiny/influxdb'
      - '/run/udev:/run/udev:ro'
    restart: always
    logging:
      options:
        max-size: 1g
    container_name: scrutinyanalogj
    environment:
      - PUID=1000
      - PGID=996
      - TZ=America/New_York
    devices:
      - '/dev/nvme0n1p1:/dev/nvme0n1p1'
      - '/dev/nvme1n1p1:/dev/nvme1n1p1'
      - '/dev/nvme2n1p1:/dev/nvme2n1p1'
      - '/dev/sda:/dev/sda'
      - '/dev/sdb:/dev/sdb'
      - '/dev/sdc:/dev/sdc'
      - '/dev/sdd:/dev/sdd'
      - '/dev/sde:/dev/sde'
      - '/dev/sdf:/dev/sdf'
      - '/dev/sdg:/dev/sdg'
      - '/dev/sdh:/dev/sdh'
      - '/dev/sdi:/dev/sdi'
      - '/dev/sdj:/dev/sdj'
      - '/dev/sdk:/dev/sdk'
      - '/dev/sdl:/dev/sdl'
    cap_add:
      - SYS_ADMIN
      - SYS_RAWIO
    image: ghcr.io/analogj/scrutiny:master-omnibus
    networks:
      vpnsys_net:
        ipv4_address: '172.22.0.109'

@Jarikk

Jarikk commented Sep 10, 2023

This workaround isn't working for me at all. Is this still valid? Any changes?

My docker:

    devices:
      - '/dev/nvme0n1p1:/dev/nvme0'
      - '/dev/nvme1n1p1:/dev/nvme1'
      - '/dev/nvme2n1p1:/dev/nvme2'

Try remapping the nvme devices to "/dev/nvmeN" inside the container. It looks like the container can't recognize any names other than "nvmeN".
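
In compose terms, the part after the colon is what matters: the device has to show up inside the container under the plain controller name. For example (a sketch, assuming the host exposes the bare controller devices /dev/nvme0 and /dev/nvme1):

    devices:
      - '/dev/nvme0:/dev/nvme0'
      - '/dev/nvme1:/dev/nvme1'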

@derekcentrico

LOL! What a dumbass I am. Didn't even notice I kept the partitions listed on the nvme drives. All is good! Thanks for pointing out the Homer Simpson situation.

@BlueCoffee34


Try to remap nvme devices to the "/dev/nvmeN" inside the container. Looks like the container can't recognize different names but "nvmeN"

What does this mean sorry? I am having the same problem

@derekcentrico

I goofed by not stripping off the partition parts. So the n1p1 on each needed to be deleted.

@BlueCoffee34
Copy link


So you only needed this?

    devices:
      - '/dev/nvme0'
      - '/dev/nvme1'
      - '/dev/nvme2'

@derekcentrico

Correct, assuming your device is nvme0
