Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change /etc/mtab to /proc/self/mounts #5029

Merged
merged 1 commit into from
Sep 20, 2016

Conversation

slashdd
Copy link
Contributor

@slashdd slashdd commented Aug 26, 2016

Misleading error message "The /dev/zfs device is missing and must be created.", if /etc/mtab is missing.

Signed-off-by: Eric Desrochers eric.desrochers@canonical.com
Closes #4680

@slashdd
Copy link
Contributor Author

slashdd commented Aug 26, 2016

@behlendorf
@rlaager

Could you please shed me some light on why the zfstests failed on all tests ?

Additionally, any idea why script "zfs_003_neg.ksh" have to rename /dev/zfs and /etc/mnttab prior to run the ZFS cmds ?

I'll need to workaround/fix this as well as it try to rename /proc/self/mounts now:
17:58:04.35 /bin/mv: cannot move '/proc/self/mounts' to '/proc/self/mounts.bak': No such file or directory

But I don't think it's the reason why the zfstest (/sbin/zfs list) failed though:
17:58:04.36 ERROR: eval /sbin/zfs list >/dev/null 2>&1 unexpectedly exited 0

Eric

@rlaager
Copy link
Member

rlaager commented Aug 26, 2016

That test is making sure that zfs fails if /dev/zfs or /etc/mtab is missing. Your changes made zfs succeed if /etc/mtab is missing (since it no longer cares about /etc/mtab). Ideally, we'd be able to unmount /proc as part of that test. But I doubt that's viable.

So at a minimum, you'll have to remove all mentions from MNTTAB from that test. That is, make it stop testing with /etc/mtab moved out of the way. So it would only test with /dev/zfs. Then, the description should probably not mention "scenarios" (plural), but just mention "if zfs device is missing" or something.

@behlendorf
Copy link
Contributor

For Linux specific modifications like this to the test suite there exists an is_linux function which should be used. This is useful because it help document where the Linux implementation diverges in behavior from the other OpenZFS platforms.

Regarding the other test failures keep in mind that it's not uncommon for one test failure to cause many or all of the subsequent tests to fail. So in this case it looks like a problem in zconfig.sh caused the modules not to be unloaded resulting in subsequent test failures. The same thing can obviously happen if say a proposed patch triggers an ASSERT in the kernel.

@slashdd
Copy link
Contributor Author

slashdd commented Aug 27, 2016

@behlendorf
@rlaager

All the tests succeed now but:
CentOS 6.7 X86_64
Ubuntu 14.04 X86_64

@behlendorf
Copy link
Contributor

@slashdd the tests failed on CentOS 6.7 and Ubuntu 14.04 because this patch break umount -t zfs -a for systems which don't use a symlink for /etc/mtab. We're going to need to keep the mtab update changes you dropped for older platforms which haven't yet adapted a symlink.

@rlaager
Copy link
Member

rlaager commented Aug 30, 2016

We need to figure out why that is failing and fix it. The intent of this patch is to remove all references to /etc/mtab. It shouldn't matter if they have a symlink or not, as nothing in ZFS should care about /etc/mtab after this patch.

@slashdd
Copy link
Contributor Author

slashdd commented Aug 30, 2016

@behlendorf, as per @rlaager mentioned an hour ago (see above), the purpose of this patch is to remove /etc/mtab of the equation to rely on /proc/self/mounts only.

@slashdd
Copy link
Contributor Author

slashdd commented Aug 30, 2016

@behlendorf
@rlaager

It looks to me like that the problem is somewhere around here:

http://build.zfsonlinux.org/builders/CentOS%206.7%20x86_64%20%28TEST%29/builds/2293/steps/shell_7/logs/stdio
...

  • sudo -E zconfig.sh -c -s10
    1 persistent zpool.cache zfs.sh: Unload these modules with 'zfs.sh -u':
    zfs zcommon zunicode znvpair zavl spl
    Fail (5)
    ...

Where error 5 comes from:
${ZFS_SH} zfs="spa_config_path=${TMP_CACHE}" || fail 5
${ZPOOL} import -c ${TMP_CACHE} ${POOL_NAME} || fail 5

from scripts/zconfig.sh

I just don't see yet how my patch can impact this on Ubuntu 14.04 and Centos 6.7 only.

@slashdd
Copy link
Contributor Author

slashdd commented Aug 30, 2016

@behlendorf,

I see what you mean now:

scripts/zfs.sh
66 if [ ${UNLOAD} ]; then
67 kill_zed
==> 68 umount -t zfs -a
69 stack_check
70 unload_modules

Probably that umount try to update /proc/self/mounts to reflect the new reality (zfs being umount) and can't because /proc/self/mounts is "-r--r--r--".

@slashdd
Copy link
Contributor Author

slashdd commented Aug 30, 2016

@behlendorf
@rlaager,

I wonder if that would work:

-n, --no-mtab
Unmount without writing in /etc/mtab.

Instead of -a:

-a, --all
All of the filesystems described in /etc/mtab are unmounted, except the proc filesystem.

@slashdd slashdd force-pushed the openzfs-4680 branch 3 times, most recently from 71e4989 to d0475e0 Compare August 30, 2016 15:50
@behlendorf
Copy link
Contributor

the purpose of this patch is to remove /etc/mtab of the equation to rely on /proc/self/mounts only.

@slashdd @rlaager that's not going to be possible on older systems. We can definitely prefer /proc/self/mounts and use it internally but in order for umount -a -t zfs to function properly we need to update the mtab file. Modern versions of umount have added the -A option to limit the scope to the current namespace (/proc/self/mounts).

       -a, --all
              All  of  the filesystems described in /etc/mtab are unmounted, except the proc filesys‐
              tem.

       -A, --all-targets
              Unmount all mountpoints in the current namespace for  the  specified  filesystem.   The
              filesystem  can  be  specified  by  one of the mountpoints or the device name (or UUID,
              etc.).  When this option is used together with  --recursive,  then  all  nested  mounts
              within the filesystem are recursively unmounted.  This option is only supported on sys‐
              tems where /etc/mtab is a symlink to /proc/mounts.

We need to keep this functionality to avoid accidentally breaking existing scripts/utilities which depend on it. However, we don't need to make this fatal. If for any reason we can't update the /etc/mtab file (it's missing, has the wrong permissions, it's a symlink, etc) we just skip the update, the same as if --no-mtab was passed.

@slashdd
Copy link
Contributor Author

slashdd commented Aug 30, 2016

@behlendorf , What if replacing "umount -t zfs -a" by "zfs umount -a" ?

@behlendorf
Copy link
Contributor

@slashdd that will fix the automated tests. But it means we'll be breaking any script or utility which depends on the -a flag working as described in the umount(8) man page. We'll almost certainly see a new issue opened reporting this as a bug.

@rlaager
Copy link
Member

rlaager commented Aug 30, 2016

How about this approach:

  1. Keep the changes to make all reads use hard-coded /proc/self/mounts. This would include the error message in issue 4680.
  2. Adjust the patch to NOT remove the mtab writing code. This would write to /etc/mtab as before, to keep older version of umount(8) working.
  3. Optionally, add a ./configure --disable-mtab check which #ifdefs out the mtab writing code.

@behlendorf
Copy link
Contributor

That sounds workable to me, just two comments.

This would write to /etc/mtab as before, to keep older version of umount(8) working.

This isn't an issue with an old version of umount(8) per say. New versions would be impacted as well if /etc/mtab hadn't also been replaced by a symlink which redirect the IO to /proc/mounts. -a in new versions still consults /etc/mtab it's just that the mount helpers are no longer responsible to update it since it is generated dynamically by the kernel.

Optionally, add a ./configure --disable-mtab check which #ifdefs out the mtab writing code.

I'm OK with this, but what's the motivation for this when there already exists a command line option to do exactly this?

@rlaager
Copy link
Member

rlaager commented Aug 30, 2016

Optionally, add a ./configure --disable-mtab check which #ifdefs out the mtab writing code.
I'm OK with this, but what's the motivation for this when there already exists a command line option to
do exactly this?

I'm talking about disabling the mtab writing code in ZFS. There's not a command-line option for that, is there? And even so, I think the point would be to ensure that mtab is well-and-thoroughly dead. If umount(8) is still using mtab, then Ubuntu can/should/probably-will work on killing that separately.

@behlendorf
Copy link
Contributor

Both mount(8) and umount(8) take the -n --no-mtab option which gets passed to the helper and honored by mount.zfs. We could make that the default behavior for zfs mount and zfs unmount be to pass this option but I'm not sure it buys us much as long as mount.zfs just silently fails if for any reasons it can't update /etc/mtab.

@slashdd
Copy link
Contributor Author

slashdd commented Aug 31, 2016

@behlendorf
@rlaager

I wonder if at this point, using libmount would be a more suitable solution.

"The libmount library is used to parse /etc/fstab, /etc/mtab and /proc/self/mountinfo files, manage the mtab file, evaluate mount options, etc."

@behlendorf
Copy link
Contributor

@slashdd yes absolutely. That's what it's there for and it's the only safe way to lock to the mtab file when updating it. It just didn't exist on Linux when the mount helper code was originally written.

@behlendorf
Copy link
Contributor

@slashdd thanks for refreshing this. This LGTM as a good first step towards shifting to /proc/self/mount and maybe eventually libmount as needed. @rlaager can you please review this.

@rlaager
Copy link
Member

rlaager commented Sep 9, 2016

@slashdd, it sounded like @behlendorf wanted the test changes conditionalized on the is_linux function. I will give this a more in depth review soon.

@behlendorf
Copy link
Contributor

Right, we wanted to keep the test suite in a form it'll run on other platforms so please use the is_linux function for any non-portable changes. This also acts as documentation for where we've had to diverge.

@rlaager
Copy link
Member

rlaager commented Sep 11, 2016

I made some updates here:
https://github.com/rlaager/zfs/tree/pr-5029-updates

This hopefully addresses the is_linux thing in the tests. I haven't run this on the buildbots.

It also actually fixes the error message that started this whole thing. ;)

I realize the irony that one of my commits eliminates a /proc/self/mounts hardcoding, and another adds one.

@behlendorf
Copy link
Contributor

behlendorf commented Sep 12, 2016

@rlaager thanks for the review, @slashdd it could you include Richard's suggestions in to this PR to make it easier to merge.

[edit] I accidentally closed and reopened this triggering a fresh test run.

@behlendorf behlendorf closed this Sep 12, 2016
@behlendorf behlendorf reopened this Sep 12, 2016
@slashdd
Copy link
Contributor Author

slashdd commented Sep 12, 2016

@behlendorf , sure I'll do another 'git push' by EOD with @rlaager 's suggestions.

@slashdd
Copy link
Contributor Author

slashdd commented Sep 12, 2016

@behlendorf, is your triggering for a fresh test run and my recent 'git push' interfere and created this tests failure ? How can I force the re-run the build/tests ?

@behlendorf
Copy link
Contributor

Strange, let me just resubmit these and see if they fail in the same way again.

@slashdd
Copy link
Contributor Author

slashdd commented Sep 12, 2016

@behlendorf, still failing including @rlaager's suggestion.

@slashdd
Copy link
Contributor Author

slashdd commented Sep 12, 2016

rebuilding without rlaager's suggestion to see if test still succeed.

@slashdd
Copy link
Contributor Author

slashdd commented Sep 12, 2016

@behlendorf is something wrong with the buildbot ?

@behlendorf
Copy link
Contributor

@slashdd you're going to need to rebase this on the very latest master. The version in the META file was updated recently to 0.7.0-rc1 which is causing it to fail to locate the correct version of the SPL. If you rebase on master and force update this branch it'll build correctly.

Generally that's a good habit to get in to anyway since it minimizes the chance of merge conflicts when applying the patch.

@behlendorf
Copy link
Contributor

Both buildbot failures were due to #4034 which is related. I've resubmitted testing on those builders. @slashdd it looks like you still need to incorporate @rlaager suggestions.

@behlendorf behlendorf added this to the 0.7.0 milestone Sep 13, 2016
@slashdd
Copy link
Contributor Author

slashdd commented Sep 13, 2016

@behlendorf, this latest push "bc3cfba" include @rlaager's suggestion.


#verify zfs failed if ZFS_DEV cannot be opened
ZFS_DEV=/dev/zfs

for file in $ZFS_DEV $MNTTAB; do
for file in $ZFS_DEV; do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The is_linux conditionals from rlaager@79fd00a are missing.

@behlendorf
Copy link
Contributor

It looks like you've just applied the top most patch in the stack. See my inline patch comments.

@slashdd
Copy link
Contributor Author

slashdd commented Sep 13, 2016

@behlendorf ok thanks... for some reasons, I missed that portion ... will work on it today.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for refreshing this. @rlaager can you also give this one last review before it's merged.

@@ -259,9 +259,9 @@ zfs_is_mountable(zfs_handle_t *zhp, char *buf, size_t buflen,

/*
* The filesystem is mounted by invoking the system mount utility rather
* than by the system call mount(2). This ensures that the /etc/mtab
* than by the system call mount(2). This ensures that the /proc/self/mounts
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is now wrong. We do not lock /proc/self/mounts for an update.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

* file is correctly locked for the update. Performing our own locking
* and /etc/mtab update requires making an unsafe assumption about how
* and /proc/self/mounts update requires making an unsafe assumption about how
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, /proc/self/mounts is not updated. Perhaps these should remain /etc/mtab, because there is /etc/mtab updating that still happens?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -348,7 +348,7 @@ zfs_add_option(zfs_handle_t *zhp, char *options, int len,

/*
* zfs_prop_get_int() to not used to ensure our mount options
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is existing, but should this be "is not used" rather than "to not used"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -348,7 +348,7 @@ zfs_add_option(zfs_handle_t *zhp, char *options, int len,

/*
* zfs_prop_get_int() to not used to ensure our mount options
* are not influenced by the current /etc/mtab contents.
* are not influenced by the current /proc/self/mounts contents.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which file should this be?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me, it sound like /proc/self/mounts is the file.

Misleading error message "The /dev/zfs device is missing and must be created.", if /etc/mtab is missing.

Signed-off-by: Eric Desrochers <eric.desrochers@canonical.com>
@rlaager
Copy link
Member

rlaager commented Sep 20, 2016

LGTM
Reviewed-by: Richard Laager rlaager@wiktel.com

@behlendorf behlendorf merged commit 7925173 into openzfs:master Sep 20, 2016
@behlendorf
Copy link
Contributor

Thanks guys!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants