Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrent modifications to "/etc/dfs/sharetab" does not work #7943

Closed
prakashsurya opened this issue Sep 21, 2018 · 5 comments · Fixed by #10300
Closed

Concurrent modifications to "/etc/dfs/sharetab" does not work #7943

prakashsurya opened this issue Sep 21, 2018 · 5 comments · Fixed by #10300
Assignees
Labels
Component: Share "zfs share" feature

Comments

@prakashsurya
Copy link
Member

System information

Type Version/Name
Distribution Name Ubuntu
Distribution Version Ubuntu 18.04
Linux Kernel 4.15.0-34-generic
Architecture x86_64
ZFS Version v0.8.0-rc1
SPL Version v0.8.0-rc1

Describe the problem you're observing

When multiple processes are concurrently modifying the /etc/dfs/sharetab file (e.g. zfs set sharenfs=on ..., zfs create -o sharenfs=on ..., etc.) the contents of /etc/dfs/sharetab can become "corrupt"; i.e. not all shared filesystems will be listed in that file.

Describe how to reproduce the problem

This can be reproduced with the following script:

#!/bin/bash
touch /etc/dfs/sharetab
echo -n "exportfs -- "; exportfs | wc -l
echo -n "sharetab -- "; cat /etc/dfs/sharetab | wc -l
for i in `seq 1 100`; do
        (sleep 1; zfs create -o sharenfs=on tank/fs-$i) &
done
wait
echo -n "exportfs -- "; exportfs | wc -l
echo -n "sharetab -- "; cat /etc/dfs/sharetab | wc -l

When I run this script, I see the following:

$ sudo ./reproducer.sh
exportfs -- 0
sharetab -- 0
exportfs -- 100
sharetab -- 1

Here's some more information after running that reproducer:

$ sudo exportfs | head
/tank/fs-71     <world>
/tank/fs-42     <world>
/tank/fs-61     <world>
/tank/fs-48     <world>
/tank/fs-16     <world>
/tank/fs-40     <world>
/tank/fs-52     <world>
/tank/fs-80     <world>
/tank/fs-90     <world>
/tank/fs-57     <world>

$ sudo cat /etc/dfs/sharetab | head
/tank/fs-71     -       nfs     rw,crossmnt

$ sudo zfs list -r tank | head
NAME          USED  AVAIL  REFER  MOUNTPOINT
tank         3.80M  21.8G    30K  /tank
tank/fs-1      24K  21.8G    24K  /tank/fs-1
tank/fs-10     24K  21.8G    24K  /tank/fs-10
tank/fs-100    24K  21.8G    24K  /tank/fs-100
tank/fs-11     24K  21.8G    24K  /tank/fs-11
tank/fs-12     24K  21.8G    24K  /tank/fs-12
tank/fs-13     24K  21.8G    24K  /tank/fs-13
tank/fs-14     24K  21.8G    24K  /tank/fs-14
tank/fs-15     24K  21.8G    24K  /tank/fs-15
@prakashsurya
Copy link
Member Author

We have a change that we've been using to explore a potential fix for this issue here: prakashsurya@f593311

With this in place, the /etc/dfs/sharetab file remains consistent with the contents of exportfs and the number of shared filesystems that are created, but I've seen cases where zfs set sharenfs=off or zfs destroy will fail. So far, I'm not sure if these failures are a result of our work-in-progress change that I linked to, or due to a tangential problem in the code.

To be clear, here's roughly what I'm doing to test the change I linked to above:

  1. Concurrently create 100 filesystems with zfs create
    • After, verify exportfs and sharetab have 0 lines
  2. Concurrently set sharenfs=on on these 100 filesystems with zfs set sharenfs=on
    • After, verify exportfs and sharetab have 100 lines
  3. Concurrently set sharenfs=off on these 100 filesystems with zfs set sharenfs=off
    • After, verify exportfs and sharetab have 0 lines
  4. Concurrently destroy these 100 filesystems with zfs destroy
    • After, verify exportfs and sharetab have 0 lines (redundant due to step 3)

@behlendorf
Copy link
Contributor

behlendorf commented Sep 24, 2018

@prakashsurya have you considered using libmount for this. These days most (maybe all) distributions rely on it to perform these updates safely.

http://ftp.ntu.edu.tw/linux/utils/util-linux/v2.22/libmount-docs/index.html

One of the major advantages of this approach is that it makes these updates safe even when they're running concurrently with other system tools (i.e. exportfs, mount). Right now on Linux we're updating /etc/dfs/sharetab which is effectively private to ZFS since on Linux the correct file path is /var/lib/nfs/etab. By switching to libmount, which didn't exist when this code was originally written, it should be safe to rely on /var/lib/nfs/etab since it can now lock it.

/*
 * nfs_check_exportfs() checks that the exportfs command runs
 * and also maintains a temporary copy of the output from
 * exportfs -v.
 * To update this temporary copy simply call this function again.
 *
 * TODO : Use /var/lib/nfs/etab instead of our private copy.
 *        But must implement locking to prevent concurrent access.
 *

It'd be great to see those tests added to the ZTS as part of this.

@prakashsurya
Copy link
Member Author

Thanks for the pointer!

I didn't know about libmount, so I'll try and give that a look, and see when I can make some time to move the code over to it. I had already thought lots of this code would need to get rewritten based on what I've looked at so far (that comment included), so it's good to know that there's a proper library that we can now use.

@behlendorf
Copy link
Contributor

@alek-p may have already started on the libmount conversion but I'm not sure how far along that work is.

@alek-p
Copy link
Contributor

alek-p commented Sep 24, 2018

Bringing in libmount was something we've looked into but I haven't actually started any integration work.

@behlendorf behlendorf added the Component: Share "zfs share" feature label Sep 24, 2018
@grwilson grwilson self-assigned this Jul 12, 2020
behlendorf pushed a commit that referenced this issue Jul 13, 2020
== Motivation and Context

The current implementation of 'sharenfs' and 'sharesmb' relies on
the use of the sharetab file. The use of this file is os-specific
and not required by linux or freebsd. Currently the code must
maintain updates to this file which adds complexity and presents
a significant performance impact when sharing many datasets. In
addition, concurrently running 'zfs sharenfs' command results in
missing entries in the sharetab file leading to unexpected failures.

== Description

This change removes the sharetab logic from the linux and freebsd
implementation of 'sharenfs' and 'sharesmb'. It still preserves an
os-specific library which contains the logic required for sharing
NFS or SMB. The following entry points exist in the vastly simplified
libshare library:

- sa_enable_share -- shares a dataset but may not commit the change
- sa_disable_share -- unshares a dataset but may not commit the change
- sa_is_shared -- determine if a dataset is shared
- sa_commit_share -- notify NFS/SMB subsystem to commit the shares
- sa_validate_shareopts -- determine if sharing options are valid

The sa_commit_share entry point is provided as a performance enhancement
and is not required. The sa_enable_share/sa_disable_share may commit
the share as part of the implementation. Libshare provides a framework
for both NFS and SMB but some operating systems may not fully support
these protocols or all features of the protocol.

NFS Operation:
For linux, libshare updates /etc/exports.d/zfs.exports to add
and remove shares and then commits the changes by invoking
'exportfs -r'. This file, is automatically read by the kernel NFS
implementation which makes for better integration with the NFS systemd
service. For FreeBSD, libshare updates /etc/zfs/exports to add and
remove shares and then commits the changes by sending a SIGHUP to
mountd.

SMB Operation:
For linux, libshare adds and removes files in /var/lib/samba/usershares
by calling the 'net' command directly. There is no need to commit the
changes. FreeBSD does not support SMB.

== Performance Results

To test sharing performance we created a pool with an increasing number
of datasets and invoked various zfs actions that would enable and
disable sharing. The performance testing was limited to NFS sharing.
The following tests were performed on an 8 vCPU system with 128GB and
a pool comprised of 4 50GB SSDs:

Scale testing:
- Share all filesystems in parallel -- zfs sharenfs=on <dataset> &
- Unshare all filesystems in parallel -- zfs sharenfs=off <dataset> &

Functional testing:
- share each filesystem serially -- zfs share -a
- unshare each filesystem serially -- zfs unshare -a
- reset sharenfs property and unshare -- zfs inherit -r sharenfs <pool>

For 'zfs sharenfs=on' scale testing we saw an average reduction in time
of 89.43% and for 'zfs sharenfs=off' we saw an average reduction in time
of 83.36%.

Functional testing also shows a huge improvement:
- zfs share -- 97.97% reduction in time
- zfs unshare -- 96.47% reduction in time
- zfs inhert -r sharenfs -- 99.01% reduction in time

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Bryant G. Ly <bryangly@gmail.com>
Signed-off-by: George Wilson <gwilson@delphix.com>
External-Issue: DLPX-68690
Closes #1603
Closes #7692
Closes #7943
Closes #10300
jsai20 pushed a commit to jsai20/zfs that referenced this issue Mar 30, 2021
== Motivation and Context

The current implementation of 'sharenfs' and 'sharesmb' relies on
the use of the sharetab file. The use of this file is os-specific
and not required by linux or freebsd. Currently the code must
maintain updates to this file which adds complexity and presents
a significant performance impact when sharing many datasets. In
addition, concurrently running 'zfs sharenfs' command results in
missing entries in the sharetab file leading to unexpected failures.

== Description

This change removes the sharetab logic from the linux and freebsd
implementation of 'sharenfs' and 'sharesmb'. It still preserves an
os-specific library which contains the logic required for sharing
NFS or SMB. The following entry points exist in the vastly simplified
libshare library:

- sa_enable_share -- shares a dataset but may not commit the change
- sa_disable_share -- unshares a dataset but may not commit the change
- sa_is_shared -- determine if a dataset is shared
- sa_commit_share -- notify NFS/SMB subsystem to commit the shares
- sa_validate_shareopts -- determine if sharing options are valid

The sa_commit_share entry point is provided as a performance enhancement
and is not required. The sa_enable_share/sa_disable_share may commit
the share as part of the implementation. Libshare provides a framework
for both NFS and SMB but some operating systems may not fully support
these protocols or all features of the protocol.

NFS Operation:
For linux, libshare updates /etc/exports.d/zfs.exports to add
and remove shares and then commits the changes by invoking
'exportfs -r'. This file, is automatically read by the kernel NFS
implementation which makes for better integration with the NFS systemd
service. For FreeBSD, libshare updates /etc/zfs/exports to add and
remove shares and then commits the changes by sending a SIGHUP to
mountd.

SMB Operation:
For linux, libshare adds and removes files in /var/lib/samba/usershares
by calling the 'net' command directly. There is no need to commit the
changes. FreeBSD does not support SMB.

== Performance Results

To test sharing performance we created a pool with an increasing number
of datasets and invoked various zfs actions that would enable and
disable sharing. The performance testing was limited to NFS sharing.
The following tests were performed on an 8 vCPU system with 128GB and
a pool comprised of 4 50GB SSDs:

Scale testing:
- Share all filesystems in parallel -- zfs sharenfs=on <dataset> &
- Unshare all filesystems in parallel -- zfs sharenfs=off <dataset> &

Functional testing:
- share each filesystem serially -- zfs share -a
- unshare each filesystem serially -- zfs unshare -a
- reset sharenfs property and unshare -- zfs inherit -r sharenfs <pool>

For 'zfs sharenfs=on' scale testing we saw an average reduction in time
of 89.43% and for 'zfs sharenfs=off' we saw an average reduction in time
of 83.36%.

Functional testing also shows a huge improvement:
- zfs share -- 97.97% reduction in time
- zfs unshare -- 96.47% reduction in time
- zfs inhert -r sharenfs -- 99.01% reduction in time

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Bryant G. Ly <bryangly@gmail.com>
Signed-off-by: George Wilson <gwilson@delphix.com>
External-Issue: DLPX-68690
Closes openzfs#1603
Closes openzfs#7692
Closes openzfs#7943
Closes openzfs#10300
sempervictus pushed a commit to sempervictus/zfs that referenced this issue May 31, 2021
== Motivation and Context

The current implementation of 'sharenfs' and 'sharesmb' relies on
the use of the sharetab file. The use of this file is os-specific
and not required by linux or freebsd. Currently the code must
maintain updates to this file which adds complexity and presents
a significant performance impact when sharing many datasets. In
addition, concurrently running 'zfs sharenfs' command results in
missing entries in the sharetab file leading to unexpected failures.

== Description

This change removes the sharetab logic from the linux and freebsd
implementation of 'sharenfs' and 'sharesmb'. It still preserves an
os-specific library which contains the logic required for sharing
NFS or SMB. The following entry points exist in the vastly simplified
libshare library:

- sa_enable_share -- shares a dataset but may not commit the change
- sa_disable_share -- unshares a dataset but may not commit the change
- sa_is_shared -- determine if a dataset is shared
- sa_commit_share -- notify NFS/SMB subsystem to commit the shares
- sa_validate_shareopts -- determine if sharing options are valid

The sa_commit_share entry point is provided as a performance enhancement
and is not required. The sa_enable_share/sa_disable_share may commit
the share as part of the implementation. Libshare provides a framework
for both NFS and SMB but some operating systems may not fully support
these protocols or all features of the protocol.

NFS Operation:
For linux, libshare updates /etc/exports.d/zfs.exports to add
and remove shares and then commits the changes by invoking
'exportfs -r'. This file, is automatically read by the kernel NFS
implementation which makes for better integration with the NFS systemd
service. For FreeBSD, libshare updates /etc/zfs/exports to add and
remove shares and then commits the changes by sending a SIGHUP to
mountd.

SMB Operation:
For linux, libshare adds and removes files in /var/lib/samba/usershares
by calling the 'net' command directly. There is no need to commit the
changes. FreeBSD does not support SMB.

== Performance Results

To test sharing performance we created a pool with an increasing number
of datasets and invoked various zfs actions that would enable and
disable sharing. The performance testing was limited to NFS sharing.
The following tests were performed on an 8 vCPU system with 128GB and
a pool comprised of 4 50GB SSDs:

Scale testing:
- Share all filesystems in parallel -- zfs sharenfs=on <dataset> &
- Unshare all filesystems in parallel -- zfs sharenfs=off <dataset> &

Functional testing:
- share each filesystem serially -- zfs share -a
- unshare each filesystem serially -- zfs unshare -a
- reset sharenfs property and unshare -- zfs inherit -r sharenfs <pool>

For 'zfs sharenfs=on' scale testing we saw an average reduction in time
of 89.43% and for 'zfs sharenfs=off' we saw an average reduction in time
of 83.36%.

Functional testing also shows a huge improvement:
- zfs share -- 97.97% reduction in time
- zfs unshare -- 96.47% reduction in time
- zfs inhert -r sharenfs -- 99.01% reduction in time

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Bryant G. Ly <bryangly@gmail.com>
Signed-off-by: George Wilson <gwilson@delphix.com>
External-Issue: DLPX-68690
Closes openzfs#1603
Closes openzfs#7692
Closes openzfs#7943
Closes openzfs#10300
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Share "zfs share" feature
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants