
Alpine Linux Cannot Navigate Azure Files Mounts Reliably (SMB client issue in Alpine image) #1325

Closed
GuyPaddock opened this issue Nov 19, 2019 · 20 comments

Comments

@GuyPaddock

When using Azure Files with Alpine-Linux-based containers on AKS, you may observe strange behavior when applications attempt to navigate folders containing more than 62 files. In fact, commands like rm -rf from the CLI will fail with rm: can't remove 'test': Directory not empty.

Considerably more information (including repro steps, environment, etc.) is available here:
https://gitlab.alpinelinux.org/alpine/aports/issues/10960
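
For reference, the failure pattern boils down to creating a batch of files on the share and then attempting a recursive delete. The real test.sh lives in the thread linked above; the following is only a rough sketch of that pattern (the file names and the test directory are illustrative), meant to be run from a working directory backed by the Azure Files mount inside an Alpine container:

#!/bin/sh
# Rough repro sketch (not the original test.sh from the linked thread).
# Run from a working directory that sits on the Azure Files (SMB) mount.
COUNT="${1:-128}"

echo "Creating '$COUNT' test files..."
mkdir -p test
i=1
while [ "$i" -le "$COUNT" ]; do
  touch "test/file_$i"
  i=$((i + 1))
done

BEFORE=$(ls test | wc -l)
echo "Trying to delete test files..."
rm -rf test || true   # on an affected setup this fails: "Directory not empty"
if [ -d test ]; then AFTER=$(ls test | wc -l); else AFTER=0; fi
echo "BEFORE: $BEFORE AFTER: $AFTER"

On an affected Alpine container the delete typically leaves around 62 entries behind (as in the outputs later in this thread), while a second pass removes the remainder.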

I'm posting a link to this issue here for two reasons:

  1. To serve as a reference for other AKS users.
  2. To see if there is anything that Azure can do on the kernel side of things to address this issue in case musl does not.

Our nodes are currently running the following kernel version: 4.15.0-1063-azure #68-Ubuntu SMP Fri Nov 8 09:30:20 UTC 2019 x86_64 Linux.

With this version of Kubernetes:

Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.7", GitCommit:"8fca2ec50a6133511b771a11559e24191b1aa2b4", GitTreeState:"clean", BuildDate:"2019-09-18T14:47:22Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"windows/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.12", GitCommit:"524c3a1238422529d62f8e49506df658fa9c8b8c", GitTreeState:"clean", BuildDate:"2019-11-14T05:26:24Z", GoVersion:"go1.11.13", Compiler:"gc", Platform:"linux/amd64"}
@andyzhangx
Contributor

Could you provide more details about the Azure Files setup? What Azure Files storage class are you using? Have you tried using premium files?

@GuyPaddock
Author

Please see the linked thread for more details.

We're using Azure Files standard. Azure Files Premium breaks our cost structure due to the 100 TiB minimum quota requirement.

Per the musl team in the thread I linked to, the issue I'm reporting does not appear in Ubuntu containers using the GNU C library but does appear in Alpine-based containers using the MUSL C library. The root cause appears to be somewhere in the kernel -- it's caching data in some way that causes unexpected results when iterating-while-deleting 64+ files on a mounted NFS or SMB share.

The issue does not happen with the GNU C library because glibc uses much, much larger reads for directory iteration, which seems to effectively prevent the kernel from buffering the directory listing.
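
Not from the thread, but one way to observe this difference yourself (assuming strace can be installed in both images; the image tags, the mounted path, and the buffer sizes mentioned in the comments are illustrative) is to trace the getdents calls that readdir() makes in each libc against a directory backed by the SMB mount:

# Alpine (musl): readdir() typically issues small getdents64 calls (buffers around 2 KiB)
docker run --rm --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
  -v /tmp/test:/data alpine:3.10 \
  sh -c 'apk add --no-cache strace >/dev/null && strace -e trace=getdents64 ls /data >/dev/null'

# Ubuntu (glibc): readdir() typically issues much larger reads (buffers around 32 KiB)
docker run --rm --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
  -v /tmp/test:/data ubuntu:16.04 \
  sh -c 'apt-get update -qq >/dev/null && apt-get install -y -qq strace >/dev/null && strace -e trace=getdents,getdents64 ls /data >/dev/null'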

@smfrench

smfrench commented May 7, 2020

This would be useful to get more information about. Many performance optimizations for metadata went in after the 4.18 kernel (especially around the 5.0 kernel) for SMB3 queries, but even on 4.15 there are a few obvious things worth trying, and it is also possible that this bug has been fixed in the last three years and is in more recent kernels. In the meantime, can you try some potential workarounds? Have you tried setting the mount option "actimeo=0" (to disable caching and see if the "can't remove directory" issue goes away)? The reverse, caching directory entries for a longer period (SMB3 defaults to a short cache lifetime for dentries of only 1 second) by e.g. setting actimeo=60 instead of the default of one second, would also be useful to see how it affects your workload.

In addition, there are valid cases where "rm -rf" would fail. For example, if the application left one of the files in that directory tree open, it cannot be deleted from the server until the file is closed (although it can be marked to be deleted on close). The NFS client on Linux can work around this with a strategy called 'silly rename', so if you do turn out to have this problem (i.e. the application forgot to close one or more of the files before deleting them), there may be a similar 'silly rename' strategy we can add to the SMB3 client (cifs.ko) to work around the application problem. One way to check whether an "application forgot to close a file" issue is related to your problem is to run "lsof +D <directory>" before "rm -rf" to ensure no files are open in that directory tree.
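
For reference, the two experiments suggested above would look roughly like this (the share name, mount point, and credentials are placeholders; the caching behavior described is from the comment above):

# 1) Disable attribute/dentry caching entirely and re-test:
sudo mount -t cifs //accountname.file.core.windows.net/share /mnt/share \
  -o vers=3.0,username=accountname,password=...,dir_mode=0777,file_mode=0777,actimeo=0

# 2) Or cache entries longer than the 1-second default:
sudo mount -t cifs //accountname.file.core.windows.net/share /mnt/share \
  -o vers=3.0,username=accountname,password=...,dir_mode=0777,file_mode=0777,actimeo=60

# 3) Before 'rm -rf', check that nothing still holds files open under the tree:
lsof +D /mnt/share/some-directory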

@GuyPaddock
Author

In addition there are valid cases where "rm -rf" would fail. For example, if the application left open one of the files in that directory tree then it can not be deleted from the server until the file is closed (although it can be marked as to be deleted on close).

Per the info in this article, we are able to demonstrate rm -rf failing in a single bash script that uses no file locks and no concurrency:
https://gitlab.alpinelinux.org/alpine/aports/issues/10960

@GuyPaddock
Author

I tried actimeo=0 back in October and all it did was result in massive throttling from Azure Files that eventually caused the mount to fall off (similar to #1587).

@smfrench

smfrench commented May 7, 2020

In addition there are valid cases where "rm -rf" would fail. For example, if the application left open one of the files in that directory tree then it can not be deleted from the server until the file is closed (although it can be marked as to be deleted on close).

Per the info in this article, we are able to demonstrate rm -rf failing in a single bash script that uses no file locks and no concurrency:
https://gitlab.alpinelinux.org/alpine/aports/issues/10960

I tried an experiment just now with this old Ubuntu kernel (4.15) mounted with SMB3 to Azure and didn't see a problem with either 256 or 2048 files (see below) using the test script mentioned earlier in the post. To reproduce this problem may require a more complex setup with containers (or perhaps a very old kernel missing some fixes?).

root@smf-old-ubuntu:/mnt/smftestshares# ~/test.sh 256
Creating '256' test files...

Trying to delete test files...
DELETED: 257 BEFORE: 256 AFTER: 0

root@smf-old-ubuntu:/mnt/smftestshares# ls
root@smf-old-ubuntu:/mnt/smftestshares# ~/test.sh 2048
Creating '2048' test files...

Trying to delete test files...
DELETED: 2049 BEFORE: 2048 AFTER: 0

root@smf-old-ubuntu# uname -a
Linux smf-old-ubuntu 4.15.0-1082-azure #92~16.04.1-Ubuntu SMP Tue Apr 14 22:28:34 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

@GuyPaddock
Author

@smfrench: As I mentioned, the bug is not reproducible on Ubuntu because Ubuntu uses the GNU C library instead of musl. GNU libc works around the kernel bug by doing large reads that avoid the problematic caching; musl does small reads.

You would need to try this on an Alpine container.


@andyzhangx
Contributor

I tried on an AKS 1.17.3 cluster and could not repro, using the exact same steps as https://gitlab.alpinelinux.org/alpine/aports/issues/10960:

# k get no -o wide
NAME                                STATUS   ROLES   AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
aks-agentpool-22126781-vmss000002   Ready    agent   21m   v1.17.3   10.240.0.4    <none>        Ubuntu 16.04.6 LTS   4.15.0-1077-azure   docker://3.0.10+azure
aks-agentpool-22126781-vmss000003   Ready    agent   21m   v1.17.3   10.240.0.5    <none>        Ubuntu 16.04.6 LTS   4.15.0-1077-azure   docker://3.0.10+azure

# k get po test-shares-bdf6d7956-x9mcd -o wide
NAME                          READY   STATUS    RESTARTS   AGE     IP           NODE                                NOMINATED NODE   READINESS GATES
test-shares-bdf6d7956-x9mcd   1/1     Running   0          3m15s   10.244.3.3   aks-agentpool-22126781-vmss000002   <none>           <none>

# k exec -it test-shares-bdf6d7956-x9mcd sh
/ # vi test.sh
/ # chmod 0755 test.sh
/ # ./test.sh 128
Creating '128' test files...

Trying to delete test files...
DELETED: 129  BEFORE: 128  AFTER: 0

/ # ./test.sh 128
Creating '128' test files...

Trying to delete test files...
DELETED: 129  BEFORE: 128  AFTER: 0

/ # ./test.sh 128
Creating '128' test files...

Trying to delete test files...
DELETED: 129  BEFORE: 128  AFTER: 0

I was wrong; it could repro. I forgot to cd /var/www/html/data/ in the previous experiment:

# k exec -it test-shares-bdf6d7956-skq8r sh
/ # cd /var/www/html/data/
/var/www/html/data # vi test.sh
/var/www/html/data # chmod 0755 test.sh
/var/www/html/data # ./test.sh 128
Creating '128' test files...

Trying to delete test files...
DELETED: 66  BEFORE: 128  AFTER: 62
DELETED: 63  BEFORE: 62  AFTER: 0

/var/www/html/data # ./test.sh 128
Creating '128' test files...

Trying to delete test files...
DELETED: 66  BEFORE: 128  AFTER: 62
DELETED: 63  BEFORE: 62  AFTER: 0

@andyzhangx
Contributor

andyzhangx commented May 8, 2020

Here is how to repro this issue in your local environment; it's directly related to an SMB client issue (no AKS cluster needed):

mkdir /tmp/test
sudo mount -t cifs //accountname.file.core.windows.net/test /tmp/test -o vers=3.0,username=accountname,password=…,dir_mode=0777,file_mode=0777,cache=strict,actimeo=30

wget -O /tmp/test/test.sh https://raw.githubusercontent.com/andyzhangx/demo/master/debug/test.sh
docker run -it -v /tmp/test:/var/www/html/data/ --name alpine alpine:3.10 sh

# cd /var/www/html/data/
/var/www/html/data # ./test.sh 128
Creating '128' test files...

Trying to delete test files...
DELETED: 66  BEFORE: 128  AFTER: 62
DELETED: 63  BEFORE: 62  AFTER: 0

We are already looping in SMB experts to take a look at this issue.

Also, same result on an AKS node running Ubuntu 18.04 (kernel 5.0.0-1036-azure) with the alpine:3.10 image:

./test.sh 128
Creating '128' test files...

Trying to delete test files...
DELETED: 66  BEFORE: 128  AFTER: 62
DELETED: 63  BEFORE: 62  AFTER: 0

@andyzhangx changed the title from "Alpine Linux Cannot Navigate Azure Files Mounts Reliably" to "Alpine Linux Cannot Navigate Azure Files Mounts Reliably (SMB client issue in Alpine image)" on May 8, 2020
@andyzhangx
Contributor

andyzhangx commented May 8, 2020

While using the ubuntu:16.04 image, it works as expected:

# docker run -it -v /tmp/test:/var/www/html/data/ --name ubuntu ubuntu:16.04 sh
# cd /var/www/html/data/
# ./test.sh 128
Creating '128' test files...

Trying to delete test files...
DELETED: 129  BEFORE: 128  AFTER: 0

@github-actions

Action required from @Azure/aks-pm

@ghost

ghost commented Jul 26, 2020

Action required from @Azure/aks-pm

@ghost added the Needs Attention 👋 label on Jul 26, 2020
@palma21
Member

palma21 commented Jul 27, 2020

@VybavaRamadoss @RenaShahMSFT could you help?

@smfrench

smfrench commented Jul 27, 2020

When we debugged this back in May, didn't it show the bug was in the Alpine version of ls, not in the network fs client(s)? The SMB3 client (and presumably the NFS client as well) was returning the expected files, and delete worked fine, but the Alpine library (unlike the glibc library called by ls elsewhere) had a bug. It seemed to be related to the Alpine library not restarting the search properly after the directory contents changed, i.e. after removing some of the files in the middle of a directory search.

@GuyPaddock
Author

@smfrench No, the issue is that there is a kernel bug that GNU libc avoids by doing greedy/large reads. Alpine does smaller reads of directory listings to limit memory consumption. So it is more accurate to say that Alpine does not work around the kernel bug while GNU libc does. But it is hard to say whether the GNU developers were aware that they were working around the bug or whether it was just coincidental.

@palma21 removed the Needs Attention 👋 and action-required labels on Aug 6, 2020
@palma21
Member

palma21 commented Aug 6, 2020

Could you confirm whether this issue should still be open, then, given that it's specific to Alpine?

@ghost added the stale label on Oct 5, 2020
@ghost

ghost commented Oct 5, 2020

This issue has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs within 15 days of this comment.

@ghost closed this as completed on Oct 21, 2020
@ghost

ghost commented Oct 21, 2020

This issue will now be closed because it hasn't had any activity for 15 days after being marked stale. GuyPaddock, feel free to comment again within the next 7 days to reopen, or open a new issue after that time if you still have a question/issue or suggestion.

@sprasad-microsoft

sprasad-microsoft commented Oct 21, 2020

I did some digging into this issue and also discussed this in the linux-cifs mailing list.

The documentation on readdir (https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir_r.html) reads:

If a file is removed from or added to the directory after the most recent call to opendir() or rewinddir(), whether a subsequent call to readdir() returns an entry for that file is unspecified.

So the different filesystems are left free to choose their own behaviour when this happens.
cifs.ko (the Linux SMB client) makes sure that it's not returning stale data, at the cost of missing some entries for this particular use case. It just so happens that the way ext4 handles this is to reposition the dir offset back to 0. However, with that, ext4 could end up emitting duplicate entries during successive readdirs, in case of changes to the directory. Either way, there can be issues.

This issue is seen quite often in Alpine, because it uses musl libc, which seems to send much smaller buffers down to VFS to read the dirents into.

However, the main issue here is the implementation of rm used here (I don't know if this is the default GNU version of coreutils). It depends on the undefined behaviour of Linux VFS, where it should not. When doing recursive readdirs (where it knows that the directory has changed), it should rewind back to position 0 and start the next readdir again. This way, the problem can be fixed; and to me, that sounds like the right way to fix this problem.
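
Not something proposed in the thread itself, but a pragmatic shell-level workaround consistent with the outputs above (where a second pass removes the entries the first pass missed) is simply to retry the recursive delete until the directory is gone; each new rm re-opens the directory and so starts reading it from position 0 again. The path below matches the repro used earlier in this thread and is otherwise illustrative:

# Hypothetical workaround sketch: retry the recursive delete a bounded number of times.
DIR=/var/www/html/data/test
tries=0
while [ -d "$DIR" ] && [ "$tries" -lt 10 ]; do
  rm -rf "$DIR" || true
  tries=$((tries + 1))
done
[ -d "$DIR" ] && echo "WARNING: $DIR still not empty after $tries attempts" >&2

This only papers over the libc/VFS interaction from the outside; the proper fix suggested above is for rm (or its use of readdir) to rewind and rescan once it knows the directory contents have changed.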

@ghost removed the stale label on Oct 21, 2020
@ghost locked the issue as resolved and limited conversation to collaborators on Nov 20, 2020
This issue was closed.