xe sr-probe does not find NFSv4/NFSv4.1 #135

Open
borzel opened this issue Feb 8, 2019 · 30 comments

Comments

@borzel
Member

borzel commented Feb 8, 2019

Background:

  • XCP-ng 7.6 (with all extras/testing packages up to date)
  • FreeNAS-11.2-RELEASE-U1 as NFS-Server (NFSv4 enabled)

Steps:

  1. enable NFSv4 on FreeNAS and reboot it to fully enable it (!)
    (screenshot)
  2. Create new SR with XCP-ng Center (https://github.com/xcp-ng/xenadmin/releases/tag/v7.6.3.21) with NFSv4.1 (or XenOrchestra -> this step just proves NFSv4.1 support)
    (screenshot)
  3. Detach SR
  4. run xe sr-probe type=nfs device-config:server=<some-ip> device-config:serverpath=/mnt/testpool/nfsv41test device-config:probeversion

Output of sr-probe

<?xml version="1.0" ?>
<SRlist>
        <SR>
                <UUID>2131f8f8-49be-0868-2b97-d624c9851584</UUID>
        </SR>
        <SR>
                <UUID>cce30ede-0fd2-b34a-04e5-291b56b9f8fc</UUID>
        </SR>
        <SupportedVersions>
                <Version>3</Version>
        </SupportedVersions>
</SRlist>

Expected result
Shows SupportedVersions 3, 4 and 4.1
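
To compare the probe result with the expected one programmatically, the XML can be parsed in a few lines. This is a minimal sketch in Python using only the standard library; the function names supported_versions and missing_versions are made up for illustration and are not part of XCP-ng or sm. The sample input is the sr-probe output shown above.

```python
import xml.etree.ElementTree as ET

# sr-probe output copied from the report above.
PROBE_OUTPUT = """<?xml version="1.0" ?>
<SRlist>
        <SR>
                <UUID>2131f8f8-49be-0868-2b97-d624c9851584</UUID>
        </SR>
        <SR>
                <UUID>cce30ede-0fd2-b34a-04e5-291b56b9f8fc</UUID>
        </SR>
        <SupportedVersions>
                <Version>3</Version>
        </SupportedVersions>
</SRlist>
"""


def supported_versions(probe_xml):
    """Return the NFS versions listed under <SupportedVersions>."""
    root = ET.fromstring(probe_xml.strip())
    return [v.text.strip() for v in root.findall("./SupportedVersions/Version")]


def missing_versions(probe_xml, expected=("3", "4", "4.1")):
    """Return the expected versions that the probe did not report."""
    found = set(supported_versions(probe_xml))
    return [v for v in expected if v not in found]


print(supported_versions(PROBE_OUTPUT))
print(missing_versions(PROBE_OUTPUT))
```

For the output in this report, the second call lists the two versions the server supports but the probe does not show.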

@borzel borzel changed the title xe sr-probe doesn't find NFSv4/NFSv4.1 protocol xe sr-probe does not find NFSv4/NFSv4.1 protocol Feb 8, 2019
@borzel borzel changed the title xe sr-probe does not find NFSv4/NFSv4.1 protocol xe sr-probe does not find NFSv4/NFSv4.1 Feb 8, 2019
@borzel borzel added the bug 🐛 label Feb 8, 2019
@thctlo

thctlo commented Feb 15, 2019

Hm, this is what I am getting.
Host: Xen XCP 7.6.3; server: Debian 9 with NFSv4 enabled.
The exports file contains:
/srv/nfs4 192.168.xxx.0/24(rw,sync,fsid=0,crossmnt,no_subtree_check,sec=sys:krb5:krb5i:krb5p)
/srv/nfs4/xcphosts 192.168.xxx.0/24(rw,sync,no_subtree_check,sec=sys:krb5:krb5i:krb5p)

This current setup works for v3/v4/v4.1 with and without kerberos mounts.

xe sr-probe type=nfs device-config:server=IP_HERE device-config:serverpath=/xen/nfs-stor device-config:probeversion
Error code: SR_BACKEND_FAILURE_73
Error parameters: , NFS mount error [opterr=mount failed with return code 32],

@thctlo

thctlo commented Feb 15, 2019

running : xe sr-probe type=nfs device-config:server=IP_HERE device-config:probeversion

Error code: SR_BACKEND_FAILURE_101
Error parameters: , The request is missing the serverpath parameter, <?xml version="1.0" ?>
<nfs-exports>
<Export>
<Target>192.168.xxx.xxx</Target>
<Path>/srv/nfs4/xcphosts</Path>
<Accesslist>192.168.xxx.0/24</Accesslist>
</Export>
<Export>
<Target>192.168.xxx.xxx</Target>
<Path>/srv/nfs4</Path>
<Accesslist>192.168.xxx.0/24</Accesslist>
</Export>
</nfs-exports>

@stormi
Member

stormi commented Feb 19, 2019

@borzel Is that a problem that occurs for any NFS share that supports NFS 4 or above, or only with specific servers?

@NormHenderson

NormHenderson commented Aug 2, 2020

Not sure if it is related to the same root cause however: when xe sr-create specifies only type=nfs, it defaults to NFSv3 and will not negotiate NFSv4/NFSv4.1 share without adding device-config:nfsversion=4.1 (which isn't documented as far as I can tell). IMHO it should be starting with NFSv4.1 and negotiating downwards. (XCP-ng 8.1 connecting to nfs-kernel-server on Ubuntu 20.04)
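
As an illustration of the workaround described here, the following sketch builds an xe sr-create command line with the NFS version pinned explicitly. The helper name sr_create_argv and the sample values are hypothetical; device-config:nfsversion is the key mentioned in the comment above, and the remaining keys are the usual xe sr-create parameters for an NFS SR. The command is only printed, not executed.

```python
import shlex


def sr_create_argv(server, serverpath, name_label, nfsversion=None):
    """Build the argument list for creating an NFS SR with xe.

    When nfsversion is None the key is omitted, which (per the comment
    above) leaves the storage manager on its NFSv3 default.
    """
    argv = [
        "xe", "sr-create",
        "type=nfs",
        "content-type=user",
        "shared=true",
        f"name-label={name_label}",
        f"device-config:server={server}",
        f"device-config:serverpath={serverpath}",
    ]
    if nfsversion is not None:
        argv.append(f"device-config:nfsversion={nfsversion}")
    return argv


# Sample values are placeholders, not taken from the thread.
argv = sr_create_argv("192.0.2.10", "/export/xcp", "nfs41-sr", nfsversion="4.1")
print(shlex.join(argv))
```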

@ondraknezour

I had a similar problem with a FreeBSD NFS server configured with a minimum NFS version of 4 (vfs.nfsd.server_min_nfsvers=4 in the /etc/sysctl.conf file) [1].

I was told [2] that NFSv4 doesn't need rpcbind, so if support for older protocol versions isn't needed, nfsd does not register with rpcbind. This makes the function check_server_service in /opt/xensource/sm/nfs.py unreliable, because it checks for a condition (the nfs service appearing in rpcinfo -s output) which is not always present.

[1] https://lists.freebsd.org/pipermail/freebsd-net/2021-January/057371.html
[2] https://lists.freebsd.org/pipermail/freebsd-net/2021-January/057372.html
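
The weakness of an rpcbind-based check can be shown with a small sketch. This is not the code from /opt/xensource/sm/nfs.py; it is a simplified stand-in that scans rpcinfo -s style text for an nfs service entry. The two sample listings are illustrative, not captured from a real server, and the function names are made up.

```python
def services_from_rpcinfo(rpcinfo_s_output):
    """Collect service names from `rpcinfo -s` style output.

    Data rows start with a numeric program number; the service name is
    assumed to be the fourth whitespace-separated column.
    """
    services = set()
    for line in rpcinfo_s_output.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0].isdigit():
            services.add(fields[3])
    return services


def nfs_registered(rpcinfo_s_output):
    """True if an nfs service is registered with rpcbind."""
    return "nfs" in services_from_rpcinfo(rpcinfo_s_output)


# Illustrative listing for a server that offers v3 and v4.
V3_AND_V4 = """\
   program version(s) netid(s)                 service     owner
    100000  2,3,4     local,udp,tcp            portmapper  superuser
    100005  3,2,1     tcp,udp                  mountd      superuser
    100003  4,3       tcp                      nfs         superuser
"""

# Illustrative listing for a v4-only server that never registered nfsd.
V4_ONLY = """\
   program version(s) netid(s)                 service     owner
    100000  2,3,4     local,udp,tcp            portmapper  superuser
"""

print(nfs_registered(V3_AND_V4))
print(nfs_registered(V4_ONLY))
```

The second server may still accept NFSv4 mounts on port 2049, so a negative result from this kind of check does not prove that NFS is unavailable.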

@jcharaoui

I've hit this bug with an NFSv4+ only server (i.e. NFSv3 is disabled). XCP-ng is unable to add the SR because it depends on the presence of NFSv3 services.

@olivierlambert
Member

Then I think it should be reported upstream ASAP :)

https://github.com/xapi-project/sm/issues

@TristisOris

This is still an issue with Huawei storage.

@olivierlambert
Member

IIRC, this was fixed in a recent upstream SMAPI patch (but is likely not yet available in XCP-ng). @stormi can you take a look when you are around? Thanks!

@stormi
Member

stormi commented Nov 2, 2021

I don't remember commits that would address this, and the issue on the upstream repository got no answers from the devs.

Recent commits that are about NFS in sm are: xapi-project/sm@e121864 and xapi-project/sm@6fbff68 but I don't think they are related to this issue here.

benjamreis added a commit to xcp-ng/sm that referenced this issue May 2, 2023
Do not call `rpcinfo` nor `showmount` when
`device-config:nfsversion` is a NFSv4 version.

This modification requires the user to already
know `nfsversion>=4` and the `serverpath`.

See: xapi-project#551
And: xcp-ng/xcp#135

Signed-off-by: BenjiReis <benjamin.reis@vates.fr>
benjamreis added a commit to xcp-ng/sm that referenced this issue May 2, 2023
NFSv4 only environments don't support `rpcinfo`
and `showmount` calls as they're missing NFSv3
services.

Do not call `rpcinfo` nor `showmount` when
`device-config:nfsversion` is a NFSv4 version.

This modification requires the user to already
know `nfsversion>=4` and the `serverpath`.

See: xapi-project#551
And: xcp-ng/xcp#135

Signed-off-by: BenjiReis <benjamin.reis@vates.fr>
benjamreis added a commit to xcp-ng/sm that referenced this issue May 4, 2023
NFSv4 only environments don't support `rpcinfo`
and `showmount` calls as they're missing NFSv3
services.

Do not call `rpcinfo` nor `showmount` when
`device-config:nfsversion` is a NFSv4 version.

This modification requires the user to already
know `nfsversion>=4` and the `serverpath`.

See: xapi-project#551
And: xcp-ng/xcp#135

Signed-off-by: BenjiReis <benjamin.reis@vates.fr>
benjamreis added a commit to xcp-ng/sm that referenced this issue May 4, 2023
NFSv4 only environments don't support `rpcinfo`
and `showmount` calls as they're missing NFSv3
services.

When `rpcinfo` or `showmount` fails, try to mount
NFSv4 pseudo FS '/' to add '4' to supported NFS versions
and run `ls` on the mounted pseudo FS to offer the first
folder level of exports.

See: xapi-project#551
And: xcp-ng/xcp#135

Signed-off-by: BenjiReis <benjamin.reis@vates.fr>
benjamreis added a commit to xcp-ng/sm that referenced this issue May 5, 2023
NFSv4 only environments don't support `rpcinfo`
and `showmount` calls as they're missing NFSv3
services.

When `rpcinfo` or `showmount` fails, try to mount
NFSv4 pseudo FS '/' to add '4' to supported NFS versions
and run `ls` on the mounted pseudo FS to offer the first
folder level of exports.

See: xapi-project#551
And: xcp-ng/xcp#135

Signed-off-by: BenjiReis <benjamin.reis@vates.fr>
benjamreis added a commit to xcp-ng/sm that referenced this issue Jul 19, 2023
NFSv4 only environments don't support `rpcinfo`
and `showmount` calls as they're missing NFSv3
services.

When `rpcinfo` or `showmount` fails, try to mount
NFSv4 pseudo FS '/' to add '4' to supported NFS versions
and run `ls` on the mounted pseudo FS to offer the first
folder level of exports.

See: xapi-project#551
And: xcp-ng/xcp#135

Signed-off-by: BenjiReis <benjamin.reis@vates.fr>
MarkSymsCtx pushed a commit to xapi-project/sm that referenced this issue Sep 25, 2023
NFSv4 only environments don't support `rpcinfo`
and `showmount` calls as they're missing NFSv3
services.

When `rpcinfo` or `showmount` fails, try to mount
NFSv4 pseudo FS '/' to add '4' to supported NFS versions
and run `ls` on the mounted pseudo FS to offer the first
folder level of exports.

See: #551
And: xcp-ng/xcp#135

Signed-off-by: BenjiReis <benjamin.reis@vates.fr>
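
The fallback described in the later commit messages can be sketched as follows. This is not the code from the commits; it only mirrors the described control flow (try the rpcinfo/showmount path first, then fall back to the NFSv4 pseudo filesystem). The probing steps are passed in as callables so the sketch runs without an NFS server, and every name in it is hypothetical.

```python
class ProbeError(Exception):
    """Raised by the hypothetical probe helpers when a step fails."""


def probe_with_v4_fallback(server, rpc_probe, list_v4_pseudo_root):
    """Return (supported_versions, export_paths) for an NFS server.

    rpc_probe(server) stands in for the rpcinfo/showmount based probe and
    returns (versions, exports) or raises ProbeError.
    list_v4_pseudo_root(server) stands in for mounting the NFSv4 pseudo
    filesystem '/' and listing its first directory level.
    """
    try:
        return rpc_probe(server)
    except ProbeError:
        top_level = list_v4_pseudo_root(server)
        return ["4"], ["/" + name for name in sorted(top_level)]


# Stand-ins for a v4-only server: the RPC probe fails, the mount works.
def failing_rpc_probe(server):
    raise ProbeError("rpcinfo/showmount unavailable on " + server)


def fake_pseudo_root(server):
    return ["xcphosts", "backup"]


versions, exports = probe_with_v4_fallback(
    "192.0.2.10", failing_rpc_probe, fake_pseudo_root
)
print(versions, exports)
```

If the pseudo filesystem mount fails as well, the exception from list_v4_pseudo_root propagates to the caller, which matches the idea that the probe should still report an error when neither path works.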
@emanzx

emanzx commented Jan 8, 2024

Any update on this? I have checked that upstream has already pushed a fix for this issue, but when I update my XCP-ng installation the files are still not updated. So I went my own way and replaced the driver files NFSSR.py and nfs.py with the updated versions, but it is still not working and XCP-ng Center just gives me this error:
(screenshot)
I really need NFSv4 to work, as my PETASAN instance only supports NFSv4 and above for its NFS exports. Thanks.

@benjamreis
Collaborator

Hi!

The fix wasn't available in XCP-ng 8.2.1, it'll be released soon but if you want to test it in advance: yum update sm sm-rawhba --enablerepo=xcp-ng-ci

Bear in mind it is a test build so not safe to run in production.
Regards :)

@emanzx

emanzx commented Jan 8, 2024

> Hi!
>
> The fix wasn't available in XCP-ng 8.2.1, it'll be released soon but if you want to test it in advance: yum update sm sm-rawhba --enablerepo=xcp-ng-ci
>
> Bear in mind it is a test build so not safe to run in production. Regards :)

Thanks for the update. I will try it on my test server. May I know when the next update containing the fix will be released?

@benjamreis
Collaborator

It's currently in the CI phase of our pipeline, so if everything goes smoothly I'd say a couple weeks.
More if we find issues.

@viniciusferrao

Any news on this one? I was actually surprised to see that NFSv4 only servers are an issue because XCP-ng manual states that NFS is preferred instead of iSCSI.

So I started the planning to move away from iSCSI and stumbled upon this issue.

@NormHenderson

@viniciusferrao I had a very bad experience with XCP-ng storage on iSCSI. For the last 3 years I have been using nfs 4.1 without any real difficulties (some performance concerns when VMs boot and until they stabilize, but I was never able to narrow down the cause). I am on XCP-ng v.8.2 which has the option to select nfs v3 v4 or v4.1 when creating a new nfs SR. I also use option "hard" which has pros and cons, there are other threads here on that subject.

@viniciusferrao

> @viniciusferrao I had a very bad experience with XCP-ng storage on iSCSI. For the last 3 years I have been using nfs 4.1 without any real difficulties (some performance concerns when VMs boot and until they stabilize, but I was never able to narrow down the cause). I am on XCP-ng v.8.2 which has the option to select nfs v3 v4 or v4.1 when creating a new nfs SR. I also use option "hard" which has pros and cons, there are other threads here on that subject.

But is there any workaround today? Because I tried to mount the volume and was affected by the issue on this ticket.

My XCP-ng dates back to 2013 when I originally installed XenServer 6.2. I've been updating it since then. The same for the storage system that's FreeNAS (at the time) and now TrueNAS. The disk pool was created in early 2014. Since the beginning, this pool is iSCSI and I had very expensive workloads on it, like Exchange 2010 and later 2013 with 700 user accounts, more than a TB of iSCSI mailboxes on top of XenServer virtual disks.

And now I was moving to NFS, due to the cited recommendation and I'm unable to.

How do I mount the NFS share? What's the workaround? TrueNAS does not enable NFSv3 and v4 at the same time.

Thanks.

@NormHenderson

For me, it was just in Xen Orchestra:
select the pool
SR - create a new SR
Select storage type: NFS
Settings: Server (your NFS path), NFS version: 4.1, NFS options: hard (in my case; study the implications)
Similar process in XCP-ng Center.

However, your question makes me wonder if you are even talking about an XCP-ng storage repository, or possibly about connecting to an NFS server from a VM? I do that too; from Linux at least it's a standard mount -t nfs4, no magic.

@viniciusferrao

viniciusferrao commented Feb 25, 2024

> For me, it was just in Xen Orchestra: select the pool; SR - create a new SR; Select storage type: NFS; Settings: Server (your NFS path), NFS version: 4.1, NFS options: hard (in my case; study the implications). Similar process in XCP-ng Center.
>
> However, your question makes me wonder if you are even talking about an XCP-ng storage repository, or possibly about connecting to an NFS server from a VM? I do that too; from Linux at least it's a standard mount -t nfs4, no magic.

Yeah, this does not work. I'm affected by the bug on this thread. I thought you had a workaround for it. Your NFS server probably supports NFSv3 and v4 at the same time, which isn't my case.

@stormi
Member

stormi commented Feb 26, 2024

Yes, the issue is when the NFS server doesn't advertise what it supports through rpcbind, which is a v3-only thing (rpcbind can also report v4.x protocol versions, although it is not obliged to, which explains why some users can select v4.x protocols when their server also supports v3).

We do have a fix for this. It is built and is currently in a pre-release repository before it can be released to all users.

On XCP-ng 8.2, you can try it with:

yum update sm sm-rawhba --enablerepo=xcp-ng-ci,xcp-ng-testing,xcp-ng-candidates

Internal CI tests already ran successfully.

On XCP-ng 8.3, it should be already supported.

@viniciusferrao

> Yes, the issue is when the NFS server doesn't advertise what it supports through rpcbind, which is a v3-only thing (rpcbind can also report v4.x protocol versions, although it is not obliged to, which explains why some users can select v4.x protocols when their server also supports v3).
>
> We do have a fix for this. It is built and is currently in a pre-release repository before it can be released to all users.
>
> On XCP-ng 8.2, you can try it with:
>
> yum update sm sm-rawhba --enablerepo=xcp-ng-ci,xcp-ng-testing,xcp-ng-candidates
>
> Internal CI tests already ran successfully.
>
> On XCP-ng 8.3, it should be already supported.

Thank you @stormi. But may I ask if there's any timeline for it to land on stable channels? On 8.2.1 or 8.3?

@stormi
Member

stormi commented Feb 26, 2024

On 8.2, it will go with the next train of updates, which is not scheduled yet. A few weeks maybe. It's already in XCP-ng 8.3, but 8.3 itself is still a (rather stable) beta.

@prilly-dev

This bug still exists in xcp-ng 8.3

@stormi
Member

stormi commented Nov 19, 2024

Please elaborate, as it's actually fixed from our point of view. It's likely you have a different albeit similar issue.

@prilly-dev

What can I say: attaching an NFS share with v4 or v4.1 only works when the share also has v3 enabled. What you wrote earlier sums this issue up perfectly:

> Yes, the issue is when the NFS server doesn't advertise what it supports through rpcbind, which is a v3-only thing (rpcbind can also report v4.x protocol versions, although it is not obliged to, which explains why some users can select v4.x protocols when their server also supports v3).

Also worth noting: this issue also occurs when attaching storage in XO, with or without Kerberos.

@stormi
Member

stormi commented Nov 19, 2024

We have automated tests which precisely test a server which only has v4+ and no v3, so it's likely there's something else in the picture. @benjamreis how to debug this?

@benjamreis
Collaborator

benjamreis commented Nov 19, 2024

Probably sharing the error gotten while trying to probe or create the SR would be a good start -- even better the corresponding logs in xensource.log and SMlog 👍

@prilly-dev

Give me some time, I will post the logs later today.

@prilly-dev

prilly-dev commented Nov 20, 2024

This is the log from XO storage with a share that has only v4 and v4.1 enabled:

remote.test
{
"id": "a74654a5-509b-4d6b-8a42-06e5713ed882"
}
{
"shortMessage": "Command failed with exit code 32: mount -o port=2049 -t nfs 172.16.10.10:/nfs/backup /run/xo-server/mounts/a74654a5-509b-4d6b-8a42-06e5713ed882",
"command": "mount -o port=2049 -t nfs 172.16.10.10:/nfs/backup /run/xo-server/mounts/a74654a5-509b-4d6b-8a42-06e5713ed882",
"escapedCommand": "mount -o "port=2049" -t nfs "172.16.10.10:/nfs/backup" "/run/xo-server/mounts/a74654a5-509b-4d6b-8a42-06e5713ed882"",
"exitCode": 32,
"stdout": "",
"stderr": "mount.nfs: Protocol not supported",
"failed": true,
"timedOut": false,
"isCanceled": false,
"killed": false,
"message": "Command failed with exit code 32: mount -o port=2049 -t nfs 172.16.10.10:/nfs/backup /run/xo-server/mounts/a74654a5-509b-4d6b-8a42-06e5713ed882
mount.nfs: Protocol not supported",
"name": "Error",
"stack": "Error: Command failed with exit code 32: mount -o port=2049 -t nfs 172.16.10.10:/nfs/backup /run/xo-server/mounts/a74654a5-509b-4d6b-8a42-06e5713ed882
mount.nfs: Protocol not supported
at makeError (/etc/xen-orchestra/node_modules/execa/lib/error.js:60:11)
at handlePromise (/etc/xen-orchestra/node_modules/execa/index.js:118:26)
at NfsHandler._sync (/etc/xen-orchestra/@xen-orchestra/fs/src/_mount.js:68:7)"
}

This is the log from an NFS SR that was attached with v3, v4 and v4.1 enabled; I then disabled v3 and rescanned the SR:

sr.scan
{
"id": "7a89bd71-8635-173f-54de-19684d061d4f"
}
{
"code": "SR_BACKEND_FAILURE_47",
"params": [
"",
"The SR is not available [opterr=no such directory /var/run/sr-mount/7a89bd71-8635-173f-54de-19684d061d4f]",
""
],
"task": {
"uuid": "28f35853-1149-45ed-ca17-ad7ae65a8082",
"name_label": "Async.SR.scan",
"name_description": "",
"allowed_operations": [],
"current_operations": {},
"created": "20241120T18:21:50Z",
"finished": "20241120T18:21:50Z",
"status": "failure",
"resident_on": "OpaqueRef:a2ff60a3-d6ce-465b-874c-be3d797ba33a",
"progress": 1,
"type": "",
"result": "",
"error_info": [
"SR_BACKEND_FAILURE_47",
"",
"The SR is not available [opterr=no such directory /var/run/sr-mount/7a89bd71-8635-173f-54de-19684d061d4f]",
""
],
"other_config": {},
"subtask_of": "OpaqueRef:NULL",
"subtasks": [],
"backtrace": "(((process xapi)(filename lib/backtrace.ml)(line 210))((process xapi)(filename ocaml/xapi/storage_access.ml)(line 36))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 39))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 143))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 39))((process xapi)(filename ocaml/xapi/rbac.ml)(line 191))((process xapi)(filename ocaml/xapi/rbac.ml)(line 200))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 75)))"
},
"message": "SR_BACKEND_FAILURE_47(, The SR is not available [opterr=no such directory /var/run/sr-mount/7a89bd71-8635-173f-54de-19684d061d4f], )",
"name": "XapiError",
"stack": "XapiError: SR_BACKEND_FAILURE_47(, The SR is not available [opterr=no such directory /var/run/sr-mount/7a89bd71-8635-173f-54de-19684d061d4f], )
at Function.wrap (file:///etc/xen-orchestra/packages/xen-api/_XapiError.mjs:16:12)
at default (file:///etc/xen-orchestra/packages/xen-api/_getTaskResult.mjs:13:29)
at Xapi._addRecordToCache (file:///etc/xen-orchestra/packages/xen-api/index.mjs:1047:24)
at file:///etc/xen-orchestra/packages/xen-api/index.mjs:1081:14
at Array.forEach (<anonymous>)
at Xapi._processEvents (file:///etc/xen-orchestra/packages/xen-api/index.mjs:1071:12)
at Xapi._watchEvents (file:///etc/xen-orchestra/packages/xen-api/index.mjs:1244:14)"
}

This error might be caused by an issue in QNAP QTS version 5.2.1. This is unconfirmed, but some googling indicates QNAP is crap as usual. I will test this with a Dell PowerStore to see whether it is storage related, which I now really suspect after testing.

@benjamreis
Collaborator

Hi,

Thanks for the logs. Unfortunately XO doesn't provide all the necessary info about the error, as it is only a client of the XAPI.
What I asked for was the output of the xe sr-probe and sr-create calls and the corresponding entries in /var/log/xensource.log and /var/log/SMlog.

The error does seem to indicate the mount is attempted with NFSv3 for some reason... While you gather the logs I asked for, I'll take a look at the code again, but as mentioned by @stormi, our CI does have NFSv4+ only tests that run successfully.
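
One way to rule out version negotiation while debugging is to pin the NFS version in the mount options. The sketch below only builds the argument list for such a mount; it does not reproduce how XO or sm construct their commands. The helper name nfs_mount_argv and the mount point are hypothetical, the server and export are taken from the log above, and vers= is the version option documented in nfs(5).

```python
def nfs_mount_argv(server, export, mountpoint, nfsversion=None, port=2049):
    """Build a mount command for an NFS export.

    Without nfsversion the result has the same shape as the command in
    the XO log above; with it, a vers= option pins the protocol version.
    """
    options = [f"port={port}"]
    if nfsversion is not None:
        options.append(f"vers={nfsversion}")
    return [
        "mount", "-t", "nfs",
        "-o", ",".join(options),
        f"{server}:{export}",
        mountpoint,
    ]


# /mnt/nfs-test is a placeholder mount point for a manual test.
unpinned = nfs_mount_argv("172.16.10.10", "/nfs/backup", "/mnt/nfs-test")
pinned = nfs_mount_argv("172.16.10.10", "/nfs/backup", "/mnt/nfs-test", nfsversion="4.1")
print(" ".join(unpinned))
print(" ".join(pinned))
```

If the pinned command succeeds by hand on the host while the unpinned one fails, that points at version selection rather than at the export itself.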
