reqh: md device service type is not M0_CST_IOS #1388
According to @mssawant this patch should fix the issue I'm seeing now with the latest hare+motr:
Meanwhile, it works fine on one of the previous hare commits suggested by @mssawant - 83dcfe2.
@mssawant, can you elaborate a bit - why m0_reqh_mdpool_service_index_to_session()
@andriytk, earlier, even though the Hare CDF had a provision to specify a metadata device, that device was not part of the pool devices or the configuration. The process's metadata device was simply set to the given metadata device.
Before Seagate/cortx-hare#1888, a dummy (/dev/null) device was used for the CAS service; now we create a separate ConfDrive object specified by the
Now, although it is associated with the process, we explicitly create a separate ConfDrive object for the metadata device given in the CDF. It is associated with the CAS service type and is also part of the configuration.
So, as you can see, earlier a ConfDrive object was not created for the metadata device specified in the CDF, so it was not part of the configuration, and the CAS device was set to /dev/null. Frankly, I am not sure how things worked; it was using the first data device attached to the ioservice.
CAS device mapping from old hare cfgen code in confd.xc,
CAS device mapping in latest Hare main by cfgen in confd.xc
for the same CDF,
@mssawant: There are two types of metadata. Motr internal metadata is stored in cobs called mdcobs in the ioservice, which are kept in the BE seg1 associated with that ioservice. When the ioservice and the CAS service are part of the same m0d, they share the BE seg1. For motr clients other than S3, we need both of them.
@@ -213,7 +213,7 @@ m0_reqh_mdpool_service_index_to_session(const struct m0_reqh *reqh,
 pd_sdev_idx;
 ctx = md_pv->pv_pc->pc_dev2svc[idx].pds_ctx;
 M0_ASSERT(ctx != NULL);
-M0_ASSERT(ctx->sc_type == M0_CST_IOS);
+M0_ASSERT(ctx->sc_type == M0_CST_CAS);
This should be IOS only. The MDPOOL associated with IOS needs to have dummy devices, or the first device in each m0d, in order to use the mdcobs associated with those m0d's.
They are not related to the CAS service.
Suppose the CAS service is part of a separate m0d where m0d is not present; this won't work.
Also, even with this change in place, the issue still comes up:
```
# ./build-deploy/cortx-motr/utils/m0cp -l inet:tcp:10.230.250.75@5001 -H inet:tcp:10.230.250.75@2001 -p 0x7000000000000001:0x4f -P 0x7200000000000001:0x29 -o 12:39 -s 1m -c 1 -L 9 /tmp/128M
motr[126663]: 4cf0 ALWAYS [client_init.c:468:client_net_init] trasnport ep:inet:tcp:10.230.250.75@5001
motr[126663]: 4650 FATAL [lib/assert.c:50:m0_panic] panic: (cas_svc->sc_type == M0_CST_CAS) at dix_cas_rops_send() (dix/req.c:1749) [git: 2.0.0-527-112-g409f711-dirty] /root/m0trace.126663
Motr panic
```
@madhavemuri, so we have two devices then:
- CAS device: is that the meta_data device mentioned in the CDF?
- IOS md device: there's no other mechanism to specify an IOS metadata device, and one device cannot be associated with multiple service types.

> MDPOOL associated with IOS needs to have dummy devices or first device in each m0d to use mdcobs associated with the m0d's.

- dummy device: how will it be used?
- first device of ioservice: that means it is the same as the data device and will share space with data. Is that what we want? Will the first-device assumption always hold true? Or do we want an explicit mechanism to specify the CAS device and the IOS md device separately?
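
The constraint mentioned above, that one device cannot be associated with multiple service types, can be sketched as a small configuration check. This is only an illustration: `struct drive`, `first_shared_drive()`, the enum values, and the device paths in the usage note are hypothetical and are not taken from Motr or Hare.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset of service types; values are NOT Motr's real ones. */
enum svc_type { CST_IOS = 1, CST_CAS = 2 };

/* Hypothetical flattened view of a drive entry: device path plus owning service type. */
struct drive {
	const char    *path;
	enum svc_type  type;
};

/*
 * Return the index of the first drive whose path already appears earlier
 * in the array under a different service type, or -1 if there is none.
 */
static int first_shared_drive(const struct drive *d, size_t nr)
{
	assert(d != NULL || nr == 0);
	for (size_t i = 1; i < nr; i++)
		for (size_t j = 0; j < i; j++)
			if (strcmp(d[i].path, d[j].path) == 0 &&
			    d[i].type != d[j].type)
				return (int)i;
	return -1;
}
```

For example, listing a made-up path such as `/dev/sdb` once under `CST_IOS` and again under `CST_CAS` would make the check report the second entry, which is the situation the "first device of ioservice" option would have to avoid.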
> Also even if this change is there, issue is still coming up,

@madhavemuri, this means a wrong CAS device is being accessed. It seems the problem lies in clearly differentiating and assigning metadata devices: which device must be used for CAS and which for IOS metadata.
@mssawant @supriyachavan4398: After the above changes in cfgen, m0cp is working fine.
@madhavemuri, the patch you mentioned is good if metadata device is not specified in CDF then cfgen is presently not creating a default ConfDrive object for CAS (i.e.
This is because the default metadata device is now created for the M0_CST_CAS service type and not for the M0_CST_IOS service type. So we need #1388 as well.
I understand that there are two types of metadata, but there are not two types of metadata devices. And if a CAS device is required and by default we set it to /dev/null, how does that work? I mean, if S3 uses it to create objects, how are they referred back?
Yes, we need the M0_CST_BE service, where the metadata device needs to be added along with the CAS service.
@mssawant: I think this PR is not needed now that hare PR 1952 has landed. Optimizations related to the BE service can be taken up in a separate task.
@madhavemuri, as I mentioned above even with #1952, if a metadata device is specified in CDF then we see,
The metadata pool expects a device from each ioservice in the cluster, because it uses the cob domain and in turn needs to refer to the ioservice to create mdcobs.
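
The expectation that the metadata pool holds a device from every ioservice can be expressed as a simple coverage check. This is a minimal sketch under stated assumptions: `struct md_dev`, its `owner_ios` field, and `mdpool_covers_all_ios()` are hypothetical names, and ioservices are assumed to be numbered `0..nr_ios-1` purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: one metadata-pool device, tagged with the ioservice that owns it. */
struct md_dev {
	int owner_ios;
};

/*
 * Return true if every ioservice id in [0, nr_ios) owns at least one
 * device in the metadata pool, false as soon as one ioservice has none.
 */
static bool mdpool_covers_all_ios(const struct md_dev *devs, size_t nr_devs,
				  int nr_ios)
{
	assert(devs != NULL || nr_devs == 0);
	for (int ios = 0; ios < nr_ios; ios++) {
		bool found = false;

		for (size_t i = 0; i < nr_devs; i++) {
			if (devs[i].owner_ios == ios) {
				found = true;
				break;
			}
		}
		if (!found)
			return false;
	}
	return true;
}
```

A generated configuration in which the metadata devices hang only off CAS services would fail a check of this shape for every ioservice, which matches the pattern change described in this thread.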
@mssawant: I have checked confd.xc; it looks like this pattern has changed with recent changes, causing the above issue.
While generating configuration, Hare assigns metadata device to CAS service type, M0_CST_CAS. But m0_reqh_mdpool_service_index_to_session() expects it to be M0_CST_IOS and asserts the same. Solution: Expect metadata device service type to be M0_CST_CAS instead of M0_CST_IOS. Signed-off-by: Mandar Sawant <mandar.sawant@seagate.com>
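
The mismatch described in this commit message can be illustrated with a simplified lookup. The sketch below is not Motr code: `struct svc_ctx`, `struct dev2svc`, `md_ctx_lookup()`, and the enum values are hypothetical stand-ins that only loosely mirror the `pc_dev2svc[idx].pds_ctx` access and the `sc_type` assertion shown in the diff. It returns NULL on a type mismatch where the real code would panic through `M0_ASSERT`.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative subset of service types; values are NOT Motr's real ones. */
enum svc_type { CST_IOS = 1, CST_CAS = 2 };

/* Hypothetical stand-in for a per-service context. */
struct svc_ctx {
	enum svc_type sc_type;
};

/* Hypothetical device-to-service entry, loosely mirroring pc_dev2svc[]. */
struct dev2svc {
	struct svc_ctx *pds_ctx;
};

/*
 * Resolve a storage-device index to its service context. Returns NULL
 * when the index is out of range, the slot is empty, or the service type
 * differs from what the caller expects.
 */
static struct svc_ctx *md_ctx_lookup(const struct dev2svc *tbl, size_t nr,
				     size_t idx, enum svc_type expected)
{
	assert(tbl != NULL);
	if (idx >= nr || tbl[idx].pds_ctx == NULL)
		return NULL;
	if (tbl[idx].pds_ctx->sc_type != expected)
		return NULL;
	return tbl[idx].pds_ctx;
}
```

With a table whose metadata slot points at a CAS context, a lookup that expects `CST_IOS` fails while one that expects `CST_CAS` succeeds, which is the behaviour change the patch makes to the assertion.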
@madhavemuri, as discussed, we will use this patch and, in a separate patch, refine the fix to use M0_CST_BE as the service type for metadata devices used in CAS and IOS m0ds. Presently motr clients work fine with this patch; tested with latest Hare main and motr main + PR 1388,
I have updated the patch with the comment as discussed.
Device associated with the BE service of ios m0d needs to be added to mdpool, otherwise if we have a separate m0d without ioservice, it will fail.
A todo is added for it.
Thanks for your contribution to CORTX! 🎉
While generating configuration, Hare assigns metadata device (BE seg1) to CAS service type, M0_CST_CAS. But m0_reqh_mdpool_service_index_to_session() expects it to be M0_CST_IOS and asserts the same. Solution: Expect metadata device service type to be M0_CST_CAS instead of M0_CST_IOS. And in future this will be updated to M0_CST_BE. Signed-off-by: Mandar Sawant <mandar.sawant@seagate.com> Signed-off-by: Atul Deshmukh <atul.deshmukh@seagate.com>
@mssawant, m0cp is dumping core (similar assert) with the latest Motr (1e9feed) and Hare (b617d0875ca45a2fed493891ca4a09d03808001f). (gdb) bt
This reverts commit 6d6c31c.
While generating configuration, Hare assigns metadata device to CAS
service type, M0_CST_CAS. But m0_reqh_mdpool_service_index_to_session()
expects it to be M0_CST_IOS and asserts the same.
Solution:
Expect metadata device service type to be M0_CST_CAS instead
of M0_CST_IOS.
Signed-off-by: Mandar Sawant <mandar.sawant@seagate.com>