Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request and offer to PR: provide a "no netlink" mode for libtcmu #678

Open
geordieintheshellcode opened this issue Nov 20, 2021 · 3 comments

Comments

@geordieintheshellcode
Copy link

Some folks might be using libtcmu in their own applications which don't hook into TCMU runner, much like the consumer.c example. In those cases, it can be useful to avoid using netlink altogether. As I'm sure you are aware, it's almost impossible to have more than one user of TCMU when netlink is involved. This is because the kernel broadcasts device events to all TCMU listeners over netlink and simply accepts the first response it receives. This can lead to the following scenario:

  1. Application A creates a TCMU backstore with handler=app_a
  2. The kernel (target_core_user) broadcasts the netlink message to all instances of libtcmu
  3. Application B receives the message first. Application B doesn't have a handler for app_a, so it replies with an error code to the kernel.
  4. The kernel receives the error and aborts backstore creation.
  5. Application A wonders why its backstore creation failed.

Users can avoid using netlink altogether by setting nl_reply_supported=-1 at backing store creation time. This is great as multiple applications using TCMU can now coexist on the same machine. The problem is we don't have a nice way to utilise libtcmu in this model. We need a way to notify libtcmu that a device has been created, reconfigured, removed so the relevant handlers can be called. Currently, this is done via netlink.

Would you accept a PR that allows users to utilise all the great features of libtcmu, but without using netlink? The PR would be roughly as follows:

  • modify tcmulib_initialize to take a boolean parameter which informs libtcmu whether or not it should set up netlink. If true we do. If false we don't.
  • expose new methods tcmulib_notify_device_added, tcmulib_notify_device_removed and tcmulib_notify_device_reconfiged, which application code should call to inform libtcmu that a device has been added, removed, or reconfigured. These API's take the dev_name and pass through to the existing device_add, device_remove and device_reconfig functions in libtcmu.c. tcmulib_notify_device_reconfiged will need additional parameters to specify the new config.
  • updates the documentation to reflect these changes and notes that tcmulib_master_fd_ready and tcmulib_get_master_fd should not be called if you chose to call tcmulib_initialize in "don't use netlink mode".

This change benefits all users of libtcmu who are managing their own devices and who would like to be able to run multiple instances of their app on one machine and/or coexist with other users of libtcmu/tcmu-runner.

Great work with the library and let me know if this feature sounds interesting!

@lxbsz
Copy link
Collaborator

lxbsz commented Nov 22, 2021

@geordieintheshellcode

This sounds reasonable to me.

BTW, AFAIK currently the target_core_user.c only supports the netlink method, if you want to support this, you need to code the target_core_user.c first and possibly you could do just as the kernel nbd.c driver, which also supports ioctl, does.

@geordieintheshellcode
Copy link
Author

This patch allows us to disable netlink on "per-device" basis. It was merged into the kernel mainline on Sep 13 14:01:22 2017 +0900:

commit b849b4567549d5a54ab34ffacfd48fca05e8b34e
Author: Kenjiro Nakayama <nakayamakenjiro@gmail.com>
Date:   Wed Sep 13 14:01:22 2017 +0900

    target: Add netlink command reply supported option for each device

    Currently netlink command reply support option
    (TCMU_ATTR_SUPP_KERN_CMD_REPLY) can be enabled only on module
    scope. Because of that, once an application enables the netlink
    command reply support, all applications using target_core_user.ko
    would be expected to support the netlink reply. To make matters worse,
    users will not be able to add a device via configfs manually.

    To fix these issues, this patch adds an option to make netlink command
    reply disabled on each device through configfs. Original
    TCMU_ATTR_SUPP_KERN_CMD_REPLY is still enabled on module scope to keep
    backward-compatibility and used by default, however once users set
    nl_reply_supported=<NAGATIVE_VALUE> via configfs for a particular
    device, the device disables the netlink command reply support.

    Signed-off-by: Kenjiro Nakayama <nakayamakenjiro@gmail.com>
    Reviewed-by: Mike Christie <mchristi@redhat.com>
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

When users create the backstore they can set nl_reply_supported=-1 via configfs and this disables netlink for that backstore. So looks like we are all good! I've PoC'd this on my machine and it works great. I'll raise a PR some time in the next week hopefully.

@lxbsz
Copy link
Collaborator

lxbsz commented Nov 22, 2021

Sounds nice.

geordieintheshellcode pushed a commit to ondat/tcmu-runner that referenced this issue Feb 20, 2023
Some folks might be using libtcmu in their own applications which don't
hook into TCMU runner, much like the consumer.c example. In those cases,
it can be useful to avoid using netlink altogether. As I'm sure you are
aware, it's almost impossible to have more than one user of TCMU when
netlink is involved. This is because the kernel broadcasts device events
to all TCMU listeners over netlink and simply accepts the first response
it receives. This can lead to the following scenario:

1. Application A creates a TCMU backstore with handler=app_a
2. The kernel (target_core_user) broadcasts the netlink message to all
instances of libtcmu
3. Application B receives the message first. Application B doesn't have a
handler for app_a, so it replies with an error code to the kernel.
4. The kernel receives the error and aborts backstore creation.
5. Application A wonders why its backstore creation failed.

Users can avoid using netlink altogether by setting
nl_reply_supported=-1 at backing store creation time. This is great as
multiple applications using TCMU can now coexist on the same machine.
The problem is we don't have a nice way to utilise libtcmu in this
model. We need a way to notify libtcmu that a device has been created,
reconfigured, removed so the relevant handlers can be called. Currently,
this is done via netlink.

This PR allows users to utilise all the great features of libtcmu, but
without using netlink. The PR makes the following changes:

* modify tcmulib_initialize to take a boolean parameter which informs
libtcmu whether or not it should set up netlink. If true we do. If false
we don't.
* expose new methods tcmulib_notify_device_added,
tcmulib_notify_device_removed and tcmulib_notify_device_reconfiged,
which application code should call to inform libtcmu that a device has
been added, removed, or reconfigured. These API's take the dev_name and
pass through to the existing device_add, device_remove and
device_reconfig functions in libtcmu.c. tcmulib_notify_device_reconfiged
will need additional parameters to specify the new config.
* updates the documentation to reflect these changes and notes that
tcmulib_master_fd_ready and tcmulib_get_master_fd should not be called
if you chose to call tcmulib_initialize in "don't use netlink mode".
* adds an example application consumer_no_netlink which shows how to
create a standalone application using libtcmu without netlink.

Issue open-iscsi#678. PR open-iscsi#694

Signed-off-by: Xiubo Li <xiubli@redhat.com>
geordieintheshellcode pushed a commit to ondat/tcmu-runner that referenced this issue Feb 24, 2023
Some folks might be using libtcmu in their own applications which don't
hook into TCMU runner, much like the consumer.c example. In those cases,
it can be useful to avoid using netlink altogether. As I'm sure you are
aware, it's almost impossible to have more than one user of TCMU when
netlink is involved. This is because the kernel broadcasts device events
to all TCMU listeners over netlink and simply accepts the first response
it receives. This can lead to the following scenario:

1. Application A creates a TCMU backstore with handler=app_a
2. The kernel (target_core_user) broadcasts the netlink message to all
instances of libtcmu
3. Application B receives the message first. Application B doesn't have a
handler for app_a, so it replies with an error code to the kernel.
4. The kernel receives the error and aborts backstore creation.
5. Application A wonders why its backstore creation failed.

Users can avoid using netlink altogether by setting
nl_reply_supported=-1 at backing store creation time. This is great as
multiple applications using TCMU can now coexist on the same machine.
The problem is we don't have a nice way to utilise libtcmu in this
model. We need a way to notify libtcmu that a device has been created,
reconfigured, removed so the relevant handlers can be called. Currently,
this is done via netlink.

This PR allows users to utilise all the great features of libtcmu, but
without using netlink. The PR makes the following changes:

* modify tcmulib_initialize to take a boolean parameter which informs
libtcmu whether or not it should set up netlink. If true we do. If false
we don't.
* expose new methods tcmulib_notify_device_added,
tcmulib_notify_device_removed and tcmulib_notify_device_reconfiged,
which application code should call to inform libtcmu that a device has
been added, removed, or reconfigured. These API's take the dev_name and
pass through to the existing device_add, device_remove and
device_reconfig functions in libtcmu.c. tcmulib_notify_device_reconfiged
will need additional parameters to specify the new config.
* updates the documentation to reflect these changes and notes that
tcmulib_master_fd_ready and tcmulib_get_master_fd should not be called
if you chose to call tcmulib_initialize in "don't use netlink mode".
* adds an example application consumer_no_netlink which shows how to
create a standalone application using libtcmu without netlink.

Issue open-iscsi#678. PR open-iscsi#694

Signed-off-by: Alex Reid <alex.reid@storageos.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants