Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configured subscription integration #1440

Open
wants to merge 8 commits into
base: udp_notif
Choose a base branch
from

Conversation

jeremie6wind
Copy link
Contributor

Hello,
Several commits to handle configured subscriptions.
To test this, it was necessary to make manual tests:

  • create/modify/delete receivers, play with remote port
  • create/modify/delete subscriptions, play with with source address, receiver reference, stop time, subscription type.
    Please could you give your feedback?
    Many thanks

Copy link
Member

@michalvasko michalvasko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I have commented only on minor/general issues I have noticed, most of them apply for many cases, not just where I have put the comment. After those are fixed, I will test the actual functionality and provide feedback for it.

src/configured_subscribed_notifications.c Outdated Show resolved Hide resolved
src/configured_subscribed_notifications.c Outdated Show resolved Hide resolved
src/configured_subscribed_notifications.c Outdated Show resolved Hide resolved
src/configured_subscribed_notifications.c Outdated Show resolved Hide resolved
src/configured_subscribed_notifications.c Outdated Show resolved Hide resolved
src/netconf_subscribed_notifications.c Show resolved Hide resolved
src/netconf_subscribed_notifications.h Outdated Show resolved Hide resolved
src/netconf_subscribed_notifications.h Outdated Show resolved Hide resolved
src/receivers.h Outdated Show resolved Hide resolved
src/subscribed_notifications.c Outdated Show resolved Hide resolved
@jeremie6wind jeremie6wind force-pushed the configured-subscription-integration branch from ffdc083 to 9af4e86 Compare July 20, 2023 12:23
@jeremie6wind
Copy link
Contributor Author

jeremie6wind commented Jul 20, 2023

Hello,
Thank you for for your feedback.
These commits fix all the points you mentioned:

  • add the code in the same files instead of duplicating.
  • use uncrustify.
  • rename enum
    Best regards,
    Jeremie

Copy link
Member

@michalvasko michalvasko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After brief look it seems fine and neater than before. Sadly, I am not able to try it out and provide feedback for the actual functionality because I am going on a vacation for 2 weeks, I will respond after that.

Copy link
Member

@michalvasko michalvasko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I am back and wanting to test this, why not create some new tests? The code seems finished, so tests would definitely be required, at least some basic ones. Thanks.

@jeremie6wind jeremie6wind force-pushed the configured-subscription-integration branch from 9af4e86 to a9f5508 Compare August 17, 2023 07:48
@jeremie6wind
Copy link
Contributor Author

Hi,
I added a test in test_sub_ntf_advanced.c
It helped me fix some issues.
However, I had 2 things I was not sure:

  1. In yang-push.c sr_get_data in yang_push_notif_update_send failed for configured yang push.
  2. When doing a reset of a receiver, I was not sure how to do it properly. Especially when Netopeer2 checks the operational data.
    so i added a test in np2srv_oper_sub_ntf_subscriptions_cb.
    Thank you.
    Best regards.
    Jeremie

@michalvasko
Copy link
Member

Firstly, I have rebased this branch to the current devel, you will need the same branch of sysrepo for it to compile. There could have been some useful fixes, more below. However, I have accidentally also merged the commits from this PR when resolving the conflicts so you may want to reverse that, sorry about that. Regarding your questions:

  1. Any more details? Perhaps there is a lock problem, similar issue has been recently fixed so you just try again now.
  2. What you are doing is fine except the notification subscription-started should be sent when moving the subscription from connecting to active state, I do not see it in csn_receiver_reset() unless I missed something. Other than that, operational data will not be readable because you are holding INFO_WLOCK for the whole duration of the action callback. Ideally, you should release it after setting the state to connecting and then reacquire it (safely, being aware that the info structure could have changed in the meantime) to set it to active.

So, thanks for the tests and everything, I have made some minor changes to be able to compile it and the tests produce some meaningful output so please commit it with some of your commits (the author is not important).

I think you can create a separate test file for these tests because there should be several of them. I thought you are actually going to test sending and receiving the notifications but I assume you did that only with your proprietary notification receiver because there is none in netopeer2. Which brings me to my (hopefully) last larger request, add support for notification receiver somewhere to our projects so that it can be properly tested, both manually and in the test suite (so that there is not just setting and reading the configuration). Ideally into libnetconf2 and then it can be used with ease in both cases (netopeer2 test suite and netopeer2-cli).

Other than that I have briefly went over the code and it seemed fine, no major changes will be required, I think. But I will not know unless I can actually see the notification being send and received. Thanks again for all your effort.

@jeremie6wind jeremie6wind force-pushed the configured-subscription-integration branch from 68340fc to 6c4e50e Compare August 23, 2023 15:48
@jeremie6wind
Copy link
Contributor Author

jeremie6wind commented Aug 23, 2023

Hi,
About my issues of last time:

  1. I would say something is missing in the initialization of the sysrepo session used for configured yang push.
  2. I added some subscription-started in the code.

As you suggested, I reverted the commit with the test.
I added your improvements.
Then I added a new test test_configured_subscriptions.
The test itself starts unyte udp collectors, that receive notifications.
So netopeer2-server sends notifications and the test parses the notifications.
However, there are still some memory leaks in the test due to unyte_udp_collector.

One implementation of the notification receiver is https://github.com/pmacct/pmacct/tree/master.
This implementation uses https://github.com/network-analytics/udp-notif-c-collector.

Copy link
Member

@michalvasko michalvasko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am about to run all the tests and try to go through all the code as it is executed but in the meantime you can address the comment I made and, in case you really do not know how to solve a problem, can add a test that is failing. I can then fix it much more easily than based on a description only (regarding the problem 2.).

src/netconf_subscribed_notifications.c Outdated Show resolved Hide resolved
@michalvasko
Copy link
Member

Sorry, but the test is failing for me

[==========] tests: Running 12 test(s).
Address type: IPv6 | 12345
Address type: IPv6 | 12346
Address type: IPv6 | 12347
[ RUN      ] test_configured_subscriptions_receivers
[       OK ] test_configured_subscriptions_receivers
[ RUN      ] test_configured_subscriptions_receivers_modif
[       OK ] test_configured_subscriptions_receivers_modif
[ RUN      ] test_configured_subscriptions_add
tc_12345.xml_data[0][0] != '\0'
[   LINE   ] --- /home/vasko/Documents/netopeer2/tests/test_configured_subscriptions.c:768: error: Failure!Aborted (core dumped)

and I found no simple way of fixing it, seems no notifications are being received for some reason. Other than that, they could use some improvements. Mainly using pthread barriers, for example, for exact synchronization instead of fixed usleep(). Other than that, I think there are way too many few-line functions but that does not matter much, I can fix that afterwards if I really think it necessary.

So, please make sure the tests pass. If they do for you, I should be able to find out what is wrong with some assistance.

@jeremie6wind jeremie6wind force-pushed the configured-subscription-integration branch from 6c4e50e to 57f2ad4 Compare August 24, 2023 14:06
@jeremie6wind
Copy link
Contributor Author

Hi Mishal,
Thank you for your reply.
I put the csn_send_notif_one inside the 'if' as you suggested.
I changed the logs to "Could not send notification ."
Also I added a commit that makes the test configured yang push fail.

However you may see it once the other tests pass.

The tests pass on my machine except with valgrind, as I explained earlier.
I have no idea why the tests fail on your side. I can help.
I have to let you know that I won't be available next week.

@jeremie6wind
Copy link
Contributor Author

FYI On my side I get:
Address type: IPv4 | 12345
Address type: IPv4 | 12346
Address type: IPv4 | 12347

@michalvasko
Copy link
Member

Thanks for the changes.

Address type: IPv4 | 12345

That is interesting, any reason why it would not work on IPv6? Because it seems that no notifications are being received.

I have to let you know that I won't be available next week.

No problem, me neither, actually.

@jeremie6wind jeremie6wind force-pushed the configured-subscription-integration branch from 57f2ad4 to e043020 Compare September 4, 2023 15:43
@jeremie6wind
Copy link
Contributor Author

Ok so the udp collector has also 127.0.0.1 local address in the test.
The test is fully ipv4 so far.
usleep have been replaced by pthread_barrier.

@michalvasko
Copy link
Member

I have tested the latest changes and this is the test output

[==========] tests: Running 12 test(s).
Address type: IPv4 | 12345
Address type: IPv4 | 12346
Address type: IPv4 | 12347
[ RUN      ] test_configured_subscriptions_receivers
[       OK ] test_configured_subscriptions_receivers
[ RUN      ] test_configured_subscriptions_receivers_modif
[       OK ] test_configured_subscriptions_receivers_modif
[ RUN      ] test_configured_subscriptions_add
[       OK ] test_configured_subscriptions_add
[ RUN      ] test_configured_subscriptions_reset
[       OK ] test_configured_subscriptions_reset
[ RUN      ] test_configured_subscriptions_modif
[       OK ] test_configured_subscriptions_modif
[ RUN      ] test_configured_subscriptions_modif2
[       OK ] test_configured_subscriptions_modif2
[ RUN      ] test_configured_subscriptions_modif3
[       OK ] test_configured_subscriptions_modif3
[ RUN      ] test_configured_subscriptions_yang_push
[       OK ] test_configured_subscriptions_yang_push
[ RUN      ] test_configured_subscriptions_yang_push_modif
[       OK ] test_configured_subscriptions_yang_push_modif
[ RUN      ] test_configured_subscriptions_back
[       OK ] test_configured_subscriptions_back
[ RUN      ] test_configured_subscriptions_remove
[       OK ] test_configured_subscriptions_remove
[ RUN      ] test_configured_subscriptions_remove_receivers
[       OK ] test_configured_subscriptions_remove_receivers
[ERR] Waiting on a conditional variable failed (sr_remove_modules: Connection timed out).
sr_remove_module() failed (Timeout expired)
[  FAILED  ] GROUP TEARDOWN
[  ERROR   ] tests
[==========] tests: 12 test(s) run.
[  PASSED  ] 12 test(s).

but to get there several fixes were required. I have changed the subscription to configuration changes to support failing with an error because that is what had been happening to me but the error was ignored. I have fixed the error, though, but I think it will be better this way. Question is whether not to remove SR_SUBSCR_DONE_ONLY from all the subscriptions so they can properly fail on any error (I have used it before just to simplify things but I probably shouldn't).

Next, I did some refactoring and additional error checking. The latter could definitely use some improvements so that all the callbacks return proper error to the client and ideally print it as well. Naturally, no errors should be ignored either, so please fix all that.

I tried to look at the current problem but the tests need some work before they can be nicely debugged. But I did notice that there was still a timer thread running in the server so there is likely a problem with stopping the subscriptions. Also, the tests must be independent of each other, each must pass separately, and not rely on any particular setup (the error I had before in the 8th test was caused by some missing data in sysrepo).

Finally, some more refactoring would be nice, to use static functions where applicable and document them all. The files also got quite big so it may make sense to separate some code to another file or simply refactor it so that one can still make sense of it. But if you are fine with it, I can finish this myself after everything works as it should. Again, thanks for the effort.

@jeremie6wind
Copy link
Contributor Author

Thank you for the feedback.
I am currently improving the code accordingly.

However, regarding the timer_helper_thread, Isn't it normal to have this timer thread running. Maybe it is due to the implementation of timer_* functions?
According to my tests, as soon as it starts using timer_create, this thread appears.
The thread doesn't disappear even after timer_delete.

@michalvasko
Copy link
Member

According to my tests, as soon as it starts using timer_create, this thread appears.
The thread doesn't disappear even after timer_delete.

Okay, you have looked at it more than I have, seems fine then.

@jeremie6wind jeremie6wind force-pushed the configured-subscription-integration branch from e043020 to 5a1eada Compare September 8, 2023 07:34
@jeremie6wind
Copy link
Contributor Author

Hi,
In these new commits:
I add the fixes that you provided.
However one thing was not working on my side with:
if (event != SR_EV_CHANGE) {
the callback np2srv_config_subscriptions_cb was never called.
I had to replace by:
if (event != SR_EV_DONE) {

I added the missing return codes and logs, so no error is ignored anymore.
I made the tests independent from each others.
I reviewed the udp notifications and now I check the content of most of them.
Some of the csn_* functions are now static with their documentation header.

CMakeLists.txt is also updated. Thus we can use a custom udp-notif-c-collector library.

@michalvasko
Copy link
Member

However one thing was not working on my side with:
if (event != SR_EV_CHANGE) {
the callback np2srv_config_subscriptions_cb was never called.
I had to replace by:
if (event != SR_EV_DONE) {

What exactly means "not working"? The only difference it could make is that when the callback is called, some module locks are still held and can potentially block API calls you make from the callback. The other being the ability to return a proper error.

So, I wanted to test this but it is not even possible to compile, you left out some final changes of a test

[ 68%] Building C object tests/CMakeFiles/test_configured_subscriptions.dir/test_configured_subscriptions.c.o
/home/vasko/Documents/netopeer2/tests/test_configured_subscriptions.c: In function ‘delete_subscription_config’:
/home/vasko/Documents/netopeer2/tests/test_configured_subscriptions.c:274:5: error: ‘result’ undeclared (first use in this function)
  274 |     result = strcmp(st->str, EMPTY_GETCONFIG);
      |     ^~~~~~
/home/vasko/Documents/netopeer2/tests/test_configured_subscriptions.c:274:5: note: each undeclared identifier is reported only once for each function it appears in
/home/vasko/Documents/netopeer2/tests/test_configured_subscriptions.c:278:5: error: ‘result2’ undeclared (first use in this function)
  278 |     result2 = strcmp(st->str, EMPTY_GET);

I have tried to fix it the way I thought was intended but it did not work, so please finish it.

@jeremie6wind jeremie6wind force-pushed the configured-subscription-integration branch from 5a1eada to 1832680 Compare September 11, 2023 09:44
@jeremie6wind
Copy link
Contributor Author

jeremie6wind commented Sep 11, 2023

Sorry, it seems I made some mistakes.
I could test again with if (event != SR_EV_CHANGE) { and it worked fine.
Also I fixed the build issue.

@jeremie6wind jeremie6wind force-pushed the configured-subscription-integration branch 2 times, most recently from 3d4aea3 to b9907d9 Compare September 11, 2023 14:27
@michalvasko
Copy link
Member

When I try to run all the tests, they get stuck at the global_teardown() module removal and time out. Do they work for you? It seems context read locks are not being released by the server, there are 5 of them at this point. Also, the test itself seems to have up to 6 leftover threads of the UDP notif library, performing some cleanup periodically and stuff like that, would expect them to be terminated by this time.

@jeremie6wind
Copy link
Contributor Author

When running all the tests, this is the output:

15/37 Testing: test_configured_subscriptions
15/37 Test: test_configured_subscriptions
Command: "/root/Netopeer2/build/tests/test_configured_subscriptions"
Directory: /root/Netopeer2/build
"test_configured_subscriptions" start time: Sep 12 10:49 CEST
Output:

[==========] Running 12 test(s).
Address type: IPv4 | 12345
Address type: IPv4 | 12346
Address type: IPv4 | 12347
[ RUN ] test_configured_subscriptions_receivers
[ OK ] test_configured_subscriptions_receivers
[ RUN ] test_configured_subscriptions_receivers_modif
[ OK ] test_configured_subscriptions_receivers_modif
[ RUN ] test_configured_subscriptions_add
[ OK ] test_configured_subscriptions_add
[ RUN ] test_configured_subscriptions_reset
[ OK ] test_configured_subscriptions_reset
[ RUN ] test_configured_subscriptions_modif
[ OK ] test_configured_subscriptions_modif
[ RUN ] test_configured_subscriptions_modif2
[ OK ] test_configured_subscriptions_modif2
[ RUN ] test_configured_subscriptions_modif3
[ OK ] test_configured_subscriptions_modif3
[ RUN ] test_configured_subscriptions_yang_push
[ OK ] test_configured_subscriptions_yang_push
[ RUN ] test_configured_subscriptions_yang_push_modif
[ OK ] test_configured_subscriptions_yang_push_modif
[ RUN ] test_configured_subscriptions_back
[ OK ] test_configured_subscriptions_back
[ RUN ] test_configured_subscriptions_remove
[ OK ] test_configured_subscriptions_remove
[ RUN ] test_configured_subscriptions_remove_receivers
[ OK ] test_configured_subscriptions_remove_receivers
[ERR] Waiting on a conditional variable failed (sr_remove_modules: Connection timed out).
[ FAILED ] GROUP TEARDOWN
[ ERROR ] tests
sr_remove_module() failed (Timeout expired)
[==========] 12 test(s) run.
[ PASSED ] 12 test(s).

Test time = 17.61 sec

Test Passed.
"test_configured_subscriptions" end time: Sep 12 10:49 CEST
"test_configured_subscriptions" time elapsed: 00:00:17

[...]

32/37 Testing: test_configured_subscriptions_valgrind
32/37 Test: test_configured_subscriptions_valgrind
Command: "/usr/bin/valgrind" "--leak-check=full" "--show-leak-kinds=all" "--error-exitcode=1" "/root/Netopeer2/build/tests/test_configured_subscriptions"
Directory: /root/Netopeer2/build
"test_configured_subscriptions_valgrind" start time: Sep 12 10:52 CEST
Output:

==14685== Memcheck, a memory error detector
==14685== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==14685== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==14685== Command: /root/Netopeer2/build/tests/test_configured_subscriptions
==14685==
[==========] Running 12 test(s).
Address type: IPv4 | 12345
Address type: IPv4 | 12346
Address type: IPv4 | 12347
[ RUN ] test_configured_subscriptions_receivers
[ OK ] test_configured_subscriptions_receivers
[ RUN ] test_configured_subscriptions_receivers_modif
[ OK ] test_configured_subscriptions_receivers_modif
[ RUN ] test_configured_subscriptions_add
[ OK ] test_configured_subscriptions_add
[ RUN ] test_configured_subscriptions_reset
[ OK ] test_configured_subscriptions_reset
[ RUN ] test_configured_subscriptions_modif
[ OK ] test_configured_subscriptions_modif
[ RUN ] test_configured_subscriptions_modif2
[ OK ] test_configured_subscriptions_modif2
[ RUN ] test_configured_subscriptions_modif3
[ OK ] test_configured_subscriptions_modif3
[ RUN ] test_configured_subscriptions_yang_push
[ OK ] test_configured_subscriptions_yang_push
[ RUN ] test_configured_subscriptions_yang_push_modif
[ OK ] test_configured_subscriptions_yang_push_modif
[ RUN ] test_configured_subscriptions_back
[ OK ] test_configured_subscriptions_back
[ RUN ] test_configured_subscriptions_remove
[ OK ] test_configured_subscriptions_remove
[ RUN ] test_configured_subscriptions_remove_receivers
[ OK ] test_configured_subscriptions_remove_receivers
[ERR] Waiting on a conditional variable failed (sr_remove_modules: Connection timed out).
[ FAILED ] GROUP TEARDOWN
[ ERROR ] tests
sr_remove_module() failed (Timeout expired)
[==========] 12 test(s) run.
[ PASSED ] 12 test(s).

When executing the test, I sometimes get this output:
[WRN] Recovered a read-lock of CID 12 (sr_remove_modules).

There are some memory leaks in the test itself due to the udp collector set to receive the udp notifications. When removing the listener in the test, there are no leaks anymore.

@michalvasko
Copy link
Member

Well, yes, my output is the same and like I said, this is not okay despite the print that all the tests passed. The following module removal must not time out and instead succeed. So there are still some lock problems, please fix them, with some basic information I provided in my previous reply.

As for the listener leaks, please remove it then, in the teardown or wherever, the tests need to pass with valgrind and there is generally no good reason for not cleaning up after yourself.

@jeremie6wind jeremie6wind force-pushed the configured-subscription-integration branch from b9907d9 to e33cd6b Compare September 13, 2023 13:00
@jeremie6wind
Copy link
Contributor Author

jeremie6wind commented Sep 13, 2023

Ok, I replaced the collector code by simpler udp socket code.
This removes the valgrind issues.
Also I fixed the issue with the timeout. It was a missing sr_release_data.
Finally, the last patch fixes a wrong index:

-            sub_id = sub->sub_ids[0];
+            sub_id = sub->sub_ids[idx];

Copy link
Member

@michalvasko michalvasko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks, it seems to be functional now so there are just some less serious problems in this PR to fix before it can be merged.

CMakeLists.txt Outdated
Comment on lines 128 to 129
if (NOT UDP_NOTIF_BIN)
set(BUILD_UDP_NOTIF yes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not really understand the logic of these variables, are they even required? Is this feature supposed to be optional? If so, the variable is supposed to be with the options at the beginning. In any case, please simplify this and do not use redundant variables. If you are trying to implement caching of this external project (not sure if needed, I think cmake can handle that on its own), using UDP_NOTIF_STATIC_LIB and UDP_NOTIF_INCLUDES should be enough.

goto cleanup;
}

message->buffer = string_notification;
message->buffer_len = strlen(string_notification);
message->buffer_len = strlen(message->buffer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor improvement, the message length is returned by asprintf() so it saves you this strlen() call.

free(string_notification);

ATOMIC_INC_RELAXED(sub->sent_count);
free(message->buffer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer for this to be in the cleanup, with condition if (message) .... Occurs at several places.


receiver->udp.sender = unyte_start_sender(&receiver->udp.options);
if (!receiver->udp.sender) {
ERR("Cannot create udp sender\n");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant \n, always appended. Every message should, in general, start with a capital, end with a period and ideally provide error cause if possible (such as strerror(errno) for standard functions, not sure if that is possible here).

@@ -601,22 +839,25 @@ csn_config_sub_receivers_prepare(const struct lyd_node *node_receiver,

receiver.name = strdup(lyd_get_value(node_receiver_name));
if (!receiver.name) {
rc = SR_ERR_NO_MEMORY;
EMEM;
csn_receiver_destroy(&receiver, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems you can put this into the cleanup with condition if (rc) ... and save this call in all following error conditions.

Comment on lines 2371 to 2388
if (strstr(request_xpath, "subscriptions/subscription[") &&
strstr(request_xpath, "/receivers/receiver[")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really part of this PR but this is wrong, the request_path can be, for example, /* and then you should provide all the data including the receivers. You should, instead, subscribe np2srv_oper_sub_ntf_receivers_cb() separately for its own nested subtree. Sysrepo will handle this correctly and call np2srv_oper_sub_ntf_subscriptions_cb() before np2srv_oper_sub_ntf_receivers_cb() as needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure how to solve this one.
During the tests I see that np2srv_oper_sub_ntf_subscriptions_cb()
can give the operational data of everything in /subscriptions.
However when doing the reset action on a receiver, sysrepo calls the operational callback and does not accept the complete operational data. But only the receiver operational data.
That's why I used this call to np2srv_oper_sub_ntf_receivers_cb() in this case.
What should np2srv_oper_sub_ntf_receivers_cb() do when it is called with request_xpath "/subscriptions" "or "/*" ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so I have finally looked more into the changes you made and actually understood at least the config/oper data. Frankly, the current relevant code is a mess and will need major improvements, so a long description of the changes is following.

The data tree /ietf-subscribed-notifications:subscriptions/subscription is supposed to hold information about both dynamic and configured subscriptions. Since there were only dynamic subscriptions before, I have implemented it only as operational data since they were based on RPC calls and not configuration data. However, that has changed with the configured subscriptions. Here is how I would put all this together:

  • create a new config subscription (sr_module_change_subscribe()) for the whole subtree (you have implemented this in the callback np2srv_config_subscriptions_cb())
  • create a new config subscription for the receivers (your callback np2srv_config_subscriptions_receivers_cb())
  • create a new oper subscription to populate the state data in /ietf-subscribed-notifications:subscriptions/subscription for configured subscriptions
    • there are several ways of doing this, either subscribe for providing /ietf-subscribed-notifications:subscriptions (the callback will be called once) and create all the state data or subscribe to /ietf-subscribed-notifications:subscriptions/subscription (will be called for each configured subscription) and in the callback create data only for the specific subscription
    • use SR_SUBSCR_OPER_MERGE to only merge these data to the configuration data
  • create a new config subscription for /ietf-subscribed-notifications:subscriptions/ietf-subscribed-notif-receivers:receiver-instances and store the data in internal structures (your callback np2srv_config_receivers_cb())
  • keep the oper subscription for /ietf-subscribed-notifications:subscriptions that creates the data for dynamic subscriptions exactly as it was but they should be merged (flag SR_SUBSCR_OPER_MERGE) to the data of the configured subscriptions
  • regarding the internal structures holding all the information, it may make sense to completely separate dynamic and configured subscriptions because they are quite different, obviously

If you have any additional questions, we may schedule a call and talk about the design in detail.

@jeremie6wind jeremie6wind force-pushed the configured-subscription-integration branch from e33cd6b to 3cd54aa Compare September 15, 2023 09:42
@jeremie6wind
Copy link
Contributor Author

Thanks for your last answer. I fixed the code according to the previous comments.
I will work on the last suggestions and get back to you if it is necessary to schedule a call.

@michalvasko
Copy link
Member

An important update. After some discussion with @jktjkt it has been decided that no notification subscriptions will be supported directly by netopeer2 and instead implemented by a new daemon sysrepo-notifd in sysrepo that will pass any relevant notifications to the respective server such as netopeer2. This solution has many advantages:

  • possibility to reuse the subscription code (including YANG Push) by all YANG-based servers (NETCONF, RESTCONF, UDP notif, ...)
  • separate code for easier maintenance instead of everything being in netopeer2
  • much simpler support of new protocols, encodings, etc. for additional servers/standards

The relevant consequence of this for you is that once the daemon is finished, this is my idea of how UDP notifications will work. There will be a separate project implementing a "UDP server" (can even be hosted by 6wind directly and netopeer2 will mention a link) that will install all its YANG modules for configured subscriptions/UDP transport on its own into sysrepo. Then, it will handle any configuration changes in the modules and send a specific RPC that will create a new subscription to sysrepo-notifd. It will be very similar to establish-subscription except also include a path to a file-descriptor that the UDP server will listen on and read the actual notifications from. These will then be sent to the configured clients. Note that the subscriptions could still be created by NETCONF (or any other source of YANG data) but they will not be handled specially in netopeer2-server at all, the configuration data will simply be written into sysrepo and then picked up by the UDP server.

So, it is up to you to decide what you want to do in the meantime. The last changes we talked about are no longer relevant since it will all be in a separate project but most of the functionality you implemented will still be used. I should start the implementation of sysrepo-notifd today but it is hard to guess how long it will take, several weeks at least, maybe even a few months. Also, during that time the design may change so I cannot guarantee it will all work the way I have just described.

@jeremie6wind jeremie6wind force-pushed the configured-subscription-integration branch from 3cd54aa to c1fd0fd Compare September 18, 2023 15:22
@jeremie6wind
Copy link
Contributor Author

Ok, I used SR_SUBSCR_OPER_MERGE and I kept only one oper callback for the moment.
I could remove np2srv_oper_sub_ntf_receivers_cb.
I read your last comment.
We ll discuss internally to see how we proceed with sysrepo-notifd and "UDP server".

@jeremie6wind jeremie6wind force-pushed the configured-subscription-integration branch from c1fd0fd to 714c9ec Compare September 19, 2023 13:37
refactor configured subscriptions functions:
csn_send_notif_one to send to one specific receiver.
new csn_build_notification function to create a unyte messages.
Send subscription-started and subscription-terminated.
Increment sent_count on each message.
Add missing return code. Add missing function description. Rename
some functions.
@jeremie6wind jeremie6wind force-pushed the configured-subscription-integration branch 2 times, most recently from ad96ecc to b83e4a0 Compare September 22, 2023 07:59
@jlesk
Copy link

jlesk commented Oct 19, 2023

Hi,
I am very interested in the changes that you described last time regarding sysrepo-notifd and 'UDP server'.
I wanted to know if you had any updates on this subject?
Also do you intend to push the first commits of this pull request to the devel branch?
Finally, tell me if I can be of any help in this.
Thanks

@michalvasko
Copy link
Member

I have made a fair progress with sysrepo-notifd (some basic tests are already working) but had to postpone it because of some other work for maybe a week now. Unfortunately, it is really difficult to estimate when it may be finished and merged into sysrepo, into the devel branch so you will have to wait. Hopefully, the next week I will resume working on it. Appreciate the offer to help but there is no way you can help me, not sure about the UDP notification server.

But I can mention some relevant information that is unlikely to change now. To use sysrepo-notifd, one needs to create a named pipe and then send an RPC /sysrepo-notif:notif-subscribe with its path and optional NACM username (via sysrepo). After that the pipe can be read for the actual notifications but there will be utility functions available for either polling and reading the notifications or for creating a new thread with a callback that will automatically read new notifications. The main work left for this UDP server will be to properly implement all the special behavior of configured notifications, sysrepo-notifd will support only the base dynamic subscriptions (there are some additional special notifications and other things with configured feature in ietf-subscribed-notifications).

@jktjkt
Copy link
Contributor

jktjkt commented Oct 19, 2023

Thanks for the update.

one needs to create a named pipe and then send an RPC /sysrepo-notif:notif-subscribe with its path and optional NACM username (via sysrepo)

Any particular reason for a YANG-level RPC here, and for a pipe that's visible in the filesystem? I would actually prefer a C library call that would just return a FD for reading. That way, there's:

  • no need to filter that RPC out in a frontend (such as Netopeer2 or rousette),
  • no need to worry about where to create these pipes,
  • no potential for security implications of a privileged daemon (sysrpeo-notifd) running as root that's opening user-specified files for writing, which is usually a sure way of getting a CVE assigned.

@michalvasko
Copy link
Member

The reason being that that is how we designed it before. Also, it would allow for more code being moved to the new daemon (such as operational data reporting) but your worries seem valid and the proposed change should work so I will try to implement it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants