Delay in receiving rpc reply #1682

Open
srikanthsubbaramu opened this issue Dec 18, 2024 · 5 comments
Labels
is:question Issue is actually a question.

Comments

@srikanthsubbaramu

srikanthsubbaramu commented Dec 18, 2024

Hi Michal,

We have netopeer2-server running in one container and our application (a NETCONF client) running in another container on the same VM. The client connects to the server over SSH via libnetconf and performs get operations using nc_rpc_get and RPC send. In some instances the get RPC (recv_reply) times out and we see no reply for about 8-10 seconds. We want to triage this and identify where the delay is introduced.

In the NETCONF server logs we did not observe any ERR entries for /ietf-netconf:get. Are there any pointers in the log that would show where the delay is introduced? (It may also be in the network, we don't know, so we just want to identify where the delay comes from.)
Sample Log

[2024-12-18 13:14:36.846143] [INFO] Get day1 data for path /ManagedElement/GNBCUCPFunction/EP_XnC_Local
[2024-12-18 13:14:37.847223] [INFO] Couldn't receive a reply from the server ret:1
[2024-12-18 13:14:37.847249] [INFO] output data is null. envp
[2024-12-18 13:14:37.847260] [INFO] day1 path,data /ManagedElement/GNBCUCPFunction/EP_XnC_Local
[2024-12-18 13:14:37.847264] [INFO] Get day1 data for path /ManagedElement/GNBCUCPFunction/EP_X2C_Local

[2024-12-18 13:14:53.107708] [INFO] Session is valid and active.
ParseFromASession 2 [ERR]: Received a <rpc-reply> with an unexpected message-id 111 (expected 119).

netopeer2-logs

[INF]: SR: EV LISTEN: "/ietf-netconf:get" "rpc" ID 260 priority 0 processing (remaining 1 subscribers).
[INF]: SR: EV LISTEN: "/ietf-netconf:get" "rpc" ID 260 priority 0 success (remaining 0 subscribers).
[INF]: SR: EV ORIGIN: "/ietf-netconf:get" "rpc" ID 260 priority 0 succeeded.
[INF]: SR: EV ORIGIN: "/ietf-netconf:get" "rpc" ID 261 priority 0 for 1 subscribers published.
[INF]: SR: EV LISTEN: "/ietf-netconf:get" "rpc" ID 261 priority 0 processing (remaining 1 subscribers).
[INF]: SR: EV LISTEN: "/ietf-netconf:get" "rpc" ID 261 priority 0 success (remaining 0 subscribers).
[INF]: SR: EV ORIGIN: "/ietf-netconf:get" "rpc" ID 261 priority 0 succeeded.
[INF]: SR: EV ORIGIN: "/ietf-netconf:get" "rpc" ID 262 priority 0 for 1 subscribers published.
[INF]: SR: EV LISTEN: "/ietf-netconf:get" "rpc" ID 262 priority 0 processing (remaining 1 subscribers).
[INF]: SR: EV LISTEN: "/ietf-netconf:get" "rpc" ID 262 priority 0 success (remaining 0 subscribers).
[INF]: SR: EV ORIGIN: "/ietf-netconf:get" "rpc" ID 262 priority 0 succeeded.

Thanks,
Srikanth
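
One possible reading of the `unexpected message-id 111 (expected 119)` line in the client log above is that replies to earlier requests, which the client had already given up on, are arriving while it waits for a newer one. The sketch below is a toy model of that situation only; it does not use libnetconf2, and `next_matching_reply` and the queue of `(message_id, payload)` tuples are hypothetical names introduced for illustration.

```python
from collections import deque


def next_matching_reply(incoming, expected_id):
    """Pop replies from `incoming` until one carries `expected_id`.

    `incoming` is a deque of (message_id, payload) tuples standing in for
    replies already received on the session (an assumption of this sketch).
    Returns (payload, skipped_ids); payload is None if no match was queued.
    """
    skipped = []
    while incoming:
        msg_id, payload = incoming.popleft()
        if msg_id == expected_id:
            return payload, skipped
        # Reply belongs to a request the caller is no longer waiting for.
        skipped.append(msg_id)
    return None, skipped


# Late replies for 111 and 112 sit in front of the one the client wants.
queue = deque([(111, "late-a"), (112, "late-b"), (119, "wanted")])
payload, skipped = next_matching_reply(queue, 119)
print(payload, skipped)  # prints: wanted [111, 112]
```

If the real client behaves like this model, the number of skipped ids gives a rough count of how many earlier requests timed out before their replies showed up.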

@michalvasko
Member

Not sure I can help you. The logs in netopeer2-server do not have timestamps but you should be able to see when they are generated and based on that learn whether the delay is before receiving the RPC, during its processing, or after the reply is sent.
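
The timing comparison suggested above can be partly automated on the client side, where the log lines already carry timestamps. The following is a minimal sketch, assuming the `[YYYY-MM-DD HH:MM:SS.ffffff]` prefix seen in the sample client log; `find_gaps` and the 0.5 second threshold are illustrative choices, not part of any tool mentioned in this thread.

```python
import re
from datetime import datetime

# Matches the timestamp prefix used in the client log sample.
STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6})\]")


def find_gaps(lines, threshold_s=0.5):
    """Return (seconds, previous_line, line) for consecutive timestamped
    lines that are at least `threshold_s` seconds apart."""
    gaps = []
    prev_time = prev_line = None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue  # skip lines without a timestamp prefix
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if prev_time is not None:
            delta = (stamp - prev_time).total_seconds()
            if delta >= threshold_s:
                gaps.append((delta, prev_line, line))
        prev_time, prev_line = stamp, line
    return gaps


SAMPLE = [
    "[2024-12-18 13:14:36.846143] [INFO] Get day1 data for path /ManagedElement/GNBCUCPFunction/EP_XnC_Local",
    "[2024-12-18 13:14:37.847223] [INFO] Couldn't receive a reply from the server ret:1",
    "[2024-12-18 13:14:37.847264] [INFO] Get day1 data for path /ManagedElement/GNBCUCPFunction/EP_X2C_Local",
    "[2024-12-18 13:14:53.107708] [INFO] Session is valid and active.",
]

for seconds, before, after in find_gaps(SAMPLE):
    print(f"{seconds:.3f}s between:\n  {before}\n  {after}")
```

Running the server with its output piped through something that adds wall-clock timestamps would allow the same comparison on the netopeer2-server side, so the two sets of gaps can be lined up.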

@michalvasko michalvasko added the is:question Issue is actually a question. label Dec 18, 2024
@srikanthsubbaramu
Author

Design
Hi Michal,

Let me explain the design and the problem in more detail.

In this scenario there are multiple clients. Each one issues get RPC calls on certain paths of the running datastore for day1 and, as soon as the get completes, issues a user RPC against the candidate datastore for day2 changes.
The following activity from client1, client2, and client3 happens simultaneously at the server:
Client1 is doing a usercall RPC
Client2 is doing a usercall RPC
Client3 is doing multiple get RPC calls on the running datastore

Here client3 hits a timeout on its get RPC, and the stall lasts about 5-6 seconds.

We would like to understand whether the user RPC callback running in the sysrepo (sr) context holds any resource, or otherwise prevents netopeer2-server from responding to NETCONF get RPC calls from other clients.
One observation: when we removed the establish push RPC call (subscribe-xpath) from the user RPC call, we no longer observed the timeout.

Please provide your inputs.

Thanks,
Srikanth

@michalvasko
Member

I am sorry but this is way too complex for me to be able to analyze it without actually running the use-case, so I cannot help you.

@srikanthsubbaramu
Author

Hi Michal,
We increased the number of NC threads and also switched the clients from get to get-config, and we did not encounter any issues.
Libnetconf2: set(MAX_PSPOLL_THREAD_COUNT 15 CACHE STRING "Maximum number of threads that could simultaneously access a ps_poll structure")
NETOPEER2: set(THREAD_COUNT 12 CACHE STRING "Number of threads accepting new sessions and handling requests")
With the increased thread counts we did not observe any delay on get/get-config calls.

A couple of questions:

  1. Should we move to the sysrepo API for the YANG push subscription instead of issuing a NETCONF RPC call from within the sysrepo RPC callback?
  2. Could the number of threads be made configurable at run time? (The arrays would need to change to dynamic allocation, though.)

Thanks,
Srikanth

@michalvasko
Member

  1. This is up to you and would simplify these calls and make them faster. On the other hand, there is some effort involved, especially for someone not yet familiar with the API. So I guess you can leave it as it is for now and just keep this possibility in mind if you encounter any more issues.
  2. Yes, I believe it should not be too difficult to add support in netopeer2-server to stop some worker threads or create new ones.
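
The effect reported earlier in this thread, where raising the thread counts made the get delays disappear, can be illustrated with a toy model of a fixed-size worker pool. This is only a sketch under stated assumptions: it uses Python's `ThreadPoolExecutor` as a stand-in and does not model how netopeer2-server or libnetconf2 actually schedule work. `short_request_latency`, the task counts, and the sleep duration are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def short_request_latency(workers, long_tasks=2, long_s=0.3):
    """Measure how long a quick request takes when `long_tasks` slow
    requests were submitted just before it to a pool of `workers` threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(long_tasks):
            # Stands in for a slow user RPC callback occupying a worker.
            pool.submit(time.sleep, long_s)
        start = time.monotonic()
        # Stands in for a quick <get>; it must wait for a free worker.
        pool.submit(lambda: None).result()
        return time.monotonic() - start


if __name__ == "__main__":
    for count in (2, 3):
        latency = short_request_latency(count)
        print(f"{count} workers: quick request took {latency:.3f}s")
```

With as many slow tasks as workers, the quick request waits roughly as long as one slow task; with one spare worker it completes almost immediately. In this model a larger pool hides the stall rather than removing whatever makes the slow calls slow.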
