The SRP client already supports a random jitter mechanism, which was further enhanced in the following PRs. Please see the PR descriptions for more details:
I recommend updating the firmware to include these PRs if they are not already included. If you want to add even more random jitter to the initial registration, the next layer of code can control it (delay when and how the service is registered, e.g., when
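As a rough illustration of the "next layer" delay mentioned above, here is a minimal sketch of picking a per-device random start-up delay. The function name `pickSrpStartDelayMs`, the seed source, and the delay bounds are all hypothetical and not taken from OpenThread or from the PRs. On real firmware the platform RNG would likely be a better choice than `std::mt19937_64`, which is used here only to keep the example self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper: returns a start-up delay in [minMs, maxMs] milliseconds.
// deviceSeed is assumed to be unique per device (for example derived from the
// EUI-64), so devices that power up together do not all draw the same delay.
uint32_t pickSrpStartDelayMs(uint64_t deviceSeed, uint32_t minMs, uint32_t maxMs)
{
    assert(minMs <= maxMs);
    std::mt19937_64 rng(deviceSeed);
    std::uniform_int_distribution<uint32_t> dist(minMs, maxMs);
    return dist(rng);
}

// Intended use on the device (not compiled here, shown as an assumption):
//   1. delay = pickSrpStartDelayMs(seedFromEui64, 500, 5000);
//   2. start a one-shot timer for `delay` milliseconds;
//   3. in the timer handler, call
//      otSrpClientEnableAutoStartMode(instance, nullptr, nullptr);
```

The idea is simply to postpone enabling auto-start mode by the returned delay, so the first registrations from a group of devices are spread over the chosen window instead of arriving together.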
-
@motters In earlier versions of IDF, there was an issue with the handling of Spinel traffic that could lead to crashes under heavy load. We believe this problem has been addressed in recent IDF releases. Could you please attempt to reproduce the issue using IDF version v5.3.2 and share the BR logs with us? IDF v5.3.2 already includes the two PRs mentioned by Abtin.
-
Thanks for the speedy replies!

@abtink Ah, thank you. The devices (SRP clients) are using nrf-sdk v2.6, which doesn't include the above commits. We'll update to v2.8, which does, and report back.

@chshu Thanks for the information. We will update the BR's IDF version to the latest stable release (we are currently using IDF 5.1).

When the BR crashes due to the speed of SRP requests, there are no error messages. It simply stops performing border routing operations. Normally the ESP BR outputs 4 log messages for each SRP request. During a crash it outputs only 2 of them and then stops processing Thread traffic.

We've set aside all of next week to address this issue, and I'll report back with any information.
-
Hi All, Christmas break slowed things down slightly, however we have:
We now have 17 devices attached and working with SRP enabled. In the following weeks we'll be testing up to 100 devices. I'll keep you updated on any additional tweaks that are required. Thanks for everyone's help!
-
Apologies for the delay. We are currently setting up a test rig comprising multiple device arrays. Yesterday we increased the number of test devices to 24, with an additional 24 to follow this week. All 24 devices have successfully joined the BR with SRP enabled, which improves on the previous result. However, as the device count increases, other scaling issues are popping up. Any advice on the following?

Error messages

We're getting numerous error messages such as:
Finding all SRP Devices

We have a function "refresh()" on the ESP BR that retrieves all our devices with SRP enabled. When we execute this function we're getting the following error:

Are we doing something wrong in the refresh function?

    static void refresh()
    {
        // Clear last found devices
        m_devices.clear();
    #if srp_server_enable == 1
        // Iterate over all registered devices
        otError error = OT_ERROR_NONE;
        const otSrpServerHost *host = nullptr;
        // For each host registered with the SRP server
        while ((host = otSrpServerGetNextHost(esp_openthread_get_instance(), host)) != nullptr)
        {
            const otSrpServerService *service = nullptr;
            // Iterate through each service registered by this host
            while ((service = otSrpServerHostGetNextService(host, service)) != nullptr)
            {
                bool isDeleted = otSrpServerServiceIsDeleted(service);
                const uint8_t *txtData;
                uint16_t txtDataLength;
                bool hasSubType = false;
                // Ensure the service is not deleted
                if(otSrpServerServiceIsDeleted(service))
                    continue;
                // Get the service's instance name, which includes the id and service type
                std::string name(otSrpServerServiceGetInstanceName(service));
                // Get the service port number
                uint16_t port = otSrpServerServiceGetPort(service);
                // Get all lease info
                otSrpServerLeaseInfo leaseInfo;
                otSrpServerServiceGetLeaseInfo(service, &leaseInfo);
                // Get how long the lease is
                uint32_t lease = static_cast<uint32_t>(leaseInfo.mLease / 1000);
                // Get how long the key lease is
                uint32_t key_lease = static_cast<uint32_t>(leaseInfo.mKeyLease / 1000);
                // Get the host where the service is registered
                // e.g. my-host.default.service.arpa.
                std::string host_name(otSrpServerHostGetFullName(host));
                // Get host address: for now just take the first ip
                uint8_t addressesNum;
                auto addresses = otSrpServerHostGetAddresses(host, &addressesNum);
                char ip[OT_IP6_ADDRESS_STRING_SIZE];
                otIp6AddressToString(&addresses[0], ip, sizeof(ip));
                // If the device is a CSC
                if(name.find("_DEVICE-SERVICE-TYPE._udp") != std::string::npos)
                {
                    auto serivce_parts = split(name, '.');
                    device found {
                        .host_name = host_name.substr(0, 9),
                        .ot6Address = addresses[1],
                        .string6Address = ip,
                        .eui64 = serivce_parts[0],
                        .port = port,
                    };
                    m_devices[host_name.substr(0, 9)] = found;
                    ESP_LOGD(tag, "Found service %s at host %s using ip [%s]:%d with eui64 %s",
                             name.c_str(),
                             host_name.c_str(),
                             ip,
                             port,
                             serivce_parts[0].c_str());
                }
            }
        }
    #endif
    }

Thanks again!
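The `split` helper called in refresh() is not shown in the post. Below is a minimal sketch of what such a helper might look like, assuming it splits on a single delimiter character and returns the pieces in order. The `firstLabel` wrapper is a hypothetical addition that avoids indexing element 0 of an empty result.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Assumed behaviour: split `text` on `delimiter` and return the pieces in order.
// A trailing delimiter (as in "...service.arpa.") does not produce an empty
// final element, because std::getline stops once the input is exhausted.
std::vector<std::string> split(const std::string &text, char delimiter)
{
    std::vector<std::string> parts;
    std::string part;
    std::istringstream stream(text);
    while (std::getline(stream, part, delimiter))
    {
        parts.push_back(part);
    }
    return parts;
}

// Hypothetical wrapper: returns the first label of a service instance name,
// or an empty string when the name has no labels at all.
std::string firstLabel(const std::string &instanceName)
{
    const std::vector<std::string> parts = split(instanceName, '.');
    return parts.empty() ? std::string() : parts.front();
}
```

With a helper like this, the EUI-64 part of the instance name can be read through `firstLabel(name)` and checked for emptiness before it is stored.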
-
Hello Everyone!
We have a border router based on an ESP32 with around 8-10 devices in the mesh.
The devices are using the OpenThread SRP client and the border router (BR) is using the OpenThread SRP server. The devices are using otSrpClientEnableAutoStartMode to handle the submission of services to the BR's SRP server as required.
The problem we're having is that all 10 devices are submitting and updating their SRP service details to the BR's SRP server at the same time. This causes the BR's ESP32 to crash, as it can't handle the wave of requests.
As such, we want to delay each device's SRP requests/transmissions by a random time. This allows the ESP32 to handle each request successfully (we have proven this).
I was hoping you might have some advice on the best way of implementing this "random delay"?
Thanks again!