fix(websocket): release client-lock during WEBSOCKET_EVENT_DATA #704

Open: wants to merge 1 commit into base: master

bryghtlabs-richard (Contributor) commented Dec 2, 2024

Description

Prevent deadlocks when a WEBSOCKET_EVENT_DATA handler tries to take a lock that is held by another thread which is itself trying to send a websocket message.

Fix high latency caused by writers being serialized with the WEBSOCKET_EVENT_DATA handler while calling esp_websocket_client_send(), even when the TCP window has enough space for the entire message being queued to send.

Multiple writers are still serialized at fragment boundaries, but only with other writers and websocket error updates.
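
For readers who want to see the locking pattern in isolation, here is a minimal host-side sketch using pthreads rather than FreeRTOS. It is not the patch itself: `demo_client_t`, `demo_writer`, `demo_on_data` and `demo_dispatch` are hypothetical names, and a plain mutex stands in for the recursive `client->lock`. The handler waits for a writer thread, which is roughly what happens when an application handler blocks on something a sending thread owns.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the websocket client: just the lock and a result flag. */
typedef struct {
    pthread_mutex_t lock;      /* plays the role of client->lock */
    bool writer_got_lock;      /* set by the writer thread */
} demo_client_t;

/* Writer thread: try to take the client lock, as a send call would need to. */
void *demo_writer(void *arg)
{
    demo_client_t *c = arg;
    if (pthread_mutex_trylock(&c->lock) == 0) {
        c->writer_got_lock = true;
        pthread_mutex_unlock(&c->lock);
    } else {
        c->writer_got_lock = false;   /* lock is held by the dispatching thread */
    }
    return NULL;
}

/* Data handler: waits for a writer to finish before returning. */
void demo_on_data(demo_client_t *c)
{
    pthread_t t;
    if (pthread_create(&t, NULL, demo_writer, c) != 0) {
        return;
    }
    pthread_join(t, NULL);
}

/* Dispatch one data event, optionally releasing the lock around the handler.
 * Returns true if the writer was able to take the lock during the handler. */
bool demo_dispatch(bool release_lock_for_handler)
{
    demo_client_t c = { .writer_got_lock = false };
    pthread_mutex_init(&c.lock, NULL);

    pthread_mutex_lock(&c.lock);              /* client task holds the lock */
    if (release_lock_for_handler) {
        pthread_mutex_unlock(&c.lock);        /* drop it before calling user code */
    }
    demo_on_data(&c);
    if (release_lock_for_handler) {
        pthread_mutex_lock(&c.lock);          /* take it back afterwards */
    }
    pthread_mutex_unlock(&c.lock);

    pthread_mutex_destroy(&c.lock);
    return c.writer_got_lock;
}
```

In this sketch the writer uses a try-lock so the blocked case shows up as a `false` result instead of hanging. With a blocking lock and the lock held during dispatch, the handler and the writer would wait on each other indefinitely.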

Related

Fixes #625.

Testing

I tested with a WebSocket echo server running on the same WiFi access point, using the target-example with the following modifications:

  • AppMain thread tries to send 20 unfragmented frames, waiting 100ms between frames.
  • WebSocket client is patched to record the time taken to send each frame.
  • WEBSOCKET_EVENT_DATA handler calls vTaskDelay(configTICK_RATE_HZ); to simulate application-processing.
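
The measurement loop described above could look roughly like the following host-side sketch. It is an illustration under assumptions, not the code used for the attached logs: `demo_send_fn`, `demo_send_frames`, `demo_fake_send` and `demo_now_us` are hypothetical names, and the real send call is replaced by a function pointer so the loop can run without a device.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical send callback; on target this would wrap the websocket send call. */
typedef int (*demo_send_fn)(const char *data, int len, void *ctx);

typedef struct {
    int frames_sent;        /* number of sends that reported success */
    int64_t max_send_us;    /* longest time spent inside a single send */
} demo_send_stats_t;

/* Monotonic clock in microseconds. */
int64_t demo_now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* Stand-in transport that only counts calls; ctx points to an int counter. */
int demo_fake_send(const char *data, int len, void *ctx)
{
    (void)data;
    *(int *)ctx += 1;
    return len;
}

/* Send `count` frames, pausing `period_ms` between them, and time each send. */
demo_send_stats_t demo_send_frames(demo_send_fn send, void *ctx,
                                   int count, long period_ms)
{
    demo_send_stats_t stats = { 0, 0 };
    const char payload[] = "hello";

    for (int i = 0; i < count; i++) {
        int64_t start = demo_now_us();
        int rc = send(payload, (int)sizeof(payload) - 1, ctx);
        int64_t elapsed = demo_now_us() - start;

        if (rc >= 0) {
            stats.frames_sent++;
        }
        if (elapsed > stats.max_send_us) {
            stats.max_send_us = elapsed;
        }

        /* Wait before the next frame (100 ms in the test described above). */
        struct timespec pause = { period_ms / 1000, (period_ms % 1000) * 1000000L };
        nanosleep(&pause, NULL);
    }
    return stats;
}
```

Calling `demo_send_frames(send_fn, ctx, 20, 100)` would mirror the 20 frames at 100ms spacing used in the test. A `max_send_us` close to the handler's one-second delay would indicate that the writer is being serialized with the data handler.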

Without this patch, the AppMain thread experiences long delays, often almost 1 second and sometimes much longer (such as 6 seconds), depending on which cores the AppMain and WebSocketClient threads run on. The AppMain thread is unable to send at 10Hz, even though the TCP send buffer is never full.

With this patch, the AppMain thread experiences no delays writing websocket frames, and is able to send at 10Hz, as long as the TCP send buffer is not filled.

Example logs:
test_before_fix.txt
test_before_fix2.txt
test_with_fix.txt

Checklist

Before submitting a Pull Request, please ensure the following:

  • 🚨 This PR does not introduce breaking changes.
  • All CI checks (GH Actions) pass.
  • Documentation is updated as needed.
  • Tests are updated or added as necessary.
  • Code is well-commented, especially in complex areas.
  • Git history is clean — commits are squashed to the minimum necessary.

This resolves:

 1) Deadlock when trying to reserve a lock in WEBSOCKET_EVENT_DATA,
    but lock is held by a thread trying to send a websocket message.
 2) High latency caused by writers serialized with WEBSOCKET_EVENT_DATA
    while calling esp_websocket_client_send(), even when TCP window
    has enough space for the entire message being queued to send.

Multiple writers are still serialized at fragment boundaries, but
only with other writers and websocket error updates.

Fixes espressif#625
xSemaphoreTakeRecursive(client->lock, lock_timeout);
esp_websocket_client_abort_connection(client, WEBSOCKET_ERROR_TYPE_TCP_TRANSPORT);
xSemaphoreGiveRecursive(client->lock);
break;
Contributor Author

this break is probably wrong
