Skip to content

Conversation

@adamshiervani
Copy link
Contributor

Previously, the device connected to the cloud API with a WebSocket and just waited. On the client side, the browser would create a peer connection, gather all ICE candidates, then send a long-running HTTP request to the API. The API would look up the right WebSocket connection, forward the session description from the HTTP request, and wait for a response over WS - all while keeping the HTTP request open. Once the device sent back the remote session description over WS, the HTTP request would return it. This setup was a bit fragile.

In local mode, we just did that HTTP request to the device, and it responded with the remote session description.

With this implementation, we implement what MDN calls the perfect negotiation pattern which does Trickle ICE. Instead of waiting for all the ICE candidates, we simply send them once they are ready. For this, we need 2-way communication, so both peers can update each other at any given time.

We chose WebSocket as the transport layer. In cloud mode, the Cloud API now just proxies WebSocket messages - no more long-running HTTP. In local mode, the device runs its own WebSocket server, and the client connects directly to it.

In addition to implementing "Trickle ICE", we also added:

  • Better retry mechanism - Much easier to follow now
  • Proper failure modes for known connection issues, like the ones in our docs
  • Informative Loading screens, with real connection information. No more "Loading video stream…"
  • Improved logging to pinpoint exactly where things fail.

Overall, this will at best solve a lot of the standing connection issues, at worst it will give us a lot of troubleshooting opportunity with the new logs.

Note:

  • In cloud mode, you need the Cloud API changes. I'm currently working on tidying up the code, should soon have a PR. Just try in local mode for now.
  • There are some To-dos left - I will address those
  • Spoke with @ym about the Prometheus metrics, and he will properly tag then in the PR, so the consumer knows what the metrics are about.

Create in draft mode, but the branch in fully testable - no major changes expected

@adamshiervani adamshiervani requested a review from Copilot April 8, 2025 13:24
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 8 out of 10 changed files in this pull request and generated no comments.

Files not reviewed (2)
  • ui/package-lock.json: Language not supported
  • ui/package.json: Language not supported
Comments suppressed due to low confidence (3)

web.go:129

  • Consider renaming 'handLocalWebRTCSignal' to 'handleLocalWebRTCSignal' for clarity and consistency with naming conventions.
func handLocalWebRTCSignal(c *gin.Context) {

ui/src/routes/devices.$id.tsx:290

  • Use strict equality (===) instead of the loose equality (==) when comparing the message type.
isSettingRemoteAnswerPending.current = parsedMessage.type == "answer";

cloud.go:78

  • The help string for 'metricConnectionTotalPingCount' appears to be incomplete; consider updating it to provide a full description.
Help: "The total number of pings sent to the",

@adamshiervani adamshiervani requested a review from Copilot April 8, 2025 13:30
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 8 out of 10 changed files in this pull request and generated no comments.

Files not reviewed (2)
  • ui/package-lock.json: Language not supported
  • ui/package.json: Language not supported
Comments suppressed due to low confidence (4)

ui/src/routes/devices.$id.tsx:235

  • [nitpick] Consider updating the log message to 'isOnDevice' for consistency in casing.
console.log("isondevice", isOnDevice);

ui/src/routes/devices.$id.tsx:410

  • [nitpick] Consider using the unified 'peerConnectionState' variable in the dependency array rather than mixing state references.
peerConnection?.signalingState,

ui/src/routes/devices.$id.tsx:261

  • [nitpick] Consider using a dedicated logger instead of console.log for error handling in the onError callback to maintain consistency with the rest of the code.
onError(event) { console.log("[Websocket] onError", event); ... }

cloud.go:77

  • [nitpick] Consider updating the metric help description to clearly indicate its purpose (for example, 'The total number of ping messages sent over the connection').
Name: "jetkvm_connection_total_ping_count",

@ym ym force-pushed the feat/trickle-ice branch from 18633ba to a9a8df5 Compare April 8, 2025 15:08
@ym ym marked this pull request as ready for review April 8, 2025 16:25
@adamshiervani adamshiervani requested review from Copilot and ym April 8, 2025 20:33
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 8 out of 10 changed files in this pull request and generated no comments.

Files not reviewed (2)
  • ui/package-lock.json: Language not supported
  • ui/package.json: Language not supported
Comments suppressed due to low confidence (1)

cloud.go:286

  • [nitpick] Using an empty string for the source label in cloud connections may reduce metric observability granularity. Consider using a specific identifier (e.g., 'cloud-default') to improve metric labeling.
wsResetMetrics(true, "cloud", "")

@adamshiervani adamshiervani merged commit 1a30977 into dev Apr 8, 2025
5 checks passed
@adamshiervani adamshiervani deleted the feat/trickle-ice branch April 8, 2025 22:10
@Savid Savid mentioned this pull request Apr 9, 2025
ym added a commit to ym/jetkvm-kvm that referenced this pull request Apr 9, 2025
* feat(cloud): Use Websocket signaling in cloud mode

* refactor: Enhance WebRTC signaling and connection handling

* refactor: Improve WebRTC connection management and logging in KvmIdRoute

* refactor: Update PeerConnectionDisconnectedOverlay to use Card component for better UI structure

* refactor: Standardize metric naming and improve websocket logging

* refactor: Rename WebRTC signaling functions and update deployment script for debug version

* fix: Handle error when writing new ICE candidate to WebRTC signaling channel

* refactor: Rename signaling handler function for clarity

* refactor: Remove old http local http endpoint

* refactor: Improve metric help text and standardize comparison operator in KvmIdRoute

* chore(websocket): use MetricVec instead of Metric to store metrics

* fix conflicts

* fix: use wss when the page is served over https

* feat: Add app version header and update WebRTC signaling endpoint

* fix: Handle error when writing device metadata to WebRTC signaling channel

---------

Co-authored-by: Siyuan Miao <i@xswan.net>
ym added a commit to ym/jetkvm-kvm that referenced this pull request Jun 13, 2025
* feat(cloud): Use Websocket signaling in cloud mode

* refactor: Enhance WebRTC signaling and connection handling

* refactor: Improve WebRTC connection management and logging in KvmIdRoute

* refactor: Update PeerConnectionDisconnectedOverlay to use Card component for better UI structure

* refactor: Standardize metric naming and improve websocket logging

* refactor: Rename WebRTC signaling functions and update deployment script for debug version

* fix: Handle error when writing new ICE candidate to WebRTC signaling channel

* refactor: Rename signaling handler function for clarity

* refactor: Remove old http local http endpoint

* refactor: Improve metric help text and standardize comparison operator in KvmIdRoute

* chore(websocket): use MetricVec instead of Metric to store metrics

* fix conflicts

* fix: use wss when the page is served over https

* feat: Add app version header and update WebRTC signaling endpoint

* fix: Handle error when writing device metadata to WebRTC signaling channel

---------

Co-authored-by: Siyuan Miao <i@xswan.net>
ym added a commit to ym/jetkvm-kvm that referenced this pull request Sep 26, 2025
* feat(cloud): Use Websocket signaling in cloud mode

* refactor: Enhance WebRTC signaling and connection handling

* refactor: Improve WebRTC connection management and logging in KvmIdRoute

* refactor: Update PeerConnectionDisconnectedOverlay to use Card component for better UI structure

* refactor: Standardize metric naming and improve websocket logging

* refactor: Rename WebRTC signaling functions and update deployment script for debug version

* fix: Handle error when writing new ICE candidate to WebRTC signaling channel

* refactor: Rename signaling handler function for clarity

* refactor: Remove old http local http endpoint

* refactor: Improve metric help text and standardize comparison operator in KvmIdRoute

* chore(websocket): use MetricVec instead of Metric to store metrics

* fix conflicts

* fix: use wss when the page is served over https

* feat: Add app version header and update WebRTC signaling endpoint

* fix: Handle error when writing device metadata to WebRTC signaling channel

---------

Co-authored-by: Siyuan Miao <i@xswan.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants