Integration of Agora 'uid' Labeling for Speaker Differentiation in RTT Offline Transcripts in webVTT Format #1466

Open
Catherinesjkim opened this issue Sep 14, 2023 · 0 comments
Assignees
Labels
improvement Additions that make docs better

Issue Description:

Background:

Recent interactions with users have highlighted a need for better speaker differentiation in RTT's offline transcripts once they are saved to the customer's storage (for example, an AWS S3 bucket). One suggested approach is to use the 'uid' generated by Agora's SDK as a label for each speaker. This unique 'uid' is returned as a promise whenever a participant joins a call, which makes it a viable candidate for labeling.

Problem:

The current RTT system and its documentation do not explain how to use this 'uid' for speaker differentiation. This omission could hinder clients who want to adopt this solution and keep them from getting the full benefit of our RTT system.

Proposed Solution:

Technical Evaluation: The main concern is how the uid appears in offline transcripts. For example, the uid "941847" is returned when a participant joins the call, but it is not clear to our customers and developers that this value is generated by Agora and is the same label used in the WebVTT-formatted transcript.

[Image: RTT_UID]

Documentation Update: The 'uid' labeling technique was found to be viable and aligns with our platform's standards. Therefore we should proceed with updating the RTT documentation. This update should provide a clear guideline on how to utilize 'uid' for labeling speakers in offline transcripts in webVTT format, ensuring ease of implementation for our users.

Action Items:

Suggestions to update RTT documentation: Incorporating the Agora UID into the WebVTT output, specifically by appending it to the region ID, can give developers a simple and intuitive way to differentiate speakers.

Here’s how Agora developers can make this implementation more obvious:

  1. Integrate UID into webVTT Region Definitions:

A WebVTT file declares its region definitions before the first cue. These definitions can be modified so that each region ID incorporates the Agora UID.

REGION
id:RegionID1_UID_941847
width:40%
lines:1
regionanchor:0%,100%
viewportanchor:10%,90%
scroll:up

  2. Modify Cue Settings:

When writing a cue (a line of dialogue) in the WebVTT file, developers can reference the modified region ID in the cue's region setting.

00:11.000 --> 00:13.000 region:RegionID1_UID_941847
Hello, this is the therapist speaking.

  3. Documentation and Commenting:

Within the WebVTT file, developers can add comments (using NOTE) to explain the new format to users who might be unfamiliar with it.

NOTE
In this transcript, speakers are differentiated using both region IDs and Agora UIDs.
The format is RegionID_UID. E.g., RegionID1_UID_941847 represents speaker 1 (therapist) with UID 941847.
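
The three steps above could be combined roughly as in the following Python sketch. It is only an illustration of the proposed naming convention, not a description of how RTT generates transcripts today: the function names (`region_id_for`, `uid_from_region_id`, `build_transcript`) and the input shapes are hypothetical and are not part of the Agora SDK or RTT.

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumed convention, taken from the examples in this issue: RegionID<n>_UID_<uid>
REGION_ID_PATTERN = re.compile(r"^RegionID(\d+)_UID_(\d+)$")


def region_id_for(index: int, uid: int) -> str:
    """Build a region ID that embeds the Agora uid (hypothetical helper)."""
    return f"RegionID{index}_UID_{uid}"


def uid_from_region_id(region_id: str) -> Optional[int]:
    """Recover the Agora uid from a region ID, or None if it does not match."""
    match = REGION_ID_PATTERN.match(region_id)
    return int(match.group(2)) if match else None


def format_timestamp(seconds: float) -> str:
    """Format a number of seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    millis = round(seconds * 1000)
    hours, rem = divmod(millis, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_transcript(
    speakers: Dict[int, str],
    cues: List[Tuple[float, float, int, str]],
) -> str:
    """Render a WebVTT transcript with one region per speaker.

    speakers maps an Agora uid to a display name; each cue is
    (start_seconds, end_seconds, uid, text).
    """
    lines = [
        "WEBVTT",
        "",
        "NOTE",
        "In this transcript, speakers are differentiated using both region IDs and Agora UIDs.",
        "",
    ]
    region_ids: Dict[int, str] = {}
    # Step 1: region definitions come before any cue.
    for index, (uid, name) in enumerate(speakers.items(), start=1):
        region_id = region_id_for(index, uid)
        region_ids[uid] = region_id
        # Step 3: a NOTE block documents which speaker each region ID represents.
        lines += ["NOTE", f"{region_id} represents {name} with UID {uid}.", ""]
        lines += [
            "REGION",
            f"id:{region_id}",
            "width:40%",
            "lines:1",
            "regionanchor:0%,100%",
            "viewportanchor:10%,90%",
            "scroll:up",
            "",
        ]
    # Step 2: each cue references its speaker's region ID.
    for start, end, uid, text in cues:
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        lines += [f"{timing} region:{region_ids[uid]}", text, ""]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_transcript(
        {941847: "speaker 1 (therapist)"},
        [(11.0, 13.0, 941847, "Hello, this is the therapist speaking.")],
    ))
```

A consumer of the transcript could then call `uid_from_region_id` on each cue's region setting to map lines of dialogue back to the participant who joined with that uid.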

Labels: enhancement, documentation, RTT
Assignees: Sid Sharma sid.sharma@agora.io, Iain Iain@agora.io

@Catherinesjkim Catherinesjkim changed the title Integration of 'uid' Labeling for Speaker Differentiation in RTT Transcripts Integration of Agora 'uid' Labeling for Speaker Differentiation in RTT Offline Transcripts in webVTT Format Sep 16, 2023
@atovpeko atovpeko added the improvement Additions that make docs better label Sep 26, 2023