-
Notifications
You must be signed in to change notification settings - Fork 129
Agent SDK - ported on top of components-js primatives #1207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
ce80b15
to
ef0fed7
Compare
packages/core/src/messages/types.ts
Outdated
type ReceivedMessageWithType< | ||
Type extends string, | ||
Metadata extends {} = {}, | ||
> = { | ||
id: string; | ||
timestamp: number; | ||
|
||
type: Type; | ||
|
||
from?: Participant; | ||
attributes?: Record<string, string>; | ||
} & Metadata; | ||
|
||
/** @public */ | ||
export type ReceivedChatMessage = ReceivedMessageWithType<'chatMessage', ChatMessage & { | ||
from?: Participant; | ||
attributes?: Record<string, string>; | ||
}>; | ||
|
||
export type ReceivedUserTranscriptionMessage = ReceivedMessageWithType<'userTranscript', { | ||
message: string; | ||
}>; | ||
|
||
export type ReceivedAgentTranscriptionMessage = ReceivedMessageWithType<'agentTranscript', { | ||
message: string; | ||
}>; | ||
|
||
/** @public */ | ||
export type ReceivedMessage = | ||
| ReceivedUserTranscriptionMessage | ||
| ReceivedAgentTranscriptionMessage | ||
| ReceivedChatMessage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ported the existing ReceivedMessage
abstraction from here on top of the pre-existing ReceivedChatMessage
- this means that now,ReceivedChatMessage
is now a ReceivedMessage
subtype.
Note ReceivedChatMessage
has one new type
field addition which acts as the discriminant key in ReceivedMessage
, but otherwise is identical. So this should be a fully backwards compatible change even though behind the scenes a lot has been updated.
how about proxying some of these on the return value of |
I opted to leave that out for now because of the hesitancy from ben/dz around new track abstractions. That being said, I mentioned in the comment above that I had been proposing a new hook, It sounds like you are pushing for that to exist now vs deferring it? If so I can add that new hook to this branch or possibly figure out how to fit it into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, i like the direction where this is going.
This allows functions to limit whether they just want to take in TrackReferences from a given source - ie, the VideoTrack could be made to only accept TrackReference<Track.Source.Camera | Track.Source.Screenshare | Track.Source.Unknown>.
Note that just the return values are changing, not the argument definitions in other spots, so this shouldn't be a backwards compatibility issue.
…viously didn't accept it
The pre-existing state was broken.
…in useParticipantTracks
…s going to be important for future multi-agent type use cases
Now that livekit/client-sdk-js#1645 has been merged, this pull request has been updated to build on top of it. Some other recent decisions worth noting down:
function OtherComponent() {
const messages = useSessionMessages();
// ... use messages for something
return null;
}
function Main() {
const session = useSession(...);
return (
<SessionProvider session={session}>
{/* example component that uses "session" from context */}
<OtherComponent />
{/* example component that uses "room" from context */}
<RoomAudioRenderer />
</SessionProvider>
);
} |
size-limit report 📦
|
…rnings Warning: src/hooks/useAgent.ts:31:18 - (tsdoc-escape-greater-than) The ">" character should be escaped using a backslash to avoid confusion with an HTML tag Warning: src/hooks/useAgent.ts:31:31 - (tsdoc-escape-greater-than) The ">" character should be escaped using a backslash to avoid confusion with an HTML tag Warning: src/hooks/useAgent.ts:34:18 - (tsdoc-escape-greater-than) The ">" character should be escaped using a backslash to avoid confusion with an HTML tag Warning: src/hooks/useAgent.ts:34:34 - (tsdoc-escape-greater-than) The ">" character should be escaped using a backslash to avoid confusion with an HTML tag Warning: src/hooks/useAgent.ts:37:18 - (tsdoc-escape-greater-than) The ">" character should be escaped using a backslash to avoid confusion with an HTML tag Warning: src/hooks/useAgent.ts:37:44 - (tsdoc-escape-greater-than) The ">" character should be escaped using a backslash to avoid confusion with an HTML tag Warning: src/hooks/useAgent.ts:40:20 - (tsdoc-escape-greater-than) The ">" character should be escaped using a backslash to avoid confusion with an HTML tag Warning: src/hooks/useAgent.ts:40:34 - (tsdoc-escape-greater-than) The ">" character should be escaped using a backslash to avoid confusion with an HTML tag Warning: src/hooks/useAgent.ts:40:50 - (tsdoc-escape-greater-than) The ">" character should be escaped using a backslash to avoid confusion with an HTML tag
I need to include the useSession docstring twice, apparently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, definitely should wait for @lukasIO review.
id: transcription.streamInfo.id, | ||
timestamp: transcription.streamInfo.timestamp, | ||
from: Array.from(room.remoteParticipants.values()).find( | ||
(p) => p.identity === transcription.participantInfo.identity, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what will happen if Array.from(room.remoteParticipants.values()).find() returns undefined here ?
want to make sure we handle this use case in an expected way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question - the from
property on ReceivedMessageWithType
(this data's type) is optional, so I think this is fine:
type ReceivedMessageWithType<Type extends string, Metadata extends object = object> = {
id: string;
timestamp: number;
type: Type;
from?: Participant;
attributes?: Record<string, string>;
} & Metadata;
That being said, the default:
case in this switch is out of the happy path and hasn't been deeply thought through (see the FIXME
comment above it), so it's possible it might be a better idea to throw an error or something instead of trying to fail gracefully and (maybe) resulting in incorrect data.
const room = useRoomContext(); | ||
const speakerObserver = React.useMemo(() => activeSpeakerObserver(room), [room]); | ||
const activeSpeakers = useObservableState(speakerObserver, room.activeSpeakers); | ||
export function useSpeakingParticipants(room?: Room) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd really like to avoid passing an optional room to all hooks.
For components it's not a big deal as the props are objects anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We chatted about this in slack a bit - this is being done in support of the single file example use case where creating and managing a context is burdensome. Assuming this continues to be something worth prioritizing, I'm not sure exactly how else to accomplish this other than adding an explicit room
or session
prop / option to both components and hooks.
Maybe a compromise could be introducing an options
parameter rather than an explicit new parameter in cases where a brand new parameter would be required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming this continues to be something worth prioritizing,
assuming "single-component" is a goal (single file would still be possible with contexts and multiple components defined within), I definitely prefer an options object over explicitly passing the room, yeah.
Longer term I think it might be worth discussing migrating away from that pattern since react won't be tree-shaken properly in end projects
This was causing prepareConnection to get run constantly since its reference depends on restOptions which changes every call. The way I see it, if a user is changing their options and wants prepareconnection to be run for something beyond the initial set (probably unusual), they can call it themselves on the returned session object.
This change comprises the new client agents sdk, a set of react hooks that are being built to make interaction with the livekit agents framework less complex.
This is version 3 - version 1 can be found here, and version 2 can be found here. Each step it has evolved significantly based on comments and perspectives from people who have taken a look!
Single file example
New API surface area
useSession(tokenSource: TokenSourceFixed | TokenSourceConfigurable, options: UseSessionConfigurableOptions | UseSessionFixedOptions): UseSessionReturn
A thin wrapper around a
Room
which handles connecting to a room and dispatching a given agent into that room (or in the future, maybe multiple agents?). In the future it will probably become thicker as more global agent state is required.useAgent(session: UseSessionReturn): Agent
A much more advanced version of the previously existing
useVoiceAssistant
hook - tracks the agent's state within the conversation, manages agent connection timeouts / other failures, and largely maintains backwards compatibility with existing interfaces.useSessionMessages
A mechanism for interacting with
ReceivedMessage
s across the whole conversation. AReceivedMessage
can be aReceivedChatMessage
(already exists today), or aReceivedUserTranscriptionMessage
/ReceivedAgentTranscriptionMessage
(both brand new). This is exposed at the conversation level so in a future world where multiple agents are within a conversation, this hook will return messages from all of themAdditional refactoring / cleanup
ParticipantAgentAttributes
constant and ported all usages oflk.
-prefixed attributes (which previously were just magic strings in the code) to refer to this enum.handleMediaDeviceError
callback function inuseLiveKitRoom
room
parameter to a few hooks and components that didn't support it previously, to make single file example type scenarios easier:RoomAudioRenderer
StartAudio
useChat
useTextStream
useTrackToggle
useTranscriptions