feat(firebaseai): add bidi transcript #17700
base: main
/gemini summarize
This pull request introduces bidirectional transcription capabilities for Firebase AI. The feature enhances the live generation experience by providing real-time text representations of both user input audio and model output audio.
/gemini review
Code Review
This pull request adds support for bidirectional transcription in Firebase AI. The changes are well-structured, introducing new configurations and handling transcription messages. I've identified a few areas for improvement, including an unused variable, opportunities to reduce code duplication, and a suggestion to maintain immutability in a data class. My detailed comments and code suggestions aim to enhance code quality and maintainability.
if (message.inputTranscription?.text != null) {
  final transcription = message.inputTranscription!;
  if (_inputTranscriptionMessageIndex != null) {
    // TODO(cynthia): find a better way to update the message
    _messages[_inputTranscriptionMessageIndex!].text =
        '${_messages[_inputTranscriptionMessageIndex!].text}${transcription.text!}';
  } else {
    _messages.add(MessageData(
        text: 'Input transcription: ${transcription.text!}',
        fromUser: true));
    _inputTranscriptionMessageIndex = _messages.length - 1;
  }
  if (transcription.finished ?? false) {
    _inputTranscriptionMessageIndex = null;
  }
  setState(_scrollDown);
}
if (message.outputTranscription?.text != null) {
  final transcription = message.outputTranscription!;
  if (_outputTranscriptionMessageIndex != null) {
    _messages[_outputTranscriptionMessageIndex!].text =
        '${_messages[_outputTranscriptionMessageIndex!].text}${transcription.text!}';
  } else {
    _messages.add(
      MessageData(
        text: 'Output transcription: ${transcription.text!}',
        fromUser: false,
      ),
    );
    _outputTranscriptionMessageIndex = _messages.length - 1;
  }
  if (transcription.finished ?? false) {
    _outputTranscriptionMessageIndex = null;
  }
  setState(_scrollDown);
}
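The two branches above differ only in the label, the `fromUser` flag, and which index field they track. One way to fold them together is sketched below. `TranscriptAccumulator` and its `apply` method are hypothetical names introduced for illustration, and the `Transcription` and `MessageData` classes here are minimal stand-ins for the real types, so treat this as a sketch under those assumptions rather than the PR's implementation.

```dart
// Minimal stand-ins for the types used in the snippet above (assumptions).
class Transcription {
  const Transcription({this.text, this.finished});
  final String? text;
  final bool? finished;
}

class MessageData {
  MessageData({this.text, this.fromUser});
  String? text; // Mutable only to mirror the code under review.
  final bool? fromUser;
}

/// Hypothetical helper: tracks the in-progress message for one direction.
class TranscriptAccumulator {
  TranscriptAccumulator({required this.label, required this.fromUser});
  final String label;
  final bool fromUser;
  int? _index; // Index of the message currently being extended, if any.

  /// Appends [chunk] to the open message, or starts a new message.
  /// Returns true when [messages] was changed.
  bool apply(Transcription? chunk, List<MessageData> messages) {
    final text = chunk?.text;
    if (chunk == null || text == null) return false;
    final index = _index;
    if (index != null) {
      messages[index].text = '${messages[index].text}$text';
    } else {
      messages.add(MessageData(text: '$label: $text', fromUser: fromUser));
      _index = messages.length - 1;
    }
    // A finished chunk closes the message; the next chunk starts a new one.
    if (chunk.finished ?? false) _index = null;
    return true;
  }
}

void main() {
  final messages = <MessageData>[];
  final input =
      TranscriptAccumulator(label: 'Input transcription', fromUser: true);
  input.apply(const Transcription(text: 'Hel'), messages);
  input.apply(const Transcription(text: 'lo', finished: true), messages);
  input.apply(const Transcription(text: 'Next'), messages);
  // prints [Input transcription: Hello, Input transcription: Next]
  print(messages.map((m) => m.text).toList());
}
```

In the widget, each branch would then reduce to a call such as `if (input.apply(message.inputTranscription, _messages)) setState(_scrollDown);`, with one accumulator per direction.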
  });
  final Uint8List? imageBytes;
- final String? text;
+ String? text;
Changing text to be a mutable field makes the MessageData class mutable. In Flutter, it's a best practice to use immutable data classes to prevent unexpected side effects and make state management more predictable.
I recommend reverting this to a final field and adding a copyWith method to MessageData. This allows you to create a new instance with updated data, which aligns better with declarative UI principles.
Here's an example of how you could implement copyWith:
class MessageData {
  // ... existing final properties
  MessageData copyWith({
    String? text,
    // ... other properties
  }) {
    return MessageData(
      text: text ?? this.text,
      // ... other properties
    );
  }
}
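To make the suggestion concrete, here is a self-contained sketch of an immutable `MessageData` and of how the transcript update could replace the list entry instead of mutating it. The field names `imageBytes`, `text`, and `fromUser` come from the snippets in this review. The `bool?` type of `fromUser` and the exact constructor shape are assumptions.

```dart
import 'dart:typed_data';

class MessageData {
  const MessageData({this.imageBytes, this.text, this.fromUser});
  final Uint8List? imageBytes;
  final String? text;
  final bool? fromUser; // Type assumed for this sketch.

  /// Returns a copy with the given fields replaced.
  /// Fields that are omitted (or passed as null) keep their current value.
  MessageData copyWith({
    Uint8List? imageBytes,
    String? text,
    bool? fromUser,
  }) {
    return MessageData(
      imageBytes: imageBytes ?? this.imageBytes,
      text: text ?? this.text,
      fromUser: fromUser ?? this.fromUser,
    );
  }
}

void main() {
  final messages = [
    const MessageData(text: 'Input transcription: Hel', fromUser: true),
  ];
  // Replace the entry with an updated copy rather than assigning to a field.
  final current = messages[0];
  messages[0] = current.copyWith(text: '${current.text}lo');
  print(messages[0].text); // prints Input transcription: Hello
}
```

One limitation of the `??` pattern is that `copyWith` cannot reset a field to null. That is acceptable for appending transcript text but worth knowing before reusing the pattern elsewhere.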
Suggested change:
- String? text;
+ final String? text;
Transcription? inputTranscription;
if (serverContentJson.containsKey('inputTranscription')) {
  final transcriptionJson =
      serverContentJson['inputTranscription'] as Map<String, dynamic>;
  inputTranscription = Transcription(
    text: transcriptionJson['text'] as String?,
    finished: transcriptionJson['finished'] as bool?,
  );
}
Transcription? outputTranscription;
if (serverContentJson.containsKey('outputTranscription')) {
  final transcriptionJson =
      serverContentJson['outputTranscription'] as Map<String, dynamic>;
  outputTranscription = Transcription(
    text: transcriptionJson['text'] as String?,
    finished: transcriptionJson['finished'] as bool?,
  );
}
return LiveServerContent(
  modelTurn: modelTurn,
  turnComplete: turnComplete,
  interrupted: interrupted,
  inputTranscription: inputTranscription,
  outputTranscription: outputTranscription,
);
The logic for parsing inputTranscription and outputTranscription is duplicated. To improve code quality and reduce redundancy, you can extract this logic into a local helper function within _parseServerMessage.
Suggested change:
Transcription? parseTranscription(String key) {
  if (serverContentJson.containsKey(key)) {
    final transcriptionJson =
        serverContentJson[key] as Map<String, dynamic>;
    return Transcription(
      text: transcriptionJson['text'] as String?,
      finished: transcriptionJson['finished'] as bool?,
    );
  }
  return null;
}
return LiveServerContent(
  modelTurn: modelTurn,
  turnComplete: turnComplete,
  interrupted: serverContentJson['interrupted'] as bool?,
  inputTranscription: parseTranscription('inputTranscription'),
  outputTranscription: parseTranscription('outputTranscription'),
);
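The helper can be tried in isolation with a decoded JSON payload. The sketch below is a variation on the suggestion, not a copy of it: it is written as a top-level function that takes the map as a parameter (so it runs outside `_parseServerMessage`), and it uses a type test instead of a cast, so a missing or non-object value yields null rather than throwing. The `Transcription` class is a minimal stand-in and the sample payload is invented for illustration.

```dart
import 'dart:convert';

// Minimal stand-in for the SDK's Transcription type (assumption).
class Transcription {
  const Transcription({this.text, this.finished});
  final String? text;
  final bool? finished;
}

/// Parses the transcription stored under [key], or returns null when the
/// key is absent or does not hold a JSON object.
Transcription? parseTranscription(
  Map<String, dynamic> serverContentJson,
  String key,
) {
  final value = serverContentJson[key];
  if (value is! Map<String, dynamic>) return null;
  return Transcription(
    text: value['text'] as String?,
    finished: value['finished'] as bool?,
  );
}

void main() {
  // Invented sample payload; only the key names come from the PR.
  final serverContentJson = jsonDecode(
    '{"inputTranscription": {"text": "Hello", "finished": true}}',
  ) as Map<String, dynamic>;

  final input = parseTranscription(serverContentJson, 'inputTranscription');
  final output = parseTranscription(serverContentJson, 'outputTranscription');
  print('${input?.text} ${input?.finished} ${output == null}');
  // prints Hello true true
}
```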
if (_liveGenerationConfig != null &&
    _liveGenerationConfig.inputAudioTranscription != null)
  'input_audio_transcription':
      _liveGenerationConfig.inputAudioTranscription!.toJson(),
if (_liveGenerationConfig != null &&
    _liveGenerationConfig.outputAudioTranscription != null)
  'output_audio_transcription':
      _liveGenerationConfig.outputAudioTranscription!.toJson(),
The check for _liveGenerationConfig != null is repeated. You can make this code cleaner and more readable by using a local variable and a collection-if with a spread operator.
final liveConfig = _liveGenerationConfig;
if (liveConfig != null) ...{
  if (liveConfig.inputAudioTranscription != null)
    'input_audio_transcription':
        liveConfig.inputAudioTranscription!.toJson(),
  if (liveConfig.outputAudioTranscription != null)
    'output_audio_transcription':
        liveConfig.outputAudioTranscription!.toJson(),
},
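As written, the suggestion mixes a statement with map elements: the `final liveConfig = ...` declaration has to sit before the map literal, while the `if (...) ...{ }` element goes inside it. A runnable sketch of that arrangement follows. `AudioTranscriptionConfig`, `LiveGenerationConfig`, `buildSetup`, and the `'model'` entry are placeholders assumed for this example, not the SDK's actual definitions.

```dart
// Placeholder config types for the sketch (assumptions).
class AudioTranscriptionConfig {
  const AudioTranscriptionConfig();
  Map<String, Object?> toJson() => {};
}

class LiveGenerationConfig {
  const LiveGenerationConfig({
    this.inputAudioTranscription,
    this.outputAudioTranscription,
  });
  final AudioTranscriptionConfig? inputAudioTranscription;
  final AudioTranscriptionConfig? outputAudioTranscription;
}

/// Builds a setup map, adding transcription entries only when configured.
Map<String, Object?> buildSetup(LiveGenerationConfig? liveConfig) {
  // liveConfig is a local (here a parameter), so the null check promotes it.
  return {
    'model': 'example-model', // Placeholder entry.
    if (liveConfig != null) ...{
      // Fields of another class are not promoted, hence the `!`.
      if (liveConfig.inputAudioTranscription != null)
        'input_audio_transcription':
            liveConfig.inputAudioTranscription!.toJson(),
      if (liveConfig.outputAudioTranscription != null)
        'output_audio_transcription':
            liveConfig.outputAudioTranscription!.toJson(),
    },
  };
}

void main() {
  print(buildSetup(null).keys.toList()); // prints [model]
  const config = LiveGenerationConfig(
    inputAudioTranscription: AudioTranscriptionConfig(),
  );
  print(buildSetup(config).keys.toList());
  // prints [model, input_audio_transcription]
}
```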