-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added agent to get video transcripts. #72
base: main
Are you sure you want to change the base?
Added agent to get video transcripts. #72
Conversation
Getting the error:
|
|
||
logger = logging.getLogger(__name__) | ||
|
||
class VideoTranscriptionAgent(BaseAgent): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming should be consistent.
1 - Existing agent file doesn't have _agent
transcript_agent
-> transcription.py
2 - Agent name VideoTranscriptionAgent
-> TranscriptionAgent
(since in future we may reuse this for audio transcription as well)
3 - agent_name video_transcription
-> transcription
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sarfarazsiddiquii agent name is still video_transcription can you make it simply transcription
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been resolved in the recent commit. Sorry for overlooking it.
Hey @ashish-spext, the error |
Awesome! Let me test it in a while. |
data={"video_id": video_id, "transcript": output_text}, | ||
) | ||
|
||
def _group_transcript_with_timestamps(self, transcript_text: str, time_range: int) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The grouping logic is not correct.
If you will test it you will get only one block like this:
Reason: There are no new lines in transcription text, and even if they were new line representing the given range (2 minutes in case of default) is wrong.
Correct way would be to use the transcription dictionary that VideoDB tool is sending it has timing information unlike the transcription text that is being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ashish-spext I’m having a hard time understanding the structure of the transcription dictionary returned by the VideoDB tool.
I think the output of the get_transcript()
method, when called with text=False
, will give us transcript details.
However, I’m unable to see any changes I’ve made or test the app due to API key limitations:
can you please provide more information about how timing information is stored in transcription dictionary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This issue can be resolved by adding free LLM models. Merging this PR will fix the problem and it would be helpful for solving similar issues in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sarfarazsiddiquii We have resolved this issue by adding an OpenAI proxy. An OpenAI key is no longer required. Please pull the latest changes from the main branch, ensure that no OpenAI key is present in the .env file, and test the transcript.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ankit-v2-3, Thank you for the update, the code is now testable.
I’ve fixed the grouping logic in the latest commit, the agent will now properly group the transcription text into 2 minute intervals by default unless time interval is defined.
Let me know if any change is required.
|
||
output_text_content.text = output_text | ||
output_text_content.status = MsgStatus.success | ||
output_text_content.status_message = "Transcription completed successfully." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Message like "Here is your transcription" would be better since we are using that for title of the trascription.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense. I’ve corrected the changes in the recent commit.
…into TranscriptAgent
Hi @ashish-spext, I've made the requested changes. |
Thanks for the changes. |
WalkthroughThe changes introduce a new Changes
Possibly related PRs
Suggested reviewers
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
backend/director/agents/transcription.py
(1 hunks)backend/director/handler.py
(2 hunks)
🔇 Additional comments (1)
backend/director/handler.py (1)
25-25
: Integration of TranscriptionAgent
is implemented correctly
The TranscriptionAgent
is properly imported (line 25) and added to the agents list (line 61) in the ChatHandler
class. This allows the agent to be utilized within the application as intended.
Also applies to: 61-61
self.output_message.actions.append("Trying to get the video transcription...") | ||
output_text_content = TextContent( | ||
agent_name=self.agent_name, | ||
status_message="Processing the transcription...", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please keep the ellipses (..) to a max of two in all status messages and actions?
Added an agent (in reference to issue #70) to retrieve video transcripts in raw format or with timestamps.
Summary by CodeRabbit
New Features
TranscriptionAgent
, enabling video transcription with optional timestamp formatting.TranscriptionAgent
into theChatHandler
for enhanced functionality.Bug Fixes