Merge pull request #97 from video-db/ashu/fix-launch-day
Improved docs and readme
codeAshu authored Dec 3, 2024
2 parents a2de5b8 + b79690f commit c685fc9
Showing 5 changed files with 61 additions and 26 deletions.
14 changes: 5 additions & 9 deletions README.md
@@ -31,15 +31,13 @@
</p>
<br />
<p align="center">
<a href="https://www.youtube.com/playlist?list=PLhxAMFLSSK039xl1UgcZmoFLnb-qNRYQw"><strong>⚡️Watch Agent Demos</strong></a>
<a href="https://www.youtube.com/playlist?list=PLhxAMFLSSK039xl1UgcZmoFLnb-qNRYQw" target="_blank" rel="noopener noreferrer"><strong>⚡️Watch Agent Demos</strong></a>
&nbsp;&nbsp;&nbsp;
<a href="https://chat.videodb.io"><strong>✨Try Hosted Version</strong></a>
<a href="https://chat.videodb.io" target="_blank" rel="noopener noreferrer"><strong>✨Try Hosted Version</strong></a>
<br /><br />
<a href="https://github.com/video-db/Director/issues/new?assignees=&labels=bug&projects=&template=bug_report.yml">Report Bug</a>
·
<a href="https://github.com/video-db/Director/issues/new?assignees=&labels=enhancement&projects=&template=feature_request.yml">Request Feature</a>
·
<a href="https://github.com/video-db/Director/issues/new?assignees=&labels=enhancement&projects=&template=agent_request.yml">New Agent Request</a>
<a href="https://docs.director.videodb.io/index.html" target="_blank" rel="noopener noreferrer">📖 Documentation</a>
&nbsp;&nbsp;&nbsp;
<a href="https://github.com/video-db/Director/issues/new?assignees=&labels=enhancement&projects=&template=agent_request.yml" target="_blank" rel="noopener noreferrer">👩‍💻New Agent Request</a>
</p>
</p>
</p>
@@ -135,8 +133,6 @@ The Reasoning Engine works in tandem with the chat-based UI, making video intera
For a closer look, check out the detailed architecture diagram below:
![Reasoning Engine Architecture](https://github.com/user-attachments/assets/13a92f0d-5b66-4a95-a2d4-0b73aa359ca6)

Explore how the Reasoning Engine powers The Director to simplify and supercharge your media workflows.



## 🏃 Getting Started
45 changes: 35 additions & 10 deletions docs/concepts/overview.md
@@ -1,12 +1,30 @@
## ⚙️ Architecture Overview
Director's architecture brings together:

- **Backend Reasoning Engine:** Handles workflows and decision-making. See the <a href="https://github.com/video-db/Director/tree/main/backend" target="_blank" rel="noopener noreferrer">backend folder</a> in the Director codebase.
- **Chat-Based UI:** Engage with your media library conversationally. See <a href="https://github.com/video-db/videodb-chat" target="_blank" rel="noopener noreferrer">videodb-chat</a> for the source code.
- **Video Player:** Advanced playback and interaction tools. See <a href="https://github.com/video-db/videodb-player" target="_blank" rel="noopener noreferrer">videodb-player</a> for details about the multi-platform video player.
- **Collection View:** Organize and browse your media effortlessly.

![Director architecture](https://github.com/user-attachments/assets/9afb2783-66db-4899-9308-03cbd12e74d7)

## Reasoning Engine

The Reasoning Engine is the core component that directly interfaces with the user. It interprets natural language input in any conversation and orchestrates agents to fulfill the user's requests. The primary functions of the Reasoning Engine are:

* Maintain Context of Conversational History: Manage memory, context limits, input, and output experiences to ensure coherent and context-aware interactions.
* Natural Language Understanding (NLU): Use the LLMs of your choice to understand the task.
* Intelligent Reference Deduction: Intelligently deduce references to previous messages, outputs, files, agents, etc., to provide relevant and accurate responses.
* Agent Orchestration: Decide on agents and their workflows to fulfill requests. Multiple strategies can be employed to create agent workflows, such as step-by-step processes or chaining of agents provided by default.
* Final Control Over Conversation Flow: Maintain ultimate control over the flow of conversation with the user, ensuring coherence and goal alignment.
* **Maintain Context of Conversational History:** Manage memory, context limits, input, and output experiences to ensure coherent and context-aware interactions.
* **Natural Language Understanding (NLU):** Use the LLMs of your choice to understand the task.
* **Intelligent Reference Deduction:** Intelligently deduce references to previous messages, outputs, files, agents, etc., to provide relevant and accurate responses.
* **Agent Orchestration:** Decide on agents and their workflows to fulfill requests. Multiple strategies can be employed to create agent workflows, such as step-by-step processes or chaining of agents provided by default.
* **Final Control Over Conversation Flow:** Maintain ultimate control over the flow of conversation with the user, ensuring coherence and goal alignment.

### **See It in Action**
The Reasoning Engine works in tandem with the chat-based UI, making video interaction intuitive and efficient. For example:
- **Input**: "Create a clip of the funniest scene in this video and share it on Slack."
- **Output**: The engine orchestrates upload, scene detection, clipping, and sharing agents to deliver results seamlessly. Watch the video [here](https://www.youtube.com/watch?v=fxhMgQf7v8s&list=PLhxAMFLSSK039xl1UgcZmoFLnb-qNRYQw&index=3).
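
The flow in this example can be sketched in Python. This is a minimal illustration under stated assumptions, not Director's actual implementation: the `Session` class, the four agent functions, the `AGENTS` registry, and `plan` are all hypothetical names, and the keyword-based `plan` only stands in for the LLM-driven planning a real engine would perform.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Session:
    """Shared state handed from one agent to the next (hypothetical)."""
    request: str
    state: Dict[str, str] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)


# Each agent is modeled as a plain function that reads and updates the session.
Agent = Callable[[Session], None]


def upload_agent(session: Session) -> None:
    session.state["video_id"] = "video-123"  # placeholder ID, no real upload
    session.log.append("upload")


def scene_detection_agent(session: Session) -> None:
    session.state["scene"] = f"funniest scene of {session.state['video_id']}"
    session.log.append("scene_detection")


def clipping_agent(session: Session) -> None:
    session.state["clip"] = f"clip({session.state['scene']})"
    session.log.append("clipping")


def sharing_agent(session: Session) -> None:
    session.state["shared_to"] = "slack"
    session.log.append("sharing")


AGENTS: Dict[str, Agent] = {
    "upload": upload_agent,
    "scene_detection": scene_detection_agent,
    "clipping": clipping_agent,
    "sharing": sharing_agent,
}


def plan(request: str) -> List[str]:
    """Stand-in for the planning step: pick agents from keywords in the request."""
    text = request.lower()
    steps = ["upload"]
    if "clip" in text:
        steps += ["scene_detection", "clipping"]
    if "slack" in text:
        steps.append("sharing")
    return steps


def run(request: str) -> Session:
    """Run the planned agents in order, chaining them through one session."""
    session = Session(request=request)
    for name in plan(request):
        AGENTS[name](session)
    return session


session = run("Create a clip of the funniest scene in this video and share it on Slack.")
print(session.log)
```

Keeping the plan as a list of agent names separates deciding what to do from doing it, which is one way to support both step-by-step and chained workflows.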

For a closer look, check out the detailed architecture diagram below:
![Reasoning Engine Architecture](https://github.com/user-attachments/assets/13a92f0d-5b66-4a95-a2d4-0b73aa359ca6)


## Agents
@@ -22,17 +40,24 @@ For example, the task "Give me a summary of this video" can be accomplished by c



Key aspects of Agents include:
### Key aspects of Agents include:

* **Task Autonomy:** Agents perform tasks independently, utilizing tools to achieve their objectives.
* **Unique User Experiences (UX):** Each agent offers a distinct user experience, enhancing engagement and satisfaction. Multiple agents for the same task offer personalized interactions and cater to different user preferences like loading a specific UI or just a text message.
* **Standardized Agent Interface:** Agents communicate with the Reasoning Engine through a common API or protocol, ensuring consistent integration and interaction.
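
To make the idea of a standardized interface concrete, here is a minimal Python sketch. It assumes a shape for the contract rather than reproducing Director's real classes: `AgentInterface`, `AgentResult`, `safe_run`, and `WordCountSummaryAgent` are hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class AgentResult:
    """Uniform result shape that every agent returns (hypothetical)."""
    status: str  # "success" or "error"
    message: str
    data: Dict[str, Any]


class AgentInterface(ABC):
    """Common contract an engine could rely on for any agent (hypothetical)."""
    name: str = ""
    description: str = ""

    @abstractmethod
    def run(self, **kwargs: Any) -> AgentResult:
        """Perform the agent's task and return a uniform result."""

    def safe_run(self, **kwargs: Any) -> AgentResult:
        """Call run() and convert any exception into an error result."""
        try:
            return self.run(**kwargs)
        except Exception as exc:
            return AgentResult("error", str(exc), {})


class WordCountSummaryAgent(AgentInterface):
    name = "summary"
    description = "Toy agent: reports the word count and the first five words."

    def run(self, transcript: str) -> AgentResult:
        words = transcript.split()
        preview = " ".join(words[:5])
        return AgentResult("success", f"{len(words)} words", {"preview": preview})


result = WordCountSummaryAgent().safe_run(transcript="one two three four five six")
print(result.status, result.message, result.data)
```

Because every agent returns the same result type and failures are caught in one place, the caller can treat all agents identically regardless of what each one does internally.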

### Agent Examples

1. Highlight Creator: <a href="https://www.youtube.com/watch?v=Dncn_0RWrro&list=PLhxAMFLSSK039xl1UgcZmoFLnb-qNRYQw&index=11" target="_blank" rel="noopener noreferrer">link</a>
2. Text to Movie: <a href="https://www.youtube.com/watch?v=QpnRxuEBDCc&list=PLhxAMFLSSK039xl1UgcZmoFLnb-qNRYQw&index=2" target="_blank" rel="noopener noreferrer">link</a>
3. Video Search: <a href="https://www.youtube.com/watch?v=kCiCI2KCnC8&list=PLhxAMFLSSK039xl1UgcZmoFLnb-qNRYQw&index=4" target="_blank" rel="noopener noreferrer">link</a>

* Task Autonomy: Agents perform tasks independently, utilizing tools to achieve their objectives.
* Unique User Experiences (UX): Each agent offers a distinct user experience, enhancing engagement and satisfaction. Multiple agents for the same task offer personalized interactions and cater to different user preferences like loading a specific UI or just a text message.
* Standardized Agent Interface: Agents communicate with the Reasoning Engine through a common API or protocol, ensuring consistent integration and interaction.

## Tools

Tools are functional building blocks that can be created from any library and used within agents. They are the functions that enable agents to perform their tasks. For example, we have created an upload tool that wraps the videodb upload function, and another tool that wraps an index function with parameters.
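
The wrapper pattern described here might look like the following Python sketch. It deliberately avoids calling any real library: `fake_upload` is a hypothetical stand-in for a library's upload function, and `upload_tool` and its return shape are assumptions made for illustration only.

```python
from typing import Callable, Dict


def fake_upload(url: str) -> Dict[str, str]:
    """Hypothetical stand-in for a library's upload call; uploads nothing."""
    return {"id": "video-123", "source": url}


def upload_tool(
    url: str,
    uploader: Callable[[str], Dict[str, str]] = fake_upload,
) -> Dict[str, str]:
    """Validate the input, delegate to the wrapped function, normalize the output."""
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"unsupported URL: {url!r}")
    raw = uploader(url)
    return {"video_id": raw["id"], "source": raw["source"]}


uploaded = upload_tool("https://example.com/demo.mp4")
print(uploaded)
```

Passing the wrapped function in as a parameter keeps the tool reusable across agents and lets it be exercised without network access.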

Key aspects of Tools include:
### Key aspects of Tools include:

* Functional Building Blocks: Serve as modular functions that agents can utilize to perform tasks efficiently.
* Wrapper Functions: Act as wrappers for existing functions or libraries, enhancing modularity and reusability.
10 changes: 9 additions & 1 deletion docs/get_started/contributing.md
@@ -1,3 +1,11 @@
# Guidelines for contributing to the project

We welcome contributions to the Video Agents project from developers, researchers, and enthusiasts interested in video processing, AI, and related fields. This document outlines the guidelines for contributing to the project, including the process for submitting issues, feature requests, and pull requests.
We welcome contributions to the Director from developers, researchers, and enthusiasts interested in video processing, AI, and related fields. This document outlines the guidelines for contributing to the project, including the process for submitting issues, feature requests, and pull requests.

Any contributions you make are **greatly appreciated**. Here's the process:

1. Fork the Project
2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the Branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
14 changes: 12 additions & 2 deletions docs/index.md
@@ -1,3 +1,13 @@
# Welcome to Director
# Welcome to The Director

The Director project is an advanced video processing and analysis platform that utilizes a range of AI agents and language models to handle diverse video management needs and tasks. It features a modular architecture that supports easy expansion and integration of new functionalities. Core components include specialized agents for distinct processing tasks, multiple language models for natural language processing, and a flexible database interface for data storage and retrieval. The project emphasizes ease of installation and setup through a streamlined Makefile, catering to developers looking to deploy or extend its capabilities efficiently.
**The Director** is an innovative, AI-driven video processing and analysis platform designed to revolutionize how you interact with video content. Built on the powerful infrastructure of **VideoDB**, The Director leverages a suite of specialized AI agents and advanced language models to manage and execute a wide range of video-related tasks seamlessly.

With its **modular architecture**, The Director makes it easy to expand and integrate new functionalities, adapting effortlessly to diverse use cases. Key features include:

- **Intelligent Agents:** Purpose-built agents handle tasks such as video upload, summarization, chapter creation, search, dubbing, dynamic editing, branding, and publishing.
- **Language Model Integration:** Advanced natural language processing capabilities enable intuitive interaction through chat-based workflows.
- **Flexible Database Interface:** Efficient storage, retrieval, and indexing for video content ensure seamless data management.

To simplify adoption, The Director provides a streamlined setup process using a **Makefile**, enabling developers to deploy or customize the platform effortlessly.

Whether you're building AI-powered video editors, automating video workflows, or exploring new applications of video intelligence, The Director empowers developers to push the boundaries of what's possible with video.
4 changes: 0 additions & 4 deletions docs/tools/interface.md

This file was deleted.
