As outlined in our treasury proposal during phase 2 of the AI SPE roadmap, we are collaborating closely with seven startups. These startups, which share our dedication to decentralized AI, serve as design partners, providing valuable feedback to enhance the AI subnet user experience and prepare it for onboarding more mature scale-ups and applications.
One of these startups aims to enable users to bring their static artistic photos to life using AI. To support this, the AI SPE team is working on integrating a new LipSync pipeline into the AI subnet. This pipeline will allow users to provide audio that will be lip-synced to their images.
We are calling on the community to lead the first step of this integration by researching this new pipeline and creating a proof of concept (POC) in the AI worker 🔧. Once this stage is complete, we can begin integrating the pipeline into the go-livepeer subnet and subsequently implement a text-to-speech pipeline to provide the full experience the startup is seeking 🚀.
Bounty Requirements
To successfully complete this bounty, the participant should:
Create a brief report comparing several LipSync models and provide a recommendation on which model to implement first, including a rationale for the choice. The report should be concise, demonstrating a well-informed decision.
Implement a working /lipsync route and pipeline in the AI worker repository, making this capability available on port 8005. This pipeline should accept an audio file and a photo, and provide the user with a talking avatar. While we may expand to support video files in the future, we will start with photos to create talking heads (a hypothetical example request is sketched after this section).
This bounty does NOT include:
The full end-to-end implementation of this pipeline on the go-livepeer side, including the payment logic and job routing. This will be tackled either by the AI SPE team or in a subsequent bounty.
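To make the second requirement concrete, here is a minimal client-side sketch of what calling the new route could look like. The multipart field names ("audio", "image"), the model_id form field, and the assumption that the route responds with raw MP4 bytes are all illustrative guesses; the actual request and response contract is for the POC author to define.

```python
# Hypothetical client call to the proposed /lipsync route on port 8005.
# Field names and the response format are assumptions, not the final API.
import requests

with open("speech.wav", "rb") as audio, open("portrait.png", "rb") as image:
    resp = requests.post(
        "http://localhost:8005/lipsync",
        files={"audio": audio, "image": image},
        data={"model_id": "<your-chosen-lipsync-model>"},
        timeout=600,  # generation can take a while on a single GPU
    )
resp.raise_for_status()

# Assuming the route returns the finished MP4 directly; it could just as
# well return JSON with frame data or a URL, depending on the design.
with open("talking_head.mp4", "wb") as out:
    out.write(resp.content)
```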
Required skillset
Bounty Level: Intermediate
Experience in understanding and interpreting generative AI research papers.
Proficiency in implementing generative AI models in Python using pre-trained weights.
Implementation Tips
In this section, you will find some tips to get you started. Since this bounty is more involved, you will also have direct access to the engineering team to ask questions about unclear code or implementation decisions.
Lipsync Pipeline example
You can see the proposed lipsync (or talking heads) pipeline in action by going to:
Example Lipsync Models
A quick search provided us with the following example LipSync models that could be used:
However, feel free to suggest any LipSync model you deem fit.
Tips for creating the AI Worker Pipeline
To understand how to create a new AI worker pipeline, you can refer to recent pull requests where new pipelines were added:
The steps to add a new pipeline on the AI worker side are relatively straightforward:
Add the model weights download command for your chosen model to the dl_models.sh file.
If needed, add the requirements for running the model inference to the AI runner Dockerfile.
Copy the image-to-video route, and implement the required request and response types. Since the LipSync pipeline is also expected to return frames that can be transcoded into an MP4, much of the image-to-video code can be reused.
Copy the image-to-video pipeline, and replace the logic in the __init__ method to load the chosen LipSync model on the GPU.
Replace the logic in the __call__ method with the actual LipSync inference logic.
Ensure your new lipsync route is attached to the FastAPI server in the main.py file.
If everything was done correctly, you can start the FastAPI server, specify the new pipeline and model, and interact with your new pipeline at the /docs path on port 8005. The ai-worker repository contains a development guide that can help you get started debugging your changes.
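To tie the steps above together, here is a minimal sketch of what the new pipeline class and route could look like, assuming the same structure as the image-to-video pair. LipSyncPipeline, load_lipsync_model, and the way the pipeline instance reaches the route are placeholders; the real loading and inference code depends on the model recommended in your report.

```python
# A minimal sketch, assuming the new code mirrors the image-to-video
# pipeline/route pair in ai-worker. load_lipsync_model and the `pipeline`
# wiring are hypothetical placeholders, not the actual ai-worker API.
import tempfile

import torch
from fastapi import APIRouter, File, UploadFile

router = APIRouter()


class LipSyncPipeline:
    """Sketch of a pipeline class mirroring the image-to-video pipeline."""

    def __init__(self, model_id: str):
        # Load the chosen LipSync model (weights fetched by dl_models.sh)
        # onto the GPU. load_lipsync_model stands in for whatever loading
        # code your chosen model ships with.
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.model = load_lipsync_model(model_id).to(self.device)

    def __call__(self, audio_path: str, image_path: str):
        # Run the actual LipSync inference and return the generated frames
        # so they can be transcoded into an MP4, as image-to-video does.
        with torch.inference_mode():
            return self.model.generate(audio=audio_path, image=image_path)


@router.post("/lipsync")
async def lipsync(
    audio: UploadFile = File(...),
    image: UploadFile = File(...),
):
    # Accept an audio file and a photo, much like the copied
    # image-to-video route accepts its inputs.
    with tempfile.NamedTemporaryFile(suffix=".wav") as a, \
            tempfile.NamedTemporaryFile(suffix=".png") as i:
        a.write(await audio.read())
        i.write(await image.read())
        a.flush()
        i.flush()
        # In ai-worker the pipeline instance is created at startup and
        # shared with the route; `pipeline` here stands in for that wiring.
        frames = pipeline(a.name, i.name)
    # The frames would then be encoded to MP4 before being returned.
    return frames
```

Once the router is attached to the FastAPI server in main.py, the interactive docs at the /docs path on port 8005 should list the new endpoint.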
How to Apply
Express Your Interest: Comment on this issue to indicate your interest and explain why you're the ideal candidate for the task.
Wait for Review: Our team will review expressions of interest and select the best candidate.
Get Assigned: If selected, we'll assign the GitHub issue to you.
Start Working: Dive into your task! If you need assistance or guidance, comment on the issue or join the discussions in the #🛋│developer-lounge channel on our Discord server.
Submit Your Work: Create a pull request in the relevant repository and request a review.
Notify Us: Comment on this GitHub issue when your pull request is ready for review.
Receive Your Bounty: We'll arrange the bounty payment once your pull request is approved.
Gain Recognition: Your valuable contributions will be showcased in our project's changelog.
Thank you for your interest in contributing to our project 💛!
Warning
Please wait for the issue to be assigned to you before starting work. To prevent duplication of effort, submissions for unassigned issues will not be accepted.
This bounty was assigned to @pschroedl and was completed last week. It appears the bounty was not initially posted on this repository, so I have posted it retroactively for visibility.