Add lip sync pipeline [50 LPT] #35

Closed
rickstaa opened this issue Jul 12, 2024 · 3 comments
Labels
AI AI SPE bounties bounty Software bounties.

Comments

@rickstaa
Collaborator

rickstaa commented Jul 12, 2024

As outlined in our treasury proposal during phase 2 of the AI SPE roadmap, we are collaborating closely with seven startups. These startups, who share our dedication to decentralized AI, serve as design partners, providing valuable feedback to enhance the AI subnet user experience and prepare it for onboarding more mature scale-ups and applications.

One of these startups aims to enable users to bring their static artistic photos to life using AI. To support this, the AI SPE team is working on integrating a new LipSync pipeline into the AI subnet. This pipeline will allow users to provide audio that will be lip-synced to their images.

We are calling on the community to lead the first step of this integration by researching this new pipeline and creating a proof of concept (POC) in the AI worker 🔧. Once this stage is complete, we can begin integrating the pipeline into the go-livepeer subnet and subsequently implement a text-to-speech pipeline to provide the full experience the startup is seeking 🚀.

Bounty Requirements

To successfully complete this bounty, the participant should:

  • Create a brief report comparing several LipSync models and provide a recommendation on which model to implement first, including a rationale for the choice. The report should be concise, demonstrating a well-informed decision.
  • Implement a working /lipsync route and pipeline in the AI worker repository, making this capability available on port 8005. This pipeline should accept an audio file and a photo, and provide the user with a talking avatar. While we may expand to support video files in the future, we will start with photos to create talking heads.
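To make the expected request shape concrete, here is a minimal client-side sketch that builds (but does not send) a multipart POST carrying an audio file and a photo to the /lipsync route on port 8005. The form field names `audio` and `image`, the file names, the MIME types, and the `localhost` base URL are all assumptions for illustration; the actual names are up to the implementer.

```python
import uuid
import urllib.request


def build_lipsync_request(
    audio: bytes,
    image: bytes,
    base_url: str = "http://localhost:8005",  # assumed local AI worker address
) -> urllib.request.Request:
    """Build a multipart/form-data POST for the /lipsync route without sending it.

    The field names "audio" and "image" are hypothetical, not confirmed by the bounty.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for field, filename, mime, payload in (
        ("audio", "speech.wav", "audio/wav", audio),
        ("image", "portrait.png", "image/png", image),
    ):
        # Each part has its own headers, a blank line, then the raw file bytes.
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        )
        parts.append(header.encode() + payload + b"\r\n")
    # The closing boundary marks the end of the multipart body.
    body = b"".join(parts) + f"--{boundary}--\r\n".encode()
    return urllib.request.Request(
        f"{base_url}/lipsync",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request once a worker with the new route is running; the response format (for example, frames or a video URL) depends on the chosen implementation.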

This bounty does NOT include:

  • The full end-to-end implementation of this pipeline on the go-livepeer side, including the payment logic and job routing. This will be tackled either by the AI SPE team or in a subsequent bounty.

Required skillset

Bounty Level: Intermediate

  • Experience in understanding and interpreting generative AI research papers.
  • Proficiency in implementing generative AI models in Python using pre-trained weights.
  • Familiarity with FastAPI.
  • Strong Python programming skills.

Implementation Tips

This section offers some tips to get you started. Since this bounty is more involved, you will also have direct access to the engineering team to ask questions about unclear code or implementation decisions.

Lipsync Pipeline example

You can see the proposed lipsync (or talking heads) pipeline in action by going to:

Example Lipsync Models

A quick search provided us with the following example LipSync models that could be used:

However, feel free to suggest any LipSync model you deem fit.

Tips for creating the AI Worker Pipeline

To understand how to create a new AI worker pipeline, you can refer to recent pull requests where new pipelines were added:

The steps to add a new pipeline on the AI worker side are relatively straightforward:

  1. Add the model weights download command for your chosen model to the dl_models.sh file.
  2. If needed, add the requirements for running model inference to the AI runner Dockerfile.
  3. Copy the image-to-video route, and implement the required request and response types. Since the LipSync pipeline is also expected to return frames that can be transcoded into an MP4, much of the image-to-video code can be reused.
  4. Copy the image-to-video pipeline, and replace the logic in the __init__ method to load the chosen LipSync model on the GPU.
  5. Replace the logic in the __call__ method with the actual LipSync inference logic.
  6. Ensure your new lipsync route is attached to the FastAPI server in the main.py file.
  7. If everything was done correctly, you can start the FastAPI server, specify the new pipeline and model, and interact with your new pipeline at the /docs path on port 8005.
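As a rough illustration of steps 4 and 5, the sketch below shows the general shape of a pipeline class that loads a model once in `__init__` and runs inference in `__call__`. It is not the ai-worker's actual base class or the image-to-video code: `StubLipSyncModel`, `LipSyncPipeline`, the `load_model` parameter, and the use of raw bytes for audio, image, and frames are all hypothetical stand-ins so the control flow can run without weights or a GPU.

```python
from typing import Callable, List, Optional


class StubLipSyncModel:
    """Hypothetical placeholder for a real LipSync model.

    It returns one copy of the input image per audio chunk, which is enough
    to exercise the pipeline's control flow.
    """

    def __init__(self, chunk_size: int = 4) -> None:
        self.chunk_size = chunk_size

    def infer(self, audio: bytes, image: bytes) -> List[bytes]:
        n_frames = -(-len(audio) // self.chunk_size)  # ceiling division
        return [image] * n_frames


class LipSyncPipeline:
    """Sketch of the __init__/__call__ split described in steps 4 and 5."""

    def __init__(
        self,
        model_id: str,
        load_model: Optional[Callable[[str], StubLipSyncModel]] = None,
    ) -> None:
        self.model_id = model_id
        # Step 4: load the chosen model once, at start-up. A real pipeline
        # would load pre-trained weights onto the GPU here.
        loader = load_model or (lambda _model_id: StubLipSyncModel())
        self.model = loader(model_id)

    def __call__(self, audio: bytes, image: bytes) -> List[bytes]:
        # Step 5: run inference per request and return frames that the
        # route can pass on to the MP4 transcoding step.
        if not audio:
            raise ValueError("audio must not be empty")
        if not image:
            raise ValueError("image must not be empty")
        return self.model.infer(audio, image)
```

For example, `LipSyncPipeline("example/lipsync-model")(audio=b"12345678", image=b"IMG")` yields two placeholder frames with the default chunk size of 4. Keeping model loading in `__init__` means the expensive setup happens once per worker rather than once per request.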

The ai-worker repository contains a development guide that can help you get started debugging your changes.

How to Apply

  1. Express Your Interest: Comment on this issue to indicate your interest and explain why you're the ideal candidate for the task.
  2. Wait for Review: Our team will review expressions of interest and select the best candidate.
  3. Get Assigned: If selected, we'll assign the GitHub issue to you.
  4. Start Working: Dive into your task! If you need assistance or guidance, comment on the issue or join the discussions in the #🛋│developer-lounge channel on our Discord server.
  5. Submit Your Work: Create a pull request in the relevant repository and request a review.
  6. Notify Us: Comment on this GitHub issue when your pull request is ready for review.
  7. Receive Your Bounty: We'll arrange the bounty payment once your pull request is approved.
  8. Gain Recognition: Your valuable contributions will be showcased in our project's changelog.

Thank you for your interest in contributing to our project 💛!

Warning

Please wait for the issue to be assigned to you before starting work. To prevent duplication of effort, submissions for unassigned issues will not be accepted.

@rickstaa rickstaa added the AI AI SPE bounties label Jul 12, 2024
@rickstaa
Collaborator Author

This bounty was assigned to @pschroedl and was completed last week. It appears the bounty was not initially posted on this repository, so I have posted it retroactively for visibility.

@rickstaa
Collaborator Author

Submission is found here and will be reviewed in the coming week.

@rickstaa rickstaa added the bounty Software bounties. label Jul 12, 2024
@rickstaa
Collaborator Author

This was implemented in livepeer/ai-worker#120 and has already been paid out on chain 🎉. All bounty transactions can be found on the AI SPE wallet.
