We created a project that transforms audio into blendshape weights and drives the digital human, xiaomei, in a UE project.
The framework we use contains three parts. In the formant network, we perform fixed-function analysis of the input audio clip. In the articulation network, we concatenate an emotional state vector to the output of each convolution layer after the ReLU activation. The fully connected layers at the end expand the 256+E abstract features into blendshape weights, where E is the length of the emotional state vector.
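The concatenation and the final expansion described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the project's network code: `E`, `NUM_BLENDSHAPES`, the random weights, and the helper names `concat_emotion` and `fully_connected` are all hypothetical, and a single dense layer stands in for the fully connected layers.

```python
import random

E = 16               # hypothetical length of the emotional state vector
NUM_BLENDSHAPES = 8  # hypothetical number of output blendshape weights


def relu(values):
    """Element-wise ReLU on a list of activations."""
    return [max(0.0, v) for v in values]


def concat_emotion(conv_output, emotion):
    """Apply ReLU to each time step, then append the same emotion vector."""
    return [relu(step) + list(emotion) for step in conv_output]


def fully_connected(features, weights, bias):
    """Plain dense layer: one output value per weight row."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


rng = random.Random(0)
# Stand-in for the last convolution output: 1 time step, 256 channels.
conv_output = [[rng.uniform(-1.0, 1.0) for _ in range(256)]]
emotion = [rng.uniform(-1.0, 1.0) for _ in range(E)]

features = concat_emotion(conv_output, emotion)[0]  # 256 + E values
weights = [[rng.uniform(-0.1, 0.1) for _ in range(256 + E)]
           for _ in range(NUM_BLENDSHAPES)]
bias = [0.0] * NUM_BLENDSHAPES
blendshape_weights = fully_connected(features, weights, bias)

print(len(features), len(blendshape_weights))
```

The same `concat_emotion` call would be repeated after every convolution layer, so each layer sees the emotional state alongside the audio features.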
This pipeline shows how we use FACEGOOD Audio2Face.
- step1: Record voice and video, and create animation from the video in Maya. Note: the voice must contain vowels, exaggerated talking, and normal talking. The dialogue should cover as many pronunciations as possible.
- step2: Process the voice with LPC to split it into segment frames corresponding to the animation frames in Maya.
Use ExportBsWeights.py to export the weights file from Maya. This produces BS_name.npy and BS_value.npy.
Use step1_LPC.py to process the wav file into lpc_*.npy. This preprocesses the wav into 2D data.
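To make the LPC step more concrete, here is a minimal, dependency-free sketch of the general idea: cut the audio into one window per animation frame, then compute LPC coefficients for each window with the autocorrelation method and the Levinson-Durbin recursion. It is not the code in step1_LPC.py. The sample rate, frame rate, window size, LPC order, and function names are assumptions chosen for illustration, and the input is a synthetic tone rather than a real recording.

```python
import math
import random


def split_into_frames(samples, sample_rate, fps, window):
    """Return one window of audio per animation frame (zero-padded at the end)."""
    num_frames = len(samples) * fps // sample_rate
    frames = []
    for i in range(num_frames):
        start = i * sample_rate // fps  # first audio sample of animation frame i
        chunk = list(samples[start:start + window])
        chunk += [0.0] * (window - len(chunk))
        frames.append(chunk)
    return frames


def lpc(frame, order):
    """LPC coefficients a[1..order], so that x[n] ~ sum_k a[k] * x[n - k]."""
    n = len(frame)
    # Autocorrelation for lags 0..order.
    r = [sum(frame[i] * frame[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]
    a = [0.0] * (order + 1)
    err = r[0]
    if err == 0.0:  # silent frame: nothing to predict
        return a[1:]
    for i in range(1, order + 1):  # Levinson-Durbin recursion
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
        if err <= 0.0:  # numerical safety guard
            break
    return a[1:]


# Assumed values, for illustration only.
SAMPLE_RATE, FPS, WINDOW, ORDER = 16000, 30, 512, 8

rng = random.Random(0)
audio = [math.sin(2 * math.pi * 220 * n / SAMPLE_RATE) + 0.05 * rng.uniform(-1, 1)
         for n in range(SAMPLE_RATE)]  # one second of a noisy synthetic tone

frames = split_into_frames(audio, SAMPLE_RATE, FPS, WINDOW)
lpc_features = [lpc(f, ORDER) for f in frames]  # 2D data: frames x coefficients
print(len(lpc_features), len(lpc_features[0]))
```

Each row of `lpc_features` lines up with one animation frame, which is what allows the audio features to be paired with the exported blendshape values during training.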
We recommend using FACEGOOD Avatary to produce training data. It is fast and accurate. http://www.avatary.com
The training data is stored in dataSet1.
python step14_train.py --epochs 8 --dataSet dataSet1
In the /test folder, we supply a test application named AiSpeech.
We provide a pretrained model, zsmeif.pb.
In the /example/ueExample folder, we provide a packaged UE project that contains a digital human created by FACEGOOD, which can be driven by /AiSpeech/zsmeif.py.
You can follow the steps below to use it:
- Make sure a microphone is connected to the computer.
- Run the script in a terminal.
python zsmeif.py
- When the terminal shows the message "run main", run FaceGoodLiveLink.exe, which is located in the /example/ueExample/ folder.
- Click and hold the left mouse button on the screen in the UE project. You can then talk with the AI model and wait for the voice and animation response.
tensorflow-gpu 1.15
python-libs: pyaudio requests websocket websocket-client
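Before running the test application, it can help to confirm that these libraries are importable. The sketch below is a suggestion rather than part of the project: the `REQUIRED` mapping and the `missing_packages` helper are hypothetical, the import names are the ones these packages are normally imported under, and the check does not verify the tensorflow-gpu 1.15 version pin.

```python
import importlib.util

# pip package name -> module name it is normally imported as
REQUIRED = {
    "tensorflow-gpu": "tensorflow",
    "pyaudio": "pyaudio",
    "requests": "requests",
    "websocket": "websocket",
    "websocket-client": "websocket",
}


def missing_packages(required):
    """Return the pip names whose import module cannot be found."""
    return [package for package, module in required.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```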
The testing data, Maya model, and UE4 test project can be downloaded from the link below.
data_all code : n6ty
Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion
Wechat: FACEGOOD_CHINA
Email: support@facegood.cc
Discord: https://discord.gg/V46y6uTdw8
Audio2Face Core is released under the terms of the MIT license. See COPYING for more information, or see https://opensource.org/licenses/MIT.