
uLipSync 3.0.2 not working with WebGL in Unity 2022.3 #57

Open
horeatrinca opened this issue Nov 6, 2023 · 12 comments

Comments

@horeatrinca

Hi! I'm trying to use the plugin in a WebGL build, but the lips are not moving.

Test procedure:

  1. Start a brand new project, install the 3.0.2 unity package.
  2. Comment out anything microphone related, so the build can succeed.
  3. Do a WebGL build of the scene called "01-1. Play Audio Clip".

Result:
Lips are not moving. No errors are shown in the browser console, apart from a warning:

An AudioContext was prevented from starting automatically. It must be created or resumed after a user gesture on the page

Tested on the latest Firefox and Chrome browsers (Windows 10).

Notes:

  • Lips move correctly in the Editor and in Windows standalone builds.
  • From what I've found, the C# Job System should be supported on WebGL; it's just that the worker count is set to 0, so everything runs on the main thread.
  • I tried setting the max job count to 0 in the Unity Editor (by setting JobsUtility.JobWorkerCount = 0; in uLipSync.cs's Awake() method) to see if that makes any difference, but the Editor still works fine with that setting.

Has anyone solved this issue?

@horeatrinca
Author

horeatrinca commented Nov 6, 2023

Actually, I found something very useful on the Hecomi website! Thank you!
https://tips.hecomi.com/entry/2023/05/30/215345

It looks like the plugin doesn't work on WebGL because OnAudioFilterRead() is not supported on this platform.
A workaround is provided here:
https://github.com/uezo/uLipSyncWebGL
(Thanks to uezo)

Over the next few days I will be trying an alternative method. Here is what I have so far, in case someone finds it useful in the future. In Update() I have added:

    void Update()
    {
        if (!profile) return;
        if (!_jobHandle.IsCompleted) return;

#if UNITY_WEBGL
        ReadAudioDataWebGL();
#endif

and then:

private void ReadAudioDataWebGL()
{
    var currentTimeSamples = audioSource.timeSamples;
    var sampleCount = (currentTimeSamples + audioSource.clip.samples - _lastTimeSamples) % audioSource.clip.samples;
    if (sampleCount == 0) return;

    var readSamples = new float[sampleCount];
    audioSource.clip.GetData(readSamples, _lastTimeSamples);

    _lastTimeSamples = currentTimeSamples % audioSource.clip.samples;
    OnDataReceived(readSamples, audioSource.clip.channels);
}

There are some issues with this so far:

  1. On WebGL it gets out of sync if the browser blocks audio autoplay.
  2. We introduce a delay of roughly 16 ms (about one frame at 60 FPS).
  3. I don't think this works well for stereo audio clips.

A benefit over uezo's method is that we don't go through JavaScript to sample the audio data, but due to issue 1 this might not be avoidable, especially for looping clips.
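The wraparound arithmetic in ReadAudioDataWebGL() can be sanity-checked in isolation: the number of samples played since the last read is a modular difference on the clip's ring of sample positions. A minimal sketch in plain JavaScript (the helper name is made up for illustration):

```javascript
// Counts how many samples have played since the last read, handling the
// wrap that occurs when a looping clip restarts from sample 0.
// Mirrors (current + clip.samples - last) % clip.samples from the C# above.
function sampleCountSince(currentTimeSamples, lastTimeSamples, clipSamples) {
  return (currentTimeSamples + clipSamples - lastTimeSamples) % clipSamples;
}

// No wrap: playback moved from sample 100 to sample 356 in a 1000-sample clip.
console.log(sampleCountSince(356, 100, 1000)); // 256

// Wrap: a looping clip passed its end, going from sample 900 back around to 56.
console.log(sampleCountSince(56, 900, 1000)); // 156
```

Note the edge case: if exactly one full loop elapses between reads, the modular difference is 0 and those samples are silently skipped, which is one way the drift described in issue 1 can accumulate.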

@uezo
Contributor

uezo commented Nov 29, 2023

Hi @horeatrinca ,

My solution (workaround😅) with uLipSyncWebGL:

Step 1: Solve the compilation errors around Microphone

Add preprocessor directives for conditional compilation to MicUtil and uLipSyncMicrophone.

MicUtil

using UnityEngine;
using System.Collections.Generic;

namespace uLipSync
{
    public struct MicDevice
    {
        public string name;
        public int index;
        public int minFreq;
        public int maxFreq;
    }

    public static class MicUtil
    {
        public static List<MicDevice> GetDeviceList()
        {
            var list = new List<MicDevice>();

#if !UNITY_WEBGL || UNITY_EDITOR
            for (int i = 0; i < Microphone.devices.Length; ++i)
            {
                var info = new MicDevice
                {
                    name = Microphone.devices[i],
                    index = i
                };
                Microphone.GetDeviceCaps(info.name, out info.minFreq, out info.maxFreq);
                list.Add(info);
            }
#endif
            return list;
        }
    }
}

uLipSyncMicrophone

using UnityEngine;

namespace uLipSync
{

    [RequireComponent(typeof(AudioSource))]
    public class uLipSyncMicrophone : MonoBehaviour
    {
#if !UNITY_WEBGL || UNITY_EDITOR
        // ... existing implementation, unchanged ...
#endif
    }
}

Step 2: Overwrite uLipSyncWebGL.jslib

Replace the entire contents of uLipSyncWebGL.jslib with the following.

NOTE: This uses some deprecated WebAudio APIs. Please let me know if you improve it!

mergeInto(LibraryManager.library, {
    InitWebGLuLipSync: function(targetObjectNamePtr, targetMethodNamePtr) {
        const targetObjectName = UTF8ToString(targetObjectNamePtr);
        const targetMethodName = UTF8ToString(targetMethodNamePtr);

        const outputHookNode = WEBAudio.audioContext.createScriptProcessor();
        outputHookNode.onaudioprocess = function (event) {
            SendMessage(targetObjectName, targetMethodName, event.inputBuffer.getChannelData(0).join(','));
        };

        const connectAudioNodes = function (audioInstance) {
            if (audioInstance != null && audioInstance.hasOwnProperty("gain")) {
                // connect gain -> outputHookNode
                audioInstance.gain.connect(outputHookNode);
                // connect outputHookNode -> dest (dummy: no output data will go to dest)
                outputHookNode.connect(WEBAudio.audioContext.destination);
                console.log("Connected audio nodes successfully");
                return true;
            } else {
                return false;
            }
        };

        const jobId = setInterval(function() {
            for (var key in WEBAudio.audioInstances) {
                if (connectAudioNodes(WEBAudio.audioInstances[key])) {
                    // Continuously reconnect gain -> outputHookNode (they will be disconnected silently, I don't know why...)
                    setInterval(function() {
                        WEBAudio.audioInstances[key].gain.connect(outputHookNode);
                    }, 200);
                    clearInterval(jobId);
                    break;
                }
            }
        }, 200);
    },
});

It works on Chrome and Safari on macOS.

@SameelNawaz

Thanks @uezo, I have tried the above solution and it works for me on WebGL. I am using Unity 2022.3.13f1 :)

@hecomi
Owner

hecomi commented Dec 9, 2023

Hello everyone involved in this discussion,

I wanted to provide an update regarding the WebGL compatibility issues with uLipSync that have been reported. First, I appreciate the detailed feedback and the information shared here. They are invaluable for navigating these challenges.

Currently, I am actively working on finding a good solution for WebGL support. I'd like to both support various use cases and keep the code maintainable. I wrote an article about my research here:

Based on that research, I'm working on a GetData()-based approach, as @horeatrinca introduced, and it seems to work well so far.

#if UNITY_WEBGL && !UNITY_EDITOR
void UpdateWebGL()
{
    if (!_audioSource) return;

    var clip = _audioSource.clip;
    if (!clip || clip.loadState != AudioDataLoadState.Loaded) return;

    int ch = clip.channels;
    int n = inputSampleCount * ch;
    if (_audioBuffer == null || _audioBuffer.Length != n)
    {
        _audioBuffer = new float[n];
    }

    int offset = _audioSource.timeSamples;
    offset = math.min(clip.samples - n - 1, offset);
    clip.GetData(_audioBuffer, offset);
    OnDataReceived(_audioBuffer, ch);
}
#endif
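The `math.min(clip.samples - n - 1, offset)` line above clamps the read position so that a window of n samples never runs past the end of the clip. The same idea as a standalone sketch (plain JavaScript, hypothetical helper name):

```javascript
// Clamp a read offset so that reading n samples starting at offset
// stays inside a clip of clipSamples total samples.
// Mirrors the math.min(clip.samples - n - 1, offset) clamp above.
function clampReadOffset(offset, clipSamples, n) {
  return Math.min(clipSamples - n - 1, offset);
}

// Near the end of a 44100-sample clip with a 1024-sample analysis window,
// an offset of 44000 is pulled back so the window still fits.
console.log(clampReadOffset(44000, 44100, 1024)); // 43075

// Offsets well inside the clip pass through untouched.
console.log(clampReadOffset(100, 44100, 1024)); // 100
```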

I will check more use cases and, at the same time, will try to support the microphone input.

Thank you for your continued support.

@hecomi
Owner

hecomi commented Jan 2, 2024

Happy New Year to everyone.

At the end of last month, I released version v3.1.0, which includes partial support for WebGL.

I have resolved several issues that were arising with the AudioClip.GetData() method, so if you are interested in the solution, please refer to the following blog article for more details:

I will now be moving on to work on microphone support.

@HunterProduction

Thank you very much for your contribution with this tool, @hecomi. I'm still having some synchronization issues between audio and lip-sync in WebGL. My application generates speech and plays it at runtime, and the lip-sync should follow along to animate the speech (so in this case the lip-sync data cannot be baked).

The problem I'm facing is that even with AutoAudioSyncOnWebGL turned on, depending on the length and structure of the sentence, the clip often stops while the lip-sync is still a bit behind, so when the AudioClip ends the last blendshape values don't make it back to the rest position (silence) in time.

I thought about looping a "silent" AudioClip whenever my avatar doesn't talk, but that seems like a not very clean workaround, so first I would like to understand whether I'm getting something wrong in the tool setup, or whether there is something that could easily be improved (maybe a temporary workaround could be a method in the API that forces the blendshapes to reset?).

@SameelNawaz

I am having the same issue. @HunterProduction, have you found any solution?

@hecomi
Owner

hecomi commented Mar 30, 2024

I apologize for the delayed response!

Lip Sync Delay Issue

To address the synchronization issue in WebGL, you can use the Audio Sync Offset Time setting to adjust the timing.

This feature allows you to set an offset for the buffer index, which can help align the audio with the lipsync. Generally, setting this offset to around 0.1 seconds tends to yield satisfactory results.
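For reference, an offset expressed in seconds corresponds to a buffer-index offset of (seconds × sample rate); this sketch shows only that conversion, not uLipSync's actual internal calculation:

```javascript
// Convert an Audio Sync Offset Time in seconds to a whole-sample offset,
// given the clip's sample rate in Hz. (Illustrative helper; the name and
// rounding choice are assumptions, not uLipSync API.)
function offsetTimeToSamples(offsetSeconds, sampleRate) {
  return Math.round(offsetSeconds * sampleRate);
}

// The suggested ~0.1 s offset on a 44.1 kHz clip:
console.log(offsetTimeToSamples(0.1, 44100)); // 4410
```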

[Screenshot: the Audio Sync Offset Time field in the uLipSync inspector, 2024-03-30]

In WebGL builds, the process is: fetch the currently playing AudioClip, obtain the current playback position (AudioSource.timeSamples), retrieve the buffer for the analysis segment with AudioClip.GetData(), run that buffer through the analysis algorithm synchronously, and return the results via a callback. The timeSamples value used here strongly influences the lip-sync timing, and fortunately, unlike the MonoBehaviour.OnAudioFilterRead() path used in non-WebGL builds, you can set an offset on it manually. Auto Audio Sync On WebGL, by contrast, is an option that fixes a different issue that occurs only on WebGL.

In WebGL, due to the browser's Autoplay Policy, audio will not play until the user interacts with the page (for example, by clicking). However, since Unity is still "playing" the sound internally, audio that was supposed to start from the beginning ends up desynchronized. Enabling Auto Audio Sync On WebGL corrects this discrepancy as soon as user interaction occurs.
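The browser-side half of this is the standard autoplay-policy pattern: resume the suspended AudioContext on the first user gesture. A generic sketch, not part of uLipSync (the helper name and the listener-injection style are illustrative; injecting the listener keeps the logic testable outside a browser):

```javascript
// Resume a suspended AudioContext-like object on the first user gesture.
// ctx:         an object with a .state string and a .resume() method.
// addListener: registers a one-shot gesture handler; in a real page this
//              would be fn => document.addEventListener('click', fn, { once: true }).
function resumeOnFirstGesture(ctx, addListener) {
  addListener(function () {
    if (ctx.state === 'suspended') {
      ctx.resume();
    }
  });
}
```

In a browser you would call it as `resumeOnFirstGesture(audioCtx, fn => document.addEventListener('click', fn, { once: true }));` so the context resumes on the first click and the listener removes itself.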

Issue with Mouth Remaining Open After AudioClip Playback in WebGL Build

This was a bug, and I have fixed it in the commits above. In the WebGL version, analysis runs only while the AudioSource is playing; once playback stops, the analysis ceases and the last set of analysis results kept being used. The fix introduces a flag that checks whether an analysis ran in the current frame; if not, the volume is set to zero and processing is skipped.

I will soon release a new version that includes these fixes.

@hecomi
Owner

hecomi commented Mar 30, 2024

@Yichiron

Yichiron commented Aug 8, 2024

Thank you for developing an excellent lip-sync tool. Similar to previous reports, I am experiencing an issue where audio generated by an external server (VOICEVOX) stops partway when fetched with UnityWebRequest in a WebGL app. During development in the Unity Editor everything works perfectly, and it also works fine with a static AudioClip after building to WebGL. Could you please suggest a solution? Below is a similar code example for fetching and playing the audio.

[YouTube video link: https://www.youtube.com/watch?v=nEfc5X_4FRo]

Thank you.

This is the code for requesting the AudioClip data.

    string jsonString = JsonUtility.ToJson(synthesisRequest);
    Debug.Log("JSON Payload: " + jsonString); // added for debugging
    logtext.text += "JSON Payload: " + jsonString + "\n";

    // Create the HTTP request
    UnityWebRequest request = new UnityWebRequest(apiUrl, "POST");
    byte[] bodyRaw = System.Text.Encoding.UTF8.GetBytes(jsonString);
    request.uploadHandler = new UploadHandlerRaw(bodyRaw);
    request.downloadHandler = new DownloadHandlerBuffer();
    request.SetRequestHeader("Content-Type", "application/json");

    // Send the request
    yield return request.SendWebRequest();

    if (request.result != UnityWebRequest.Result.Success)
    {
        Debug.LogError("Error: " + request.error);
        Debug.LogError("Response Code: " + request.responseCode); // log the response code
        Debug.LogError("Response: " + request.downloadHandler.text); // log the response body
    }
    else
    {
        // Get the audio data and play it
        byte[] audioData = request.downloadHandler.data;
        AudioClip audioClip = WavUtility.ToAudioClip(audioData, 0, "SynthesisClip");
        audioSource.clip = audioClip;
        audioSource.Play();
    }

@uezo
Contributor

uezo commented Aug 8, 2024

Hi @Yichiron ,
First of all, I think it would be better to consider this issue separately from LipSync.

As a way to isolate the issue, try using a different Text-to-Speech service (such as Google) to see if it works correctly. VOICEVOX likely requires CORS (Cross-Origin Resource Sharing) settings.

@Yichiron

Yichiron commented Aug 9, 2024

Thank you, uezo, for the advice to try Google TTS! I've already resolved the CORS issue with VOICEVOX using relay software hosted on Heroku. If it's just about playing the audio, everything works fine even on WebGL. However, I'm having trouble with the lip-sync stopping midway: even though the audio plays, the lip-sync seems to speed up and then stop. I tried Google TTS as well, but the result was the same as with VOICEVOX: the lip-sync moves at what looks like double speed and then stops midway...
