Voice channel audio from bot is choppy on Windows #978

KontraCity · 2023-10-30T11:57:51Z

Git commit reference
3d1318c

Describe the bug
When the bot runs on Windows, audio sent to a voice channel is choppy. It sounds as if the bot is late every time it needs to send the next Opus audio frame. This may be an issue with the std::high_resolution_clock implementation in Microsoft's C++ standard library.
When the exact same code runs on Linux, the audio is crystal clear.
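
The timing hypothesis above can be checked outside the bot. The following is a minimal sketch, assuming that late wake-ups from std::this_thread::sleep_for() are representative of the delay heard in the audio. measure_sleep_overshoot is a hypothetical helper and is not part of D++.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper (not part of D++): returns the average amount by which
// std::this_thread::sleep_for() overshoots the requested duration, measured
// with the monotonic std::chrono::steady_clock.
inline std::chrono::microseconds measure_sleep_overshoot(
    std::chrono::microseconds requested, int iterations)
{
    using clock = std::chrono::steady_clock;

    if (iterations <= 0) {
        return std::chrono::microseconds{0};
    }

    std::chrono::nanoseconds total_overshoot{0};
    for (int i = 0; i < iterations; ++i) {
        const auto start = clock::now();
        std::this_thread::sleep_for(requested);
        const auto elapsed =
            std::chrono::duration_cast<std::chrono::nanoseconds>(clock::now() - start);

        // How much later than requested the thread actually woke up.
        total_overshoot += elapsed - requested;
    }

    return std::chrono::duration_cast<std::chrono::microseconds>(
        total_overshoot / iterations);
}
```

Calling measure_sleep_overshoot(std::chrono::milliseconds(20), 50) on both platforms gives a rough comparison; 20 ms is a common Opus frame duration. A noticeably larger average on Windows would be consistent with the timer explanation, though it would not prove it.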

To Reproduce
Steps to reproduce the behavior:

1. Run the bot on a Windows computer
2. Send some audio with send_audio_opus() or send_audio_raw()
3. Observe the problem

Expected behavior
The bot should play audio on Windows the same way it does on Linux.

System Details:

OS: Microsoft Windows Version 22H2 (OS build 19045.3570)
Desktop Discord Client is used for testing

KontraCity · 2023-10-30T11:58:16Z

The issue was initially discovered here

Mishura4 · 2023-10-31T21:03:57Z

> Run bot on Windows computer

What bot? Do you have code or a sample repository we can try that with? What audio? Do you have a specific file? How do you load it?

Jaskowicz1 · 2023-11-02T11:15:51Z

Audio sent this way via DPP-UE doesn't seem to suffer from this issue (you can view this here). I will look into this to try to find a solution (and a fix if needed).

Jaskowicz1 · 2023-11-02T11:21:39Z

However, do take into account what @Mishura4 said. It really could be a setup issue on your side and we need a test case (a small bot that can replicate the issue). Check that it happens with other files too.

Jaskowicz1 · 2023-11-02T13:53:49Z

Tried to replicate this, but I can't. Please follow what @Mishura4 said!

KontraCity · 2023-11-02T14:43:34Z

After some testing I found that there is almost no stuttering when data is sent via send_audio_raw(), though it is still worse than on Linux. I will post a send_audio_opus() example soon.

KontraCity · 2023-11-02T18:27:23Z

Here is how I am able to produce the problem:

// STL modules
#include <iostream>
#include <string>
#include <fstream>

// Library DPP
#include <dpp/dpp.h>

extern "C" {
    // FFmpeg libraries
    #include <libavformat/avformat.h>
    #include <libavcodec/avcodec.h>
}

/*
*   My setup:
* 
*   Windows:
*       > vcpkg install dpp:x64-windows
*       > vcpkg install ffmpeg:x64-windows
*       > vcpkg integrate project
*       > vcpkg integrate install
*       [compiling in Visual Studio 17]
*
*   Debian Linux:
*       [dpp built from source]
*       $ apt install libavformat-dev
*       $ apt install libavcodec-dev
*       $ g++ main.cpp -std=c++17 -ldpp -lavformat -lavcodec -o main
*/

constexpr const char* BotToken = "<token>";
constexpr const char* AudioFilename = "audio.opus";

int main()
{
    dpp::cluster bot(BotToken);

    bot.on_slashcommand([&bot](const dpp::slashcommand_t& event)
    {
        if (event.command.get_command_name() == "join")
        {
            dpp::guild* guild = dpp::find_guild(event.command.guild_id);

            // find_guild() returns nullptr when the guild is not in the cache
            if (!guild || !guild->connect_member_voice(event.command.get_issuing_user().id))
            {
                event.reply("You're not in voice channel!");
                return;
            }

            event.reply("Joined your channel");
        }
        else if (event.command.get_command_name() == "play")
        {
            dpp::voiceconn* voiceConnection = event.from->get_voice(event.command.guild_id);

            if (!voiceConnection || !voiceConnection->voiceclient || !voiceConnection->voiceclient->is_ready())
            {
                event.reply("I'm not in a voice channel!");
                return;
            }

            AVFormatContext* formatContext = nullptr;
            if (avformat_open_input(&formatContext, AudioFilename, nullptr, nullptr) < 0)
            {
                event.reply("Couldn't open file");
                return;
            }

            if (avformat_find_stream_info(formatContext, nullptr) < 0)
            {
                avformat_close_input(&formatContext);

                event.reply("Couldn't find stream info");
                return;
            }

            int bestStreamId = av_find_best_stream(formatContext, AVMEDIA_TYPE_AUDIO, -1, -1, nullptr, 0);
            if (bestStreamId < 0)
            {
                avformat_close_input(&formatContext);

                event.reply("Couldn't find best stream ID");
                return;
            }

            AVStream* stream = formatContext->streams[bestStreamId];
            const AVCodec* decoder = avcodec_find_decoder(stream->codecpar->codec_id);
            if (!decoder)
            {
                avformat_close_input(&formatContext);

                event.reply("Couldn't find decoder");
                return;
            }

            AVCodecContext* codecContext = avcodec_alloc_context3(decoder);
            if (!codecContext)
            {
                avformat_close_input(&formatContext);

                event.reply("Couldn't allocate codec context");
                return;
            }

            if (avcodec_open2(codecContext, decoder, nullptr) != 0)
            {
                avformat_close_input(&formatContext);
                // avcodec_free_context() also releases the context allocated above
                avcodec_free_context(&codecContext);

                event.reply("Couldn't open decoder context");
                return;
            }

            size_t bytesSent = 0;
            while (true)
            {
                AVPacket packet;
                if (av_read_frame(formatContext, &packet) < 0)
                {
                    av_packet_unref(&packet);
                    break;
                }

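                // Forward only packets that belong to the selected audio stream
                if (packet.stream_index == bestStreamId)
                {
                    voiceConnection->voiceclient->send_audio_opus(packet.data, packet.size);
                    bytesSent += packet.size;
                }
                av_packet_unref(&packet);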
            }

            avformat_close_input(&formatContext);
            // avcodec_free_context() also releases the context allocated above
            avcodec_free_context(&codecContext);

            event.reply(std::to_string(bytesSent) + " bytes sent");
        }
    });

    bot.on_ready([&bot](const dpp::ready_t& event)
    {
        if (dpp::run_once<struct register_bot_commands>())
        {
            dpp::slashcommand joinCommand("join", "Join your voice channel.", bot.me.id);
            dpp::slashcommand playCommand("play", "Play the file.", bot.me.id);

            bot.global_bulk_command_create({ joinCommand, playCommand });
        }
    });

    bot.start(false);
    return 0;
}

FFmpeg libraries produce the same frames on both platforms, so it looks like a DPP problem.
The file to play can be downloaded here: https://opus-codec.org/examples/

braindigitalis · 2023-11-03T00:34:06Z

It appears we may already have a fix for this.

Please read this awfully wordy but detailed description of send_audio_type_t in the discordvoiceclient.

Switching the value of this member variable to overlapped audio will get rid of the stutter. Others working with voice have already run into these issues and created workarounds for them. As you said, they seem to be problems with the accuracy or resolution of some systems' chrono implementations.

braindigitalis · 2023-11-03T00:36:10Z

In short, the important note is at the bottom:

There are some inaccuracies in the throttling method used by the recorded audio mode on some systems (mainly Windows) which causes gaps and stutters in the resulting audio stream. The overlap audio mode provides a different implementation that fixes the issue. This method is slightly more CPU intensive, and should only be used if you encounter issues with recorded audio on your system.
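
The switch described above can be sketched as follows. This is a hedged illustration, assuming the installed D++ version exposes set_send_audio_type() and a nested satype_overlap_audio value on dpp::discord_voice_client. fake_voice_client is a stand-in that mirrors only those two names so the sketch compiles without the library, and use_overlap_audio is a hypothetical helper.

```cpp
// Stand-in for dpp::discord_voice_client so this sketch compiles without D++.
// It mirrors only the enum and setter assumed in the text above.
struct fake_voice_client {
    enum send_audio_type_t {
        satype_recorded_audio,
        satype_live_audio,
        satype_overlap_audio
    };

    // Assumption: recorded audio is the mode in effect before any change.
    send_audio_type_t send_audio_type = satype_recorded_audio;

    fake_voice_client& set_send_audio_type(send_audio_type_t type) {
        send_audio_type = type;
        return *this;
    }
};

// Hypothetical helper: switches any client that exposes the interface above
// to the overlap audio mode.
template <typename VoiceClient>
void use_overlap_audio(VoiceClient& client) {
    client.set_send_audio_type(VoiceClient::satype_overlap_audio);
}
```

In the reproduction bot, the equivalent direct call would be voiceConnection->voiceclient->set_send_audio_type(dpp::discord_voice_client::satype_overlap_audio); placed before the send loop, provided those names match the library version in use.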

KontraCity · 2023-11-03T07:12:47Z

Yes, it fixes the issue. It would be nice if this function were mentioned in the examples for new Windows developers in the future.

Jaskowicz1 · 2023-11-03T07:25:48Z

We could potentially pin this issue 🤔

braindigitalis · 2023-11-03T09:39:35Z

Why don't we just default the audio type to overlapped on Windows? The CPU usage mentioned in the comment is minimal enough that nobody will notice. When they talk about it using more CPU, they mean game-dev levels of CPU usage, like a few extra nanoseconds...
We could just wrap the default in #ifdef _WIN32.
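
A platform-dependent default of this kind could look roughly like the sketch below. It is not the library's code: audio_send_mode, default_audio_send_mode, built_for_windows and voice_settings are hypothetical names chosen for illustration, and only the #ifdef _WIN32 idea comes from the comment above.

```cpp
// Hypothetical stand-in for the library's audio type enum.
enum class audio_send_mode { recorded, live, overlap };

#ifdef _WIN32
// On Windows, default to the overlap mode that avoids the stutter.
constexpr audio_send_mode default_audio_send_mode = audio_send_mode::overlap;
constexpr bool built_for_windows = true;
#else
// Elsewhere, keep the recorded audio mode as the default.
constexpr audio_send_mode default_audio_send_mode = audio_send_mode::recorded;
constexpr bool built_for_windows = false;
#endif

// Hypothetical settings object: callers can still override the default.
struct voice_settings {
    audio_send_mode send_mode = default_audio_send_mode;
};
```

Because only the initial value changes, code that sets the mode explicitly behaves the same on every platform.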

Jaskowicz1 · 2023-11-03T09:43:20Z

> Why don't we just default the audio type to overlapped on Windows? The CPU usage mentioned in the comment is minimal enough that nobody will notice. When they talk about it using more CPU, they mean game-dev levels of CPU usage, like a few extra nanoseconds...
>
> We could just wrap the default in #ifdef _WIN32.

Sounds even better to me! More than happy to approve that.

Jaskowicz1 · 2023-11-14T12:17:06Z

Sorry to necro, but PR #1004 now implements what we discussed, @braindigitalis.

KontraCity added the bug label Oct 30, 2023

KontraCity assigned braindigitalis Oct 30, 2023

Jaskowicz1 added the audio label Oct 31, 2023

Jaskowicz1 self-assigned this Nov 2, 2023

Jaskowicz1 removed their assignment Nov 2, 2023

KontraCity closed this as completed Nov 3, 2023

Jaskowicz1 mentioned this issue Nov 14, 2023

fix: changed default audio type for windows #1004

Merged
