whisperfile server: convert files without ffmpeg #568

cjpais · 2024-09-27T23:15:48Z

This PR allows the whisperfile server to convert .wav, .mp3, .flac, and .ogg files into the .wav format whisper expects (16-bit, 16000 Hz) without any dependency on ffmpeg.
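For readers unfamiliar with what "16-bit, 16000 Hz" involves, here is a rough sketch of the sample-level steps: downmix to mono, resample, and quantize to 16-bit PCM. It is not the code from this PR, and every function name in it is hypothetical. The resampler uses plain linear interpolation with no low-pass filter, so it only illustrates the idea; a real loader would decode the container first and use a proper resampler.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: average interleaved channels down to one mono channel.
std::vector<float> downmix_to_mono(const std::vector<float>& interleaved, int channels) {
    assert(channels > 0);
    const std::size_t ch = static_cast<std::size_t>(channels);
    const std::size_t frames = interleaved.size() / ch;
    std::vector<float> mono;
    mono.reserve(frames);
    for (std::size_t i = 0; i < frames; ++i) {
        float sum = 0.0f;
        for (std::size_t c = 0; c < ch; ++c) sum += interleaved[i * ch + c];
        mono.push_back(sum / static_cast<float>(channels));
    }
    return mono;
}

// Hypothetical helper: linear-interpolation resampler (no anti-alias filter).
std::vector<float> resample_linear(const std::vector<float>& in, int src_rate, int dst_rate) {
    assert(src_rate > 0 && dst_rate > 0);
    std::vector<float> out;
    if (in.empty()) return out;
    const std::size_t n = static_cast<std::size_t>(
        static_cast<std::uint64_t>(in.size()) * static_cast<std::uint64_t>(dst_rate) /
        static_cast<std::uint64_t>(src_rate));
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        // Position of output sample i measured in input samples.
        const double pos = static_cast<double>(i) * src_rate / dst_rate;
        const std::size_t i0 = static_cast<std::size_t>(pos);
        const std::size_t i1 = (i0 + 1 < in.size()) ? i0 + 1 : i0;
        const double frac = pos - static_cast<double>(i0);
        out.push_back(static_cast<float>(in[i0] * (1.0 - frac) + in[i1] * frac));
    }
    return out;
}

// Hypothetical helper: clamp floats to [-1, 1] and scale to signed 16-bit PCM.
std::vector<std::int16_t> to_pcm16(const std::vector<float>& in) {
    std::vector<std::int16_t> out;
    out.reserve(in.size());
    for (float x : in) {
        const float s = std::max(-1.0f, std::min(1.0f, x));
        out.push_back(static_cast<std::int16_t>(std::lround(s * 32767.0f)));
    }
    return out;
}
```

Chaining the three helpers as `to_pcm16(resample_linear(downmix_to_mono(samples, channels), rate, 16000))` would yield mono 16-bit samples at 16000 Hz, assuming `samples` holds decoded float audio.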

ffmpeg support remains available under the --convert flag.

The main change here is giving read_wav a filename instead of a buffer. Previously it was given a buffer when run through the server and a filename when run through the CLI. Now it always takes a filename.

In addition, is_wav_buffer was removed, since that code path is dead now that a filename is used throughout. The function always expected a buffer but was receiving both filenames and buffers.
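To illustrate why a buffer check misbehaves when handed a filename, here is a minimal sketch of a WAV header test. This is not the removed is_wav_buffer; `looks_like_wav` is a hypothetical name, and it only checks the RIFF/WAVE magic bytes at offsets 0 and 8.

```cpp
#include <string>

// Hypothetical check: does this byte buffer start with a RIFF/WAVE header?
// A WAV file begins with "RIFF", a 4-byte chunk size, then "WAVE".
bool looks_like_wav(const std::string& buf) {
    return buf.size() >= 12 &&
           buf.compare(0, 4, "RIFF") == 0 &&
           buf.compare(8, 4, "WAVE") == 0;
}
```

Given real file contents the check can succeed, but given a path such as "/tmp/audio.wav" it inspects the characters of the path itself and reports false, even when the file on disk is a valid WAV.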


jart

Thanks for doing this. Several people have requested this feature. read_wav() isn't ready to be used in this manner, but I can make it ready after merging this. I anticipate that it won't work if concurrent requests are sent to the server until I get rid of the global variable in common.cpp. This will obviously be fixed before the next release. So please don't publish any whisperfiles until we've had a chance to fix that.

We now have a new function slurp_audio_file() which replaces read_wav(). This function has simpler code, and allows us to avoid a temporary file. See #568

jart · 2024-09-28T05:52:05Z

whisper.cpp/server.cpp

            {
                fprintf(stderr, "error: failed to read WAV file\n");
                const std::string error_resp = "{\"error\":\"failed to read WAV file\"}";
                res.set_content(error_resp, "application/json");
                return;
            }
        }
+        // remove temp file
+        std::remove(temp_filename.c_str());


The function you want here is unlink().
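As a sketch of how the temp file cleanup might look with unlink(), assuming a POSIX environment: the helper and guard names below are hypothetical and not taken from server.cpp. The review does not state a reason for preferring unlink(); one known difference is that on POSIX systems remove() also deletes directories (via rmdir), while unlink() only removes a file name. The guard is a separate design choice: it runs on every exit path, whereas in the diff above the error branch appears to return before the cleanup line is reached.

```cpp
#include <stdlib.h>   // mkstemp (POSIX)
#include <unistd.h>   // write, close, unlink (POSIX)
#include <string>
#include <utility>

// Hypothetical helper: write data to a fresh temp file and return its path,
// or an empty string on failure. The /tmp prefix is an assumption.
std::string write_temp_file(const std::string& data) {
    char path[] = "/tmp/whisper-upload-XXXXXX";
    const int fd = mkstemp(path);  // creates and opens a uniquely named file
    if (fd == -1) return "";
    std::size_t off = 0;
    while (off < data.size()) {
        const ssize_t n = write(fd, data.data() + off, data.size() - off);
        if (n <= 0) {  // treat any short-circuit as failure in this sketch
            close(fd);
            unlink(path);
            return "";
        }
        off += static_cast<std::size_t>(n);
    }
    close(fd);
    return path;
}

// Hypothetical RAII guard: unlinks the file when it goes out of scope,
// so early returns cannot leak the temp file.
struct TempFileGuard {
    std::string path;
    explicit TempFileGuard(std::string p) : path(std::move(p)) {}
    ~TempFileGuard() {
        if (!path.empty()) unlink(path.c_str());
    }
    TempFileGuard(const TempFileGuard&) = delete;
    TempFileGuard& operator=(const TempFileGuard&) = delete;
};
```

In a request handler, constructing `TempFileGuard guard(temp_filename);` right after the file is written would remove it whether the handler returns normally or through an error branch.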


cjpais added 2 commits September 27, 2024 15:36

basic fix

f100f1a

remove is_wav_buffer which was assuming a buffer input, but it was a filename

5f9d52a

jart approved these changes Sep 28, 2024


jart merged commit 7517a5f into Mozilla-Ocho:main Sep 28, 2024
2 checks passed

jart added a commit that referenced this pull request Sep 28, 2024

Rewrite audio file loader code

beb2f19

We now have a new function slurp_audio_file() which replaces read_wav(). This function has simpler code, and allows us to avoid a temporary file. See #568


jart added a commit that referenced this pull request Sep 28, 2024

Rewrite audio file loader code

74dfd21


cjpais mentioned this pull request Oct 7, 2024

Bug: Failed to read audio file #556

Open
