Piping to stdin is failing on Windows #2023

Closed
rotemdan opened this issue Apr 5, 2024 · 2 comments · Fixed by #2025

Comments

@rotemdan
Contributor

rotemdan commented Apr 5, 2024

I can't get piping to stdin to work correctly on Windows.

Running type timit1.wav | main - reads the first 3898 bytes, but then input ends unexpectedly:

...
read_wav: read 3898 bytes from stdin

system_info: n_threads = 4 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 | CUDA = 1 | COREML = 0 | OPENVINO = 0 |

main: processing '-' (1949 samples, 0.1 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = en, task = transcribe, timestamps = 1 ...

...

The process tried to write to a nonexistent pipe.

There are also errors with ffmpeg (the same command line and wave file work on Linux):

ffmpeg -i timit1.wav -f wav -ar 16000 - | main -
error: failed to open WAV file from stdin
error: failed to read WAV file '-'

...

av_interleaved_write_frame(): Broken pipe
[out#0/wav @ 0000026847b05f40] Error muxing a packet
[out#0/wav @ 0000026847b05f40] Error writing trailer: Broken pipe
size=       4kB time=00:00:01.28 bitrate=  26.1kbits/s speed=3.41x
video:0kB audio:8kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Conversion failed!

I've also experienced the same problem when piping to stdin from a Node.js process. I can see that the pipe is closed too early, after only part of the data has been written.

It does succeed when given a 1-second silent wave file:

type empty.wav | main -
read_wav: read 35890 bytes from stdin

system_info: n_threads = 4 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 | CUDA = 1 | COREML = 0 | OPENVINO = 0 |

main: processing '-' (17945 samples, 1.1 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = en, task = transcribe, timestamps = 1 ...


[00:00:00.000 --> 00:00:10.000]   [BLANK_AUDIO]

The success here is possibly related to the file having all-zero samples (not sure).

Similar behavior seen when run from Node.js.

Possible cause

I've located this code that may be related:

In common.cpp -> read_wav():

    if (fname == "-") {
        {
            uint8_t buf[1024];
            while (true)
            {
                const size_t n = fread(buf, 1, sizeof(buf), stdin);
                if (n == 0) {
                    break;
                }
                wav_data.insert(wav_data.end(), buf, buf + n);
            }
        }

        if (drwav_init_memory(&wav, wav_data.data(), wav_data.size(), nullptr) == false) {
            fprintf(stderr, "error: failed to open WAV file from stdin\n");
            return false;
        }

        fprintf(stderr, "%s: read %zu bytes from stdin\n", __func__, wav_data.size());
    }

In a conversation with the Microsoft Copilot chatbot, it suggested that on Windows stdin may be opened in text mode by default, and that switching it to binary mode could fix the problem.

Copilot's suggestion

Given the behavior you’re describing, it does seem like the issue could be related to the way stdin is being handled on Windows. The fact that whisper.cpp reads a certain amount of data and then stops without an error suggests that it might be interpreting a part of the binary data as an EOF marker due to Windows’ default text mode for stdin.

Switching stdin to binary mode is a common solution for this kind of problem on Windows because it prevents the system from misinterpreting binary data as control characters (like the Ctrl+Z EOF marker in text mode). Since you’ve confirmed that _setmode, O_BINARY, and fileno are not present in the codebase, adding the binary mode setting could potentially resolve the issue.

Here’s how you can modify the code to set stdin to binary mode:

#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif

// ...

if (fname == "-") {
    #ifdef _WIN32
    _setmode(_fileno(stdin), _O_BINARY);
    #endif

    // ... rest of the code ...
}

This may not be the actual cause of the issue, but it's still a possibility worth checking.
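
For reference, the suggestion can be combined with the read loop into a self-contained sketch. The helper names set_stdin_binary and read_all are hypothetical and are not part of whisper.cpp; this is only an illustration of the idea, assuming the MSVC runtime's _setmode and _fileno are available on Windows:

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif

// Hypothetical helper (not in whisper.cpp): switch stdin to binary mode on
// Windows so bytes such as 0x1A are passed through untranslated.
// On other platforms this is a no-op.
static bool set_stdin_binary() {
#ifdef _WIN32
    return _setmode(_fileno(stdin), _O_BINARY) != -1;
#else
    return true;
#endif
}

// Hypothetical helper (not in whisper.cpp): append everything readable from
// `in` to `out`. Returns false if the stream reported a read error.
static bool read_all(FILE * in, std::vector<uint8_t> & out) {
    uint8_t buf[1024];
    size_t n;
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        out.insert(out.end(), buf, buf + n);
    }
    return ferror(in) == 0;
}
```

A caller would run set_stdin_binary() once before read_all(stdin, wav_data). Checking ferror after the loop also distinguishes a real read error from a normal end of input, which the original loop does not do.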

@rotemdan
Contributor Author

rotemdan commented Apr 5, 2024

I've made the suggested changes locally and verified that Copilot's code does seem to fix the issue.

I've tried longer wave files using type wavefile.wav | main -. They all succeeded with the fix and failed without.

I also tested ffmpeg -i wavefile.wav -f wav -ar 16000 - | main -. It now also works with various wave file lengths.

I'll need to investigate more thoroughly before I submit a pull request.

@tamo
Contributor

tamo commented Apr 5, 2024

Without #2025, fread on Windows stops reading stdin at the first 0x1a byte (Ctrl+Z, which text mode treats as EOF):

$ xxd samples/jfk.wav | head
00000000: 5249 4646 465f 0500 5741 5645 666d 7420  RIFFF_..WAVEfmt 
00000010: 1000 0000 0100 0100 803e 0000 007d 0000  .........>...}..
00000020: 0200 1000 4c49 5354 1a00 0000 494e 464f  ....LIST....INFO
                              ^^

Note: Even with the proposed change, PowerShell versions before 7.4 mangle piped data.
For "-f -", you have to use cmd.exe or pwsh >= 7.4.
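
To check where a given file would be cut off, a small sketch like the following locates the first 0x1a byte. The helper first_ctrl_z is hypothetical, and the byte array is simply the 48 bytes of samples/jfk.wav shown in the xxd output above:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// First 48 bytes of samples/jfk.wav, copied from the xxd output above.
static const uint8_t jfk_header[] = {
    0x52, 0x49, 0x46, 0x46, 0x46, 0x5f, 0x05, 0x00,
    0x57, 0x41, 0x56, 0x45, 0x66, 0x6d, 0x74, 0x20,
    0x10, 0x00, 0x00, 0x00, 0x01, 0x00, 0x01, 0x00,
    0x80, 0x3e, 0x00, 0x00, 0x00, 0x7d, 0x00, 0x00,
    0x02, 0x00, 0x10, 0x00, 0x4c, 0x49, 0x53, 0x54,
    0x1a, 0x00, 0x00, 0x00, 0x49, 0x4e, 0x46, 0x4f,
};

// Hypothetical helper: offset of the first 0x1A (Ctrl+Z) byte in the buffer,
// or `len` if the buffer contains none.
static size_t first_ctrl_z(const uint8_t * data, size_t len) {
    return static_cast<size_t>(std::find(data, data + len, uint8_t{0x1a}) - data);
}
```

For the bytes above, first_ctrl_z(jfk_header, sizeof(jfk_header)) returns 40 (0x28), the position marked with ^^ in the dump. A text-mode read would be expected to report end of file at about that point.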
