Speed up reading from in_stream #983

Open: @meshy wants to merge 2 commits into main from in_stream-sleep-speed

@meshy meshy commented Jan 20, 2024

By only sleeping when the input stream is waiting, this change ensures that we don't delay when there is still data available to read.

This provides a significant speed improvement to scripts which are passing a populated stream of data, rather than awaiting user input from stdin.
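
To illustrate the idea (this is a minimal sketch, not the code in this PR): the loop below forwards input and only sleeps when a read comes back empty. The names `forward_input`, `read_chunk`, and the `""`/`None` convention for "waiting" and "end of input" are assumptions made up for the example, not Invoke's API.

```python
import time


def forward_input(read_chunk, write, input_sleep=0.01, sleep=time.sleep):
    """Forward data from read_chunk() to write(), sleeping only when idle.

    read_chunk() is a hypothetical callable that returns:
      - a non-empty string when data is available,
      - "" when the stream is momentarily waiting for input,
      - None at end of input.
    Returns the number of times the loop slept.
    """
    sleeps = 0
    while True:
        chunk = read_chunk()
        if chunk is None:
            break  # end of input: stop forwarding
        if chunk:
            write(chunk)
            continue  # more data may already be ready, so skip the sleep
        sleep(input_sleep)  # nothing to read right now: back off briefly
        sleeps += 1
    return sleeps


# Simulated stream: two ready chunks, one idle poll, one more chunk, then EOF.
chunks = iter(["abc", "def", "", "ghi", None])
out = []
naps = []  # records each requested sleep instead of actually sleeping
sleep_count = forward_input(lambda: next(chunks), out.append, sleep=naps.append)
```

With a fully populated stream the loop never sleeps, so throughput is no longer capped at one chunk per sleep interval.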

I had to modify an existing test to ensure that it continued to see sleeps.

I have also changed the default chunk read size to match Python's default buffer size (`io.DEFAULT_BUFFER_SIZE`), which Python's docs relate to the “preferred” blocksize for efficient file system I/O. (Of course, I'm happy to revert this if it makes the PR more palatable.)

Fixes #774

This change means that we don't delay reading the input stream when
there is still data available to read from it.

This provides a significant speed improvement to scripts which are
passing a populated stream of data, rather than awaiting user input from
stdin.

See pyinvoke#774
@meshy meshy force-pushed the in_stream-sleep-speed branch from 52255a5 to 75fbed7 Compare January 20, 2024 19:24
I observed minor performance improvements by using this larger chunk
read size.

Python's docs describe `io.DEFAULT_BUFFER_SIZE` as:

> An int containing the default buffer size used by the module’s
> buffered I/O classes. open() uses the file’s blksize ...

The docs on `blksize` say:

> “Preferred” blocksize for efficient file system I/O. Writing to a file
> in smaller chunks may cause an inefficient read-modify-rewrite.

References:

- https://docs.python.org/3/library/io.html#io.DEFAULT_BUFFER_SIZE
- https://docs.python.org/3/library/os.html#os.stat_result.st_blksize
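
As a rough illustration of why chunk size matters (a sketch only, not a benchmark and not code from this PR): the snippet below counts how many `read()` calls it takes to drain an in-memory stream. The helper `count_reads`, the 100,000-byte payload, and the 1000-byte comparison size are all assumptions chosen for the example.

```python
import io


def count_reads(data, chunk_size):
    """Return how many non-empty read() calls are needed to drain `data`."""
    stream = io.BytesIO(data)
    reads = 0
    while stream.read(chunk_size):  # read() returns b"" once exhausted
        reads += 1
    return reads


payload = b"x" * 100_000  # arbitrary example payload

# 1000 is an illustrative "small" chunk size, picked only for comparison.
small = count_reads(payload, 1000)
# io.DEFAULT_BUFFER_SIZE varies by Python version and platform.
large = count_reads(payload, io.DEFAULT_BUFFER_SIZE)
```

Fewer reads means fewer loop iterations per byte, which is one plausible source of the minor improvement described above.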
@meshy changed the title from "Only sleep when input stream is waiting" to "Speed up reading from in_stream" Jan 22, 2024
Development

Successfully merging this pull request may close these issues.

Reading from in_stream is extremely slow by default