gh-90102: Remove isatty call during regular open #124922
Merged
+47
−5
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
TTYs are always character devices. If the interpreter knows a file is not a character device when it would call
isatty
, skip the isatty call. Insideopen()
in the same python library call there is a fresh stat result that contains that information. Use the stat result to skip a system call.This shortcut was suggested by @eryksun in 2021.
isatty
is not necessarily constant over time, but inside the python library call which opened the fd and has yet to return the fd, it seems reasonable to rely on the fd not changing. The fd may be visible to caller code which passes in anopener
or intercepts specific calls.For gh-120754, with this change reading text with
open('README.md').read()
is down to 6 system calls:When reading bytes with
buffering=0
(disabling BufferedIO which remoevs thelseek
), reading a regular file is down to 5 system calls (strace python -c "from pathlib import Path; Path('README.rst').read_bytes()"
)Benchmark
Run on a 2024 MacBook Air running Squoia 15.0
Benchmark code
Before:
After:
For files which are recently read / in OS filecache (or on fast devices), there is a ~15% performance overhead with
BufferedIO
(see: #122111 where I updatedPath.read_bytes()
and measured the change). Unfortunately multiple attempts I've made to do small reworks to make thelseek
unnecessary inBufferedIO
have resulted in no performance change and a lot of complexity.I have been developing a more broad refactoring that could reduce the
BufferedIO
overhead as well as several other I/O overheads while meeting current API expectations (ex. each layer of the stack re-figures outreadable
andwriteable
, every call must re-validate with many branches the fd state and arguments,_pyio
needs to copy at least once in user-space becauseos.read
can't be passed a buffer / always allocates a new one, etc.).I'm planning to put together a talk on "Journey to the center of
open().read()
" and submit it to present at a San Francisco bay area python meetup since as I've been working on this I've found it's a very intricate set of operations which didn't match my mental image in some interesting ways. Hope is to then do a second talk with sample working implementation which shows how could rework internals while keeping the existing Python I/O API to reduce overheads, increase readability, solve some longstanding bugs, and possibly enable usage ofio_uring
for more performance improvement. Overarching goal would be to get down to one largely python native I/O implementation with improved performance from the optimizations as well as opening new performance improvement avenues.@vstinner This replaces #121593 on top of #123412