Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid calling isatty() for most open() calls #90102

Closed
collinanderson mannequin opened this issue Dec 1, 2021 · 8 comments
Closed

Avoid calling isatty() for most open() calls #90102

collinanderson mannequin opened this issue Dec 1, 2021 · 8 comments
Labels
3.13 bugs and security fixes performance Performance or resource usage topic-IO

Comments

@collinanderson
Copy link
Mannequin

collinanderson mannequin commented Dec 1, 2021

BPO 45944
Nosy @pitrou, @benjaminp, @serhiy-storchaka, @eryksun, @collinanderson
PRs
  • bpo-45944: Avoid calling isatty() for most open() calls #29870
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2021-12-01.05:08:15.763>
    labels = ['3.11', 'expert-IO', 'performance']
    title = 'Avoid calling isatty() for most open() calls'
    updated_at = <Date 2021-12-01.11:53:13.388>
    user = 'https://github.com/collinanderson'

    bugs.python.org fields:

    activity = <Date 2021-12-01.11:53:13.388>
    actor = 'eryksun'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['IO']
    creation = <Date 2021-12-01.05:08:15.763>
    creator = 'collinanderson'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 45944
    keywords = ['patch']
    message_count = 3.0
    messages = ['407427', '407434', '407444']
    nosy_count = 6.0
    nosy_names = ['pitrou', 'benjamin.peterson', 'stutzbach', 'serhiy.storchaka', 'eryksun', 'collinanderson']
    pr_nums = ['29870']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue45944'
    versions = ['Python 3.11']

    Linked PRs

    @collinanderson collinanderson mannequin added 3.7 (EOL) end of life 3.8 (EOL) end of life 3.9 only security fixes 3.10 only security fixes 3.11 only security fixes topic-IO performance Performance or resource usage labels Dec 1, 2021
    @collinanderson
    Copy link
    Mannequin Author

    collinanderson mannequin commented Dec 1, 2021

    isatty() is a system call on linux. Most open()s are files, and we're already getting the size of the file. If it has a size, then we know it's not a atty, and can avoid calling it.

    @serhiy-storchaka
    Copy link
    Member

    What if change FileIO.isatty() instead? If make it returning False without invoking a system call if the file size is non-zero it will eliminate the need to expose _size.

    @serhiy-storchaka serhiy-storchaka removed 3.7 (EOL) end of life 3.8 (EOL) end of life 3.9 only security fixes 3.10 only security fixes labels Dec 1, 2021
    @eryksun
    Copy link
    Contributor

    eryksun commented Dec 1, 2021

    make it returning False without invoking a system call if the file
    size is non-zero it will eliminate the need to expose _size.

    I suggest using the file type instead of the size. There's no reason to call isatty() if it's not an S_IFCHR file. This will avoid calling isatty() on regular files that happen to be empty.

    In Windows, isatty(fd) is based solely on the file type, which is flagged in the fd record when a file descriptor is opened for a native file handle. It's not a system call, but it's also nearly worthless for how isatty() is typically used, since it's true for any S_IFCHR file (e.g. con, nul, com1).

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @iritkatriel iritkatriel added 3.12 bugs and security fixes and removed 3.11 only security fixes labels Sep 7, 2022
    serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Nov 28, 2023
    Only invoke the isatty() system call for character devices and
    cache the result.
    @serhiy-storchaka
    Copy link
    Member

    #112495 is a simpler PR that implements the idea proposed by @eryksun. It also caches the isatty() result for subsequent calls.

    @serhiy-storchaka serhiy-storchaka added 3.13 bugs and security fixes and removed 3.12 bugs and security fixes labels Nov 28, 2023
    @serhiy-storchaka
    Copy link
    Member

    One test fails with #112495 if the stdin is not a terminal. For example:

    $ ./python -m test -v test_builtin -m test_input_no_stdout_fileno <python
    ...
    test test_builtin failed -- Traceback (most recent call last):
      File "/home/serhiy/py/cpython/Lib/test/test_builtin.py", line 2393, in test_input_no_stdout_fileno
        self.assertSequenceEqual(lines, expected)
        ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
    AssertionError: Sequences differ: ['stdin.isatty(): False', "captured: 'prompt'"] != ('stdin.isatty(): True', "captured: 'prompt'")
    
    First differing element 0:
    'stdin.isatty(): False'
    'stdin.isatty(): True'
    
    - ['stdin.isatty(): False', "captured: 'prompt'"]
    ? ^                 ^^^^                        ^
    
    + ('stdin.isatty(): True', "captured: 'prompt'")
    ? ^                 ^^^                        ^
    

    This is because the result of isatty() is not constant during the lifetime of the file object. The file descriptor can be reassigned to other file or device, so it should not be cached.

    #29870 uses the cached value only in open(), immediately after creating the FileIO object, so it is not affected by this. But it is more complex, and making it more portable will make it even more complex. How much time it saves?

    @vstinner
    Copy link
    Member

    PR gh-112495 was closed. @serhiy-storchaka wrote:

    It does not work, because the file descriptor can be reassigned from tty to non-tty and vice versa. The result of isatty() can not be cached.

    Should we close the issue?

    @cmaloney
    Copy link
    Contributor

    I think it's okay to rely on the fd not being modified inside of open() which is what #121593 does. It doesn't cache isatty generally (any future call still does a isatty libc call), just says "we can skip this call/check during open if we already have enough information from stat that the fd isn't a TTY".

    cmaloney added a commit to cmaloney/cpython that referenced this issue Sep 10, 2024
    cmaloney added a commit to cmaloney/cpython that referenced this issue Sep 10, 2024
    cmaloney added a commit to cmaloney/cpython that referenced this issue Oct 3, 2024
    cmaloney added a commit to cmaloney/cpython that referenced this issue Oct 3, 2024
    vstinner added a commit that referenced this issue Oct 8, 2024
    Co-authored-by: Victor Stinner <vstinner@python.org>
    cmaloney added a commit to cmaloney/cpython that referenced this issue Oct 8, 2024
    Spotted by @ngnpope.
    
    `isatty` returns False to indicate the file is not a TTY. The C
    implementation of _io does that (`Py_RETURN_FALSE`) but I got the
    bool backwards in the _pyio implementaiton.
    vstinner pushed a commit that referenced this issue Oct 8, 2024
    Spotted by @ngnpope.
    
    `isatty` returns False to indicate the file is not a TTY. The C
    implementation of _io does that (`Py_RETURN_FALSE`) but I got the
    bool backwards in the _pyio implementaiton.
    @vstinner
    Copy link
    Member

    vstinner commented Oct 8, 2024

    Optimization implemented by change cc9b9be.

    @vstinner vstinner closed this as completed Oct 8, 2024
    cmaloney added a commit to cmaloney/cpython that referenced this issue Oct 8, 2024
    efimov-mikhail pushed a commit to efimov-mikhail/cpython that referenced this issue Oct 9, 2024
    Co-authored-by: Victor Stinner <vstinner@python.org>
    efimov-mikhail pushed a commit to efimov-mikhail/cpython that referenced this issue Oct 9, 2024
    Spotted by @ngnpope.
    
    `isatty` returns False to indicate the file is not a TTY. The C
    implementation of _io does that (`Py_RETURN_FALSE`) but I got the
    bool backwards in the _pyio implementaiton.
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.13 bugs and security fixes performance Performance or resource usage topic-IO
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants