Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

better name for re.error Exception class. #83162

Closed
Carreau mannequin opened this issue Dec 5, 2019 · 22 comments
Closed

better name for re.error Exception class. #83162

Carreau mannequin opened this issue Dec 5, 2019 · 22 comments
Labels
3.13 bugs and security fixes easy topic-regex type-feature A feature request or enhancement

Comments

@Carreau
Copy link
Mannequin

Carreau mannequin commented Dec 5, 2019

BPO 38981
Nosy @gpshead, @ezio-melotti, @serhiy-storchaka, @Carreau, @Sourabh025
PRs
  • bpo-38981: Rename re.error to re.ReCompileError for better readability. #17501
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2019-12-05.16:56:07.880>
    labels = ['expert-regex', 'easy', 'type-feature', '3.9']
    title = 'better name for re.error Exception class.'
    updated_at = <Date 2020-09-18.07:39:36.123>
    user = 'https://github.com/Carreau'

    bugs.python.org fields:

    activity = <Date 2020-09-18.07:39:36.123>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Regular Expressions']
    creation = <Date 2019-12-05.16:56:07.880>
    creator = 'mbussonn'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 38981
    keywords = ['patch', 'easy']
    message_count = 11.0
    messages = ['357867', '357878', '357882', '357883', '357884', '357886', '357962', '357982', '357986', '358059', '358124']
    nosy_count = 6.0
    nosy_names = ['gregory.p.smith', 'ezio.melotti', 'mrabarnett', 'serhiy.storchaka', 'mbussonn', 'sourabh025']
    pr_nums = ['17501']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue38981'
    versions = ['Python 3.9']

    Linked PRs

    @Carreau
    Copy link
    Mannequin Author

    Carreau mannequin commented Dec 5, 2019

    better error/exception name for re.compile error.

    Currently the error raise by re.compile when it fails to compile is error defined in sre_constants.py:

    class error(Exception):
        """Exception raised for invalid regular expressions.
    
    

    This is quite disturbing as most exception start with an uppercase and have a tiny bit more descriptive name.

    Would it be possible to have it renamed as something more explicit like ReCompileError, and still keeping the potential error alias as deprecated ?

    @Carreau Carreau mannequin added the topic-regex label Dec 5, 2019
    @serhiy-storchaka
    Copy link
    Member

    It is common practice that the module specific exception is called just "error". There is nothing wrong with this.

    I do not see a need to introduce a different alias.

    @serhiy-storchaka
    Copy link
    Member

    See also the discussion about renaming json.loads(): https://mail.python.org/archives/list/python-ideas@python.org/thread/EJTIVQ2ZFSVHALTLRGFCOMOYGZYMKGQU/

    @Carreau
    Copy link
    Mannequin Author

    Carreau mannequin commented Dec 5, 2019

    Most of the module specific classes are Error, not error, at least with an uppercase E you know it's a class.

    if a novice sees :

    error: missing ), unterminated subpattern at position 0

    It will be relatively tough or them to figure out that error is the type of the exception.

    Also it's not because something works that you can't improve it ...

    @Carreau Carreau mannequin reopened this Dec 5, 2019
    @serhiy-storchaka
    Copy link
    Member

    Since it affects more than one module I suggest to discuss the idea about renaming exceptions of the Python-Ideas maillist first. Until different decision be made I am closing. Personally I think this is a duplicate of just discussed and rejected idea.

    @Carreau
    Copy link
    Mannequin Author

    Carreau mannequin commented Dec 5, 2019

    Thanks for the advice I've done that !

    Have a good day.

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka added 3.9 only security fixes type-feature A feature request or enhancement labels Dec 7, 2019
    @Carreau
    Copy link
    Mannequin Author

    Carreau mannequin commented Dec 7, 2019

    Thanks Serhiy,

    Here is a rough idea of how many places would be touched by renaming in the re module:

    Carreau@59e4c51 (50 additions and 42 deletions.).

    I haven't found any places that need changes in the C code, and need to polish documentation, rebuild and run the test suite before sending a PR.

    @gpshead
    Copy link
    Member

    gpshead commented Dec 7, 2019

    Strictly speaking not all of those _need_ to be touched given the old name is always going to exist for backwards compatibility. But I agree that we should update them as part of this regardless.

    I'd go forward with a PR.

    The only fallout I expect a change like this to have on users is in the very odd test scenario where someone has hardcoded the error name in a string. That is rare, especially for an re.error which is generally not an expected exception.

    @serhiy-storchaka
    Copy link
    Member

    I am not sure about the new name. "re" is an abbreviation, so if include it in the exception name it should be "RE". I am not sure what name is better: RECompileError, REParseError, RESyntaxError, REError, CompileError, ParseError, SyntaxError or Error.

    json raises JSONDecodeError, ElementTree raises ParseError, other xml modules raise ExpatError, csv raises Error, configparser raises subclasses of Error.

    Many modules (at least 18: aifc, binhex, concurrent.futures, configparser, copy, cvs, ftplib, locale, mailbox, shutil, sqlite, sunau, test.support, uu, wave, webbrowser, xdrlib, xmlrpc.client) have an exception named just Error for module-specific errors.

    @Carreau
    Copy link
    Mannequin Author

    Carreau mannequin commented Dec 9, 2019

    RECompileError, REParseError, RESyntaxError, REError, CompileError, ParseError, SyntaxError or Error,

    Many modules [...] have an exception named just Error

    RECompileError, REParseError, RESyntaxError, REError, CompileError, ParseError are all fine with me.

    SyntaxError would be super confusing IMHO.
    Remember the StackTrace does not get the fully qualified name, so it would be hard to distinguish from a SyntaxError with the Python Syntax.

    aifc, binhex, sunau, uu, xdrlib might be removed with PEP-594, And I find Error not informative enough. It suffers from the same issues as above, as stack traces do not have the full qualified name.

    I would also add that being able to search and find all occurrences of a given exceptions is useful, and that Error is too generic.

    Let me know your choice and I can rename.

    @sourabh025 sourabh025 mannequin added type-security A security issue and removed type-feature A feature request or enhancement labels Sep 16, 2020
    @serhiy-storchaka serhiy-storchaka added type-feature A feature request or enhancement and removed type-security A security issue labels Sep 18, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @hugovk
    Copy link
    Member

    hugovk commented Apr 11, 2022

    aifc, binhex, sunau, uu, xdrlib might be removed with PEP-594

    Yes, aifc, sunau, uu and xdrlib made the list and will be deprecated in 3.11 and removed in 3.13.

    See PEP 594 – Removing dead batteries from the standard library and #91217.

    binhex isn't part of the PEP 594 because it was already deprecated in 3.9 and has been removed from 3.11:

    https://docs.python.org/3.11/whatsnew/3.11.html?highlight=binhex#removed


    Edit: pinging the nosy list after the bpo migration:

    @gpshead, @ezio-melotti, @serhiy-storchaka, @Carreau, @Sourabh025

    @AA-Turner AA-Turner added 3.12 bugs and security fixes and removed 3.9 only security fixes labels Jun 7, 2022
    @achhina
    Copy link
    Contributor

    achhina commented Jan 31, 2023

    @Carreau I see that you've closed your PR; would you still like to work on this? It seems that your PR was mostly done, and just required finalizing the actual exception name.

    @achhina
    Copy link
    Contributor

    achhina commented Feb 9, 2023

    I've brought the above patch in line with the current main branch as a draft PR - thanks again @Carreau. However, before I make any changes I'd like to see if we can agree on the new exception name.

    Looking at other popular libraries across several languages, there are several variations such as Java's PatternSyntaxException, but the ones that appear the most often seem to be Error, SyntaxError, and CompileError. I believe these without prefixes would not be much clearer than the original exception, so they should be prefixed with uppercase RE as it's an abbreviation.

    This leaves us with REError, RESyntaxError, and RECompileError. Admittedly subjectively, I prefer RESyntaxError and RECompileError over REError, as they seem to read better and as such provide more clarity.

    I imagine users will most likely see this exception due to syntax issues with the expression, and some libraries make the distinction that a compilation error is related to an invalid encoding vs. a syntax error. So with that in mind, plus @barneygale's point that it could be read as recompile error, I'd like to propose using RESyntaxError.

    cc @serhiy-storchaka, @gpshead as you've participated in these discussions before as well.

    @gpshead
    Copy link
    Member

    gpshead commented Feb 9, 2023

    I'm not feeling like there is a lot of value in doing this anymore.

    If I were choosing a new name my logic would look like:

    • I expect code to always access the exception from the module itself, I'd leave the "RE" prefix off of the exception name. Which suggests a simple re.Error to gain the uppercase spelling consistent with exception classes elsewhere.
    • But doing that change now is disruptive to code that unfortunately makes the poor choice to from re import * and does so later in life... so Error would then import and overwrite their own Error class. Plausibly breaking existing code. So loop back and put the prefix on it again.
    • We're back to the needlessly repetitive re.RESomeError spelling again, which is objectively ugly and unpleasing to type compared to re.error or re.Error. infinite decision loop.

    So I expect many will continue to type re.error regardless of the new name... My gut feeling is just to close this issue as "not planned". It's a historical oddity of the language from before a consistent style was well realized.

    @achhina
    Copy link
    Contributor

    achhina commented Feb 9, 2023

    I think for me when looking at this issue, the benefit of introducing this would be the additional clarity for beginners, as an error like this

    RESyntaxError: nothing to repeat at position 0

    would be more clear than what we have currently

    error: nothing to repeat at position 0

    I expect code to always access the exception from the module itself

    Currently that would be correct as from re import error would seem like a poor choice, however I've quite often seen from json import JSONDecodeError, and I'd imagine similarly people can make that decision with re.

    which is objectively ugly and unpleasing to type

    I can understand that, but as you mentioned for those folks re.error will always be available, so this should have a very low impact for them while providing a more informative exception name.

    @barneygale
    Copy link
    Contributor

    We have re.Pattern, why not re.PatternError? :)

    achhina added a commit to achhina/cpython that referenced this issue May 29, 2023
    @achhina
    Copy link
    Contributor

    achhina commented May 29, 2023

    We have re.Pattern, why not re.PatternError? :)

    I think PatternError would work well as well :). Both SyntaxError/PatternError would be good proxies to imply that there is an issue with the regular expression syntax (the most likely reason to see an error). IIRC when I was looking at other regex libraries in/outside of Python, some form of SyntaxError seemed to be more popular. So, that combined with if we're suggesting there's an issue with the regular expression syntax, I leaned towards SyntaxError.

    What I do like about PatternError is the possibility it adds to not have a prefix of RE, which makes it more pleasant to type and read. Although I'm leaning towards keeping the prefix, as I find it adds more clarity and seems to be in line with the popular exception format of <library><error_type>.

    So what I'm thinking is this:

    1. If we want to keep the prefix RE, then I'd suggest going with RESyntaxError.
    2. Otherwise, PatternError.

    Thoughts?

    achhina added a commit to achhina/cpython that referenced this issue Jun 4, 2023
    @terryjreedy
    Copy link
    Member

    I prefer PatternError over RESyntaxError over RECompileError

    @gpshead gpshead added 3.13 bugs and security fixes and removed 3.12 bugs and security fixes labels Jun 5, 2023
    achhina added a commit to achhina/cpython that referenced this issue Jun 6, 2023
    achhina added a commit to achhina/cpython that referenced this issue Jun 7, 2023
    achhina added a commit to achhina/cpython that referenced this issue Aug 27, 2023
    achhina added a commit to achhina/cpython that referenced this issue Aug 27, 2023
    achhina added a commit to achhina/cpython that referenced this issue Dec 3, 2023
    achhina added a commit to achhina/cpython that referenced this issue Dec 3, 2023
    achhina added a commit to achhina/cpython that referenced this issue Dec 3, 2023
    terryjreedy added a commit that referenced this issue Dec 11, 2023
    Renamed re.error for clarity, and kept re.error for backward compatibility.
    Updated idlelib files at TJR's request.
    ---------
    
    Co-authored-by: Matthias Bussonnier <mbussonnier@ucmerced.edu>
    Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
    Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
    @terryjreedy
    Copy link
    Member

    I merged the near-minimal PR with re.error 'updated' only in idlelib. I will do adjusted backports of the idlelib changes to keep the main bodies of its files in sync and then close.

    terryjreedy added a commit to terryjreedy/cpython that referenced this issue Dec 12, 2023
    terryjreedy added a commit that referenced this issue Dec 12, 2023
    Backport idlelib part of #101677 with simple rename.
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Dec 12, 2023
    …ythonGH-112987)
    
    Backport idlelib part of pythonGH-101677 with simple rename.
    (cherry picked from commit fd3b894)
    
    Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
    terryjreedy added a commit that referenced this issue Dec 12, 2023
    …2987) (#113013)
    
    Backport idlelib part of GH-101677 with simple rename.
    (cherry picked from commit fd3b894)
    
    Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
    aisk pushed a commit to aisk/cpython that referenced this issue Feb 11, 2024
    Renamed re.error for clarity, and kept re.error for backward compatibility.
    Updated idlelib files at TJR's request.
    ---------
    
    Co-authored-by: Matthias Bussonnier <mbussonnier@ucmerced.edu>
    Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
    Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
    Glyphack pushed a commit to Glyphack/cpython that referenced this issue Sep 2, 2024
    Renamed re.error for clarity, and kept re.error for backward compatibility.
    Updated idlelib files at TJR's request.
    ---------
    
    Co-authored-by: Matthias Bussonnier <mbussonnier@ucmerced.edu>
    Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
    Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.13 bugs and security fixes easy topic-regex type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    8 participants