Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an autocommit property to sqlite3.Connection with a PEP 249 compliant manual commit mode and migrate #83638

Closed
maggyero mannequin opened this issue Jan 26, 2020 · 40 comments · Fixed by #93823
Closed
Labels
stdlib Python modules in the Lib dir topic-sqlite3 type-feature A feature request or enhancement

Comments

@maggyero
Copy link
Mannequin

maggyero mannequin commented Jan 26, 2020

BPO 39457
Nosy @malemburg, @bitdancer, @maggyero, @erlend-aasland

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2020-01-26.19:17:09.219>
labels = ['type-feature', 'library', '3.9']
title = 'Add an autocommit property to sqlite3.Connection with a PEP 249 compliant manual commit mode and migrate'
updated_at = <Date 2021-01-06.13:21:56.616>
user = 'https://github.com/maggyero'

bugs.python.org fields:

activity = <Date 2021-01-06.13:21:56.616>
actor = 'zzzeek'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2020-01-26.19:17:09.219>
creator = 'maggyero'
dependencies = []
files = []
hgrepos = []
issue_num = 39457
keywords = []
message_count = 12.0
messages = ['360732', '360734', '360735', '384383', '384397', '384414', '384417', '384421', '384425', '384485', '384499', '384501']
nosy_count = 7.0
nosy_names = ['lemburg', 'ghaering', 'r.david.murray', 'zzzeek', 'maggyero', 'erlendaasland', 'james.oldfield']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue39457'
versions = ['Python 3.9']

@maggyero
Copy link
Mannequin Author

maggyero mannequin commented Jan 26, 2020

In non-autocommit mode (manual commit mode), the sqlite3 database driver implicitly issues a BEGIN statement before each DML statement (INSERT, UPDATE, DELETE, REPLACE) not already in a database transaction, BUT NOT before DDL statements (CREATE, DROP) nor before DQL statements (SELECT) (cf. https://github.com/python/cpython/blob/master/Modules/_sqlite/cursor.c#L480):

    /* We start a transaction implicitly before a DML statement.
       SELECT is the only exception. See python/cpython#54133. */
    if (self->connection->begin_statement && self->statement->is_dml) {
        if (sqlite3_get_autocommit(self->connection->db)) {
            result = _pysqlite_connection_begin(self->connection);
            if (!result) {
                goto error;
            }
            Py_DECREF(result);
        }
    }

Like Mike Bayer explained in issue bpo-9924, this is not what other database drivers do, and this is not PEP-249 compliant (Python Database API Specification v2.0), as its author Marc-André Lemburg explained (cf. https://mail.python.org/pipermail/db-sig/2010-September/005645.html):

Randall Nortman wrote:
PEP-249 says that transactions end on commit() or rollback(), but it
doesn't explicitly state when transactions should begin, and there is
no begin() method.

Transactions start implicitly after you connect and after you call
.commit() or .rollback(). They are not started for each statement.

I think the implication is that transactions begin
on the first execute(), but that's not explicitly stated. At least
one driver, pysqlite2/sqlite3, does not start a transaction for a
SELECT statement. It waits for a DML statement (INSERT, UPDATE,
DELETE) before opening a transaction. Other drivers open transactions
on any statement, including SELECT.

My question for the DB-SIG is: Can I call it a bug in pysqlite2 that
it does not open transactions on SELECT? Should the spec be amended
to make this explicit? Or are both behaviors acceptable, in which
case perhaps a begin() method needs to be added for when the user
wants control over opening transactions?

I should probably add a note to PEP-249 about this.

Aymeric Augustin said in issue bpo-10740:

While you're there, it would be cool to provide "connection.autocommit = True" as an API to enable autocommit, because "connection.isolation_level = None" isn't a good API at all -- it's very obscure and has nothing to do with isolation level whatsoever.

So I suggest that we introduce a new autocommit property and use it to enable a truly PEP-249 compliant manual commit mode (that is to say with transactions starting implicitly after connect(), commit() and rollback() calls, allowing transactional DDL and DQL):

autocommit = True  # enable the autocommit mode
autocommit = False  # disable the autocommit mode (enable the new PEP 249 manual commit mode)
autocommit = None  # fallback to the commit mode set by isolation_level

I also suggest that we use this new PEP-249 manual commit mode (with transactional DDL and DQL) by default and drop the old manual commit mode (without transactional DDL and DQL). We could use the following migration strategy:

  1. During the deprecation period:
  • Add the new autocommit property with the value None by default, so that the old manual commit mode is still the default.
  • Add a deprecation warning for the value None of the autocommit property, in favor of the other values True and False. It will prompt users who enabled the autocommit mode with isolation_level = None to use autocommit = True instead, and users who disabled the autocommit mode (that is to say users who enabled the old manual commit mode) with isolation_level = DEFERRED/IMMEDIATE/EXCLUSIVE to use autocommit = False instead AND add to their code the potentially missing commit() calls required by the new PEP-249 manual commit mode.
  1. After the deprecation period:
  • Set the value of the autocommit property to False by default, so that the new PEP-249 manual commit mode becomes the new default.
  • Remove the value None of the autocommit property and its deprecation warning.
  • Remove the value None of the isolation_level property, so that the old manual commit mode disappears.

@maggyero maggyero mannequin added 3.8 (EOL) end of life stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Jan 26, 2020
@maggyero maggyero mannequin changed the title Add an autocommit property to sqlite3.Connection with truly PEP 249 compliant manual commit mode and migrate Add an autocommit property to sqlite3.Connection with a PEP 249 compliant manual commit mode and migrate Jan 26, 2020
@maggyero
Copy link
Mannequin Author

maggyero mannequin commented Jan 26, 2020

  • Remove the value None of the isolation_level property, so that the old manual commit mode disappears.

Correction:

  • Remove the value None of the isolation_level property.

@SilentGhost SilentGhost mannequin added 3.9 only security fixes and removed 3.8 (EOL) end of life labels Jan 26, 2020
@maggyero
Copy link
Mannequin Author

maggyero mannequin commented Jan 26, 2020

  • Remove the value None of the autocommit property and its deprecation warning.

Correction:

  • Remove the value None of the autocommit property and its deprecation warning, so that the old manual commit mode disappears.

@jamesoldfield
Copy link
Mannequin

jamesoldfield mannequin commented Jan 5, 2021

If this ever gets implemented, "autocommit" would be a terrible name for it. That word has a very specific meaning in SQLite, which is essentially the same as "not in a transaction started with BEGIN ...". At the moment, if you want to explicitly control when transactions start (a good idea considering how confusing the current behaviour is) then you would set isolation_mode to None and manually start a transaction with execute("BEGIN") - at which point you are NOT in autocommit mode, until you commit or rollback. According to this proposal, if I want manual control over transactions, I would set conn.autocommit = True, even though I don't want autocommit mode (according to SQLite's definition)!

@maggyero
Copy link
Mannequin Author

maggyero mannequin commented Jan 5, 2021

If this ever gets implemented, "autocommit" would be a terrible name for it. That word has a very specific meaning in SQLite, which is essentially the same as "not in a transaction started with BEGIN ...".

Yes if you are talking about SQLite, the database ENGINE: the SQL statements inside BEGIN and COMMIT are said to be in manual commit mode, while SQL statements outside are said to be in autocommit mode. So the autocommit mode is the default mode for database ENGINES.

But here I am talking about SQLite3, the Python database DRIVER. You do not issue BEGIN statements with database DRIVERS, they are issued implicitly, so that the manual mode is the default mode for database DRIVERS.

Cf. this Stack Overflow answer for more details: https://stackoverflow.com/a/48391535/2326961

At the moment, if you want to explicitly control when transactions start (a good idea considering how confusing the current behaviour is)

No, you do not want that at the database DRIVER level. Because like Mike Bayer explained in issue bpo-9924, this is not what other database DRIVERS do, and this is not PEP-249 compliant (Python Database API Specification v2.0), as its author Marc-André Lemburg explained (cf. https://mail.python.org/pipermail/db-sig/2010-September/005645.html):

Randall Nortman wrote:

PEP-249 says that transactions end on commit() or rollback(), but it

doesn't explicitly state when transactions should begin, and there is

no begin() method.

Transactions start implicitly after you connect and after you call .commit() or .rollback(). They are not started for each statement.

@jamesoldfield
Copy link
Mannequin

jamesoldfield mannequin commented Jan 5, 2021

Yes if you are talking about SQLite, the database ENGINE

I sure was! In this comment I will stick to saying either "SQLite engine" or "sqlite3 driver" as appropriate, hopefully that will be clearer.

But here I am talking about SQLite3, the Python database DRIVER

Yep, I was aware of that. I was trying to say, please don't use the word "autocommit" in the sqlite3 driver when that word has a related but different meaning in the SQLite engine.

You do not issue BEGIN statements with database DRIVERS, they are issued implicitly, so that the manual mode is the default mode for database DRIVERS.

This sentence isn't literally true for several reasons (you say "you do not" but I certainly do, you use of "with database drivers" is dubious, and you seem to have causality in the wrong direction). I think there might be a bit of a language barrier here, so I hope you don't mind if I leave this to one side.

Cf. this Stack Overflow answer for more details: https://stackoverflow.com/a/48391535/2326961

I am fully, and painfully, aware of when the sqlite3 driver code will automatically issue BEGIN statements to the engine. I have no need to read StackOverflow answers about it, I have read the C source code to sqlite3 (and pysqlite) directly. I spent more time than I care to admit recently doing that! In fact that happened as a result of reading several confusing StackOverflow answers about transactions (maybe I'll write my own and add to the confusion...)

What that answer doesn't mention is that, even with even with isolation_mode=None, it's perfectly possible to start a transaction, which takes the SQLite engine out of autocommit mode. This is fully and intentionally supported by the sqlite3 driver, and the original author has said so and even recommended. For example, let's look at this code:

    conn = sqlite3.connect(path, isolation_mode=None)
    conn.execute("INSERT INTO test (i) VALUES (?)", (1,))  # stmt 1
    foo = conn.execute("SELECT * FROM test").fetchall()    # stmt 2
    conn.execute("BEGIN")                                  # stmt 3
    conn.execute("INSERT INTO test (i) VALUES (?)", (4,))  # stmt 4
    bar = conn.execute("SELECT * FROM test").fetchall()    # stmt 5
    conn.execute("COMMIT")                                 # stmt 6

Statement 1 and statement 2 execute using the SQLite engine's autocommit mode. Statements 3 through to 5 execute in a single transaction and do *not* use the SQLite engine's autocommit mode. (Technically statement 6 actually does use autocommit because COMMIT uses the autocommit mechanism under the hood ... but let's forget about that!)

Under your proposal, the first line would be changed to say "autocommit=True", even though not all the code below is in autocommit mode (according to the SQLite engine's definition). What's more, I could insert this line of code between statements 3 and 6:

    print("Autocommit mode?", conn.autocommit)

And it would print True even though autocommit mode is off!

Now, maybe your reaction is that "autocommit mode *in the driver*" can have a different meaning from "autocommit mode *in the engine*". Yes, it can, but that doesn't mean it should! Please, just pick a different name! For example, say "manual mode" (instead of autocommit=True) or "auto-start-transaction mode" (instead of autocommit=False).

No, you do not want that at the database DRIVER level. Because like Mike Bayer explained in issue bpo-9924, this is not what other database DRIVERS do, and this is not PEP-249 compliant

The "that" you are referring to here was when I said that I prefer to set isolation_level = None, like the above code snippet. Do not tell me that it is not what I want; it certainly IS what I want! I do not want the sqlite3 driver getting in the way between me and the SQLite engine. Many future users of the sqlite3 driver are likely to feel the same way, and the API should allow that to happen clearly.

@malemburg
Copy link
Member

I think there's a bit of a misunderstanding here. When relying on
a DB-API driver's transaction API, you are not allowed to issue
separate transaction commands to the DB backend via the .execute()
methods. You have to use conn.commit() and conn.rollback().

The DB-API wants drivers to have connections default to transactional
behavior and implicitly start transactions when opening the connection
as well as when ending one (via .commit() or .rollback()) - provided the
backend does support transactions.

It also suggests that a method may be used to set the transactional
behavior.

Now, historically, many drivers have not always used methods for
this, but instead (or in addition) provide a property
connection.autocommit to allow setting or querying the current
state. Over time, this has become more or less a standard.

Aside: This is a bit unfortunate, since users would not expect
exceptions from such properties (e.g. network errors), but this where
things have moved and it's hard to change.

I guess the SQLite driver does not start a new transaction for
SELECTs, since these are usually read-only, but don't know whether
this still holds today (e.g. think of UDFs running INSERTs or UPDATEs).

For the same reason, removing the SELECT "optimization" may cause
a backwards incompatible change, which can be tricky to identify
and cause corruption of data (in this case, data not written to
the database, where it previously was written).

@maggyero
Copy link
Mannequin Author

maggyero mannequin commented Jan 5, 2021

@james.oldfield

What that answer doesn't mention is that, even with even with isolation_mode=None, it's perfectly possible to start a transaction, which takes the SQLite engine out of autocommit mode.

Exactly, so since autocommit=True is equivalent to isolation_mode=None I do not see why you the name ‘autocommit’ would be a problem. As you said, when you issue BEGIN, you leave autocommit mode.

Under your proposal, the first line would be changed to say "autocommit=True", even though not all the code below is in autocommit mode (according to the SQLite engine's definition).

What is the difference with isolation_mode=None which also means autocommit mode?

What's more, I could insert this line of code between statements 3 and 6:
print("Autocommit mode?", conn.autocommit)
And it would print True even though autocommit mode is off!

No, because the autocommit property would be automatically updated to False at conn.execute("BEGIN"), which is the standard behaviour as @lemburg explained.

@lemburg

I guess the SQLite driver does not start a new transaction for SELECTs, since these are usually read-only

Nor for DDL statements (CREATE, DROP).

For the same reason, removing the SELECT "optimization" may cause
a backwards incompatible change, which can be tricky to identify
and cause corruption of data (in this case, data not written to
the database, where it previously was written).

Since DQL statements (SELECT) are read-only, maybe we could keep the optimization and start transactions implicitly only for DDL statements (CREATE, DROP)?

@zzzeek
Copy link
Mannequin

zzzeek mannequin commented Jan 5, 2021

Under your proposal, the first line would be changed to say "autocommit=True", even though not all the code below is in autocommit mode (according to the SQLite engine's definition). What's more, I could insert this line of code between statements 3 and 6:

> print("Autocommit mode?", conn.autocommit)

As Marc-Andre indicated, this is in fact how "autocommit" behaves on other drivers, including:

psycopg2 - uses either connection.autocommit, or extensions.ISOLATION_LEVEL_AUTOCOMMIT
pymysql - uses conn.autocommit(True|False)
mysqldb - uses conn.autocommit(True|False)
cx_Oracle - uses conn.autocommit
pyodbc - uses conn.autocommit

With all of the above drivers, one can emit "BEGIN" and "COMMIT" using
connection.execute(), and within the scope of that BEGIN/COMMIT pair, the
database "engine" itself is in a transaction. The "driver" however remains in
"autocommit" mode. This mode specifically means the driver is not getting
involved in starting and stopping transactions.

As Marc mentions, we're not supposed to be emitting "BEGIN" and "COMMIT" on
the driver, but none of them get in the way of us doing so, and additionally
most databases support additional options for the "BEGIN/START TRANSACTION" phase
that aren't available in the DBAPIs so sometimes we don't have much choice at least for the "BEGIN" command.

Here's an example using psycopg2, where the timestamp now() will freeze
when we're in a transaction started using manual "BEGIN"/ "COMMIT", while .autocommit stays True, and otherwise match statement execution time if we're not:

    >>> import psycopg2
    >>> conn = psycopg2.connect(user="scott", password="tiger", host="localhost", database="test")
    >>> conn.autocommit = True
    >>> cursor = conn.cursor()
    >>> cursor.execute("SELECT 1")
    >>> cursor.execute("select now() = statement_timestamp()")
    >>> cursor.fetchall()
    [(True,)]
    >>> cursor.execute("BEGIN")
    >>> cursor.execute("select now() = statement_timestamp();")
    >>> cursor.fetchall()
    [(False,)]  # we're in a transaction
    >>> conn.autocommit   # still in driver-level autocommit
    True
    >>> cursor.execute("COMMIT")
    >>> cursor.execute("select now() = statement_timestamp();")
    >>> cursor.fetchall()
    [(True,)]

For SQLAlchemy we already support pysqlite's "isolation_level=None" to implement "autocommit" so this issue does not affect us much, but the meaning of the term "autocommit" at the driver level shouldn't be controversial at this point as there's a lot of precedent. "connection.autocommit" does not refer to the current transactional state of the database, just the current preference set upon the driver itself.

@malemburg
Copy link
Member

On 05.01.2021 19:04, Géry wrote:

@lemburg

> I guess the SQLite driver does not start a new transaction for SELECTs, since these are usually read-only

Nor for DDL statements (CREATE, DROP).

Those are definitely changing the database and AFAIK SQLite
does support DDLs in transactions (including rolling them back
if needed).

Looking at the _sqlite code, the module does indeed only start
a transaction for INSERT, UPDATE, DELETE and REPLACE, with
"starting a transaction" meaning that it inserts a "BEGIN"
(or one of the txn isolation alternatives) before the statement:

https://github.com/python/cpython/blob/3.9/Modules/_sqlite/cursor.c#L489

This is also documented:

https://docs.python.org/3/library/sqlite3.html#controlling-transactions

I wonder why the module does not implement this properly, but I also
believe it's too late to change.

I guess what could be done is to add a connection.autocommit,
defaulting to None, meaning "use the pre-3.10 behavior".

If this is set to False, the module could then implement the
correct way of handling transactions, which means:

a) start a new transaction when the connection is opened
b) start a new transaction after .commit() and .rollback()
c) don't start new transactions anywhere else
d) run an implicit .rollback() when the connection closes

The code could even check for "BEGIN", "ROLLBACK" and "COMMIT"
text in the .execute() and issues a warning when connection.autocommit
is set to True or False.

When set to True, the module would set the SQLite autocommit
flag and also issues warnings for the txn statements. .rollback()
would issue an exception and .commit() pass silently.

> For the same reason, removing the SELECT "optimization" may cause
> a backwards incompatible change, which can be tricky to identify
> and cause corruption of data (in this case, data not written to
> the database, where it previously was written).

Since DQL statements (SELECT) are read-only, maybe we could keep the optimization and start transactions implicitly only for DDL statements (CREATE, DROP)?

See https://sqlite.org/c3ref/stmt_readonly.html.

SELECT are usually read-only, but not always. Since SQLite does
support UDFs (user defined functions), it is possible that a call
to such a function does change the database.

@jamesoldfield
Copy link
Mannequin

jamesoldfield mannequin commented Jan 6, 2021

There's some confusion here over what autocommit=True would do. I believe the last three comments give three different interpretations! Géry said conn.autocommit would change to False when I start a transaction with execute("BEGIN"), Mike said it wouldn't (because it represents the driver's state, not the engine's, by analogy with other DB API drivers), and Marc-Andre says execute("BEGIN") wouldn't be allowed in the first place (or at least it would issue a warning).

To reiterate, the ability to control transactions manually is already supported in the sqlite3 driver, in the form of isolation_mode=None. My first request is simply that **this ability continues to exist**. This functionality was implemented deliberately - the original author of pysqlite recommended this usage, and care has been taken over the years not to break it. Please do not point out that this is not DB API compliant; I know that, and I just don't care! So long as DB API compliant usage is _also_ supported, even the default, that doesn't prevent this other mode from existing too. Many others are using the mode, even if they are not commenters here, so I don't believe it is feasible to break or remove this functionality, even if you're not a fan.

My second request was: feel free to rename this option from "isolation_mode=None" to something else if you wish, but please don't call it "autocommit=True" because that's just too confusing. I feel like the confusion in the comments above justifies this point of view.

As I see it, that leaves two options:

Option 1: Suck it up and use autocommit=True as the option name. It's confusing, but there's so much precedent that it has to be so. This is Mike Bayer's suggestion (except he didn't say it was confusing, that's just my commentary). I think that this option is only feasible if conn.autocommit only refer's the driver's state, not the underlying engine's state, confusing though that is i.e. once set to true it would *always* be true, even if a transaction is started.

Option 2: Reserve autocommit=True for the underlying SQLite engine autocommit mode. That means detecting when there's an attempt to use execute("BEGIN") or similar, and then issuing a warning or error. It also means supplying some other, third, option for what I'm asking (like today's isolation_mode=None).

Although option 2 is closer to what I originally requested, I do worry it means that the non-DBAPI mode will appear unsupported and fall into neglect. If the API for accessing it is to set autocommit=None, to mean legacy behaviour, and then also isolation_mode=None to mean the type of legacy behaviour, then it doesn't look like the most recommended API ever. And yet, for those that don't care about DB API (which I imagine is most users of the sqlite3 driver), this is probably the best API to use.

So I reluctantly agree that option 1, using autocommit=True, is actually best overall. I would ask that there is at least a note in the documentation so that it's clear this is allowed to work. Something like this:

If autocommit=True then the sqlite3 module will never automatically start transactions. The underlying SQLite database engine operates in autocommit mode whenever no transactions are active, so the net effect of this is to use SQLite's autocommit mode [1].

Note that, when autocommit=True, the sqlite3 module will not intercept and stop a statement to explicitly start a transaction, such as with execute("BEGIN"). In that case, a transaction is started and the underlying SQLite engine is no longer in autocommit mode. (The sqlite3 Connection object will still report autocommit=True; this does not indicate that the SQLite engine is autocommit mode, just that the sqlite3 module is not going to implicitly start any transactions.)

The connection commit() and rollback() methods may be used for transactions started explictly when autocommit=True, and the connection may be used as a context manager, just as it can be when autocommit=False. If no transaction is currently active then those methods silent pass with no effect.

[1] https://sqlite.org/lang_transaction.html#implicit_versus_explicit_transactions

Side note: When I started down this rabbit hole several weeks ago, I repeatedly came across the extremely confusing phrase "SQLite operates in autocommit mode by default". It took me a while to realise that autocommit is not a flag that it is possible to turn off on a connection *when you open it*. The text I used above, "The underlying SQLite database engine operates in autocommit mode whenever no transactions are active" was carefully chosen and I consider it to be much clearer, regardless of whatever else ends up happening.

@zzzeek
Copy link
Mannequin

zzzeek mannequin commented Jan 6, 2021

I think this issue just discusses the naming of an attribute called ".autocommit". for the discussion for SQLite's unusual starting of transactions, that's all in two other issues:

https://bugs.python.org/issue9924

https://bugs.python.org/issue10740

so I would encourage folks to read those discussions. at issue is the limitation of SQLite that it locks the whole file for transactions, which is the main rationale for why SQLite is hesitant to begin a transaction. however, without configurability, this means it's not compatible with SAVEPOINT or serializable isolation levels. when users want to use those two features we have them set isolation_level=None and emit "BEGIN" on the connection directly. the connection.commit() and connection.rollback() methods continue to be functional

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@erlend-aasland erlend-aasland added topic-sqlite3 and removed 3.9 only security fixes labels May 16, 2022
@erlend-aasland
Copy link
Contributor

FTR, I've changed the formatting of #83638 (comment) so it does not eat up multiple screenfuls using heading formatting for every line.

@erlend-aasland erlend-aasland moved this to Decision needed in sqlite3 issues May 21, 2022
@erlend-aasland erlend-aasland moved this to Backwards incompatible in sqlite3 issues May 21, 2022
@erlend-aasland
Copy link
Contributor

@zzzeek, would the solution outlined by MAL in #83638 (comment) solve SQLAlchemy's problems? If there is any confusion regarding an attribute named autocommit, it can be resolved in the docs.

We could add the new attribute and introduce deprecation warnings for isolation_level in 3.12. I don't think we should remove isolation_level anytime soon; a possibility is to keep it forever as deprecated.

This issue should be focused only on adding the new property, and not on the deprecation of isolation_level. Resolving it should be possible using a single PR.

@zzzeek
Copy link

zzzeek commented Jun 10, 2022

@erlend-aasland that comment looks like it wants to solve multiple things at once, which is, adding an autocommit attribute, but also changing the BEGIN behavior that I noted in #54133 when it states "start a new transaction when the connection is opened" (Which I assume applies to the case that "autocommit" would be passed to the sqlite3 connect() method).

I haven't read the whole set of comments above but if you added an autocommit parameter that's fine, but it should be readable also, which means if I have a regular sqlite3.connect() connection, autocommit should be False, not None. I think syncing autocommit and isolation_level parameters together, for as long as isolation_level is present, would be least confusing, that is, if I set isolation_level to None, that implies connection.autocommit now returns True.

I guess you also need to make another new parameter to allow control of "DEFERRED", "IMMEDIATE", "EXCLUSIVE". SQLAlchemy doesn't have direct API for these parameters. our API right now does what you see below, so it would not be difficult to change it for an .autocommit parameter. A bigger issue would be any default behavioral changes, like, files get locked sooner, things like that, we'd get a lot of user complaints blaming us for that sort of thing if it just changed.

# sqlalchemy's sqlite3 isolation level pseudocode
_isolation_lookup = {"READ UNCOMMITTED": 1, "SERIALIZABLE": 0}

def set_isolation_level(self, dbapi_connection, level):
    """set the isolation level for a sqlite3 connection"""

    if level == "AUTOCOMMIT":
        dbapi_connection.isolation_level = None
    else:
        # i believe this is eqvuialent to DEFERRED
        dbapi_connection.isolation_level = ""

        # convert from "READ UNCOMMITTED" or "SERIALIZABLE" to an int
        # read_uncommitted level
        int_level = _isolation_lookup[level]
        cursor = dbapi_connection.cursor()
        cursor.execute("PRAGMA read_uncommitted = %d" % int_level)
        cursor.close()

@erlend-aasland
Copy link
Contributor

that comment looks like it wants to solve multiple things at once, which is, adding an autocommit attribute, but also changing the BEGIN behavior [...]

Yes, that is also how I interpret MAL's comment. As I understand it, the only way to make the sqlite3 extension module behave as expected (PEP 249 compliant), is to control behaviour via a new attribute. So, yes, it would solve multiple things at once.

The alternative is to change the sqlite3 extension module's behaviour wrt. isolation_level, which would be backwards incompatible in any kind of way.

[...] if you added an autocommit parameter that's fine, but it should be readable also [...]

+1

[...] which means if I have a regular sqlite3.connect() connection, autocommit should be False, not None. I think syncing autocommit and isolation_level parameters together, for as long as isolation_level is present, would be least confusing, that is, if I set isolation_level to None, that implies connection.autocommit now returns True.

How could we introduce new behaviour if they are to be sync'ed? autocommit would just be another way to control isolation_level, so we end up keeping the current behaviour, which is buggy.

I think MAL's suggestion to how a hypothetical autocommit attribute could work, sounds good to me. Setting it to None to fall back to pre 3.12 behaviour also sounds good to me.

I'm not sure how the interplay between autocommit and isolation_level should be, yet.

I guess you also need to make another new parameter to allow control of "DEFERRED", "IMMEDIATE", "EXCLUSIVE".

Yeah, we could add such an attribute. For example .transaction_control?

A bigger issue would be any default behavioral changes, like, files get locked sooner, things like that, we'd get a lot of user complaints blaming us for that sort of thing if it just changed.

Quoting the SQLite locking docs:

Note that the BEGIN command does not acquire any locks on the database. After a BEGIN command, a SHARED lock will be acquired when the first SELECT statement is executed. A RESERVED lock will be acquired when the first INSERT, UPDATE, or DELETE statement is executed. No EXCLUSIVE lock is acquired until either the memory cache fills up and must be spilled to disk or until the transaction commits. In this way, the system delays blocking read access to the file file until the last possible moment.

Also:

The SQL command "COMMIT" does not actually commit the changes to disk. It just turns autocommit back on.

But yeah, we need to keep such behavioural changes in mind.

@zzzeek
Copy link

zzzeek commented Jun 10, 2022

that comment looks like it wants to solve multiple things at once, which is, adding an autocommit attribute, but also changing the BEGIN behavior [...]

Yes, that is also how I interpret MAL's comment. As I understand it, the only way to make the sqlite3 extension module behave as expected (PEP 249 compliant), is to control behaviour via a new attribute. So, yes, it would solve multiple things at once.

The alternative is to change the sqlite3 extension module's behaviour wrt. isolation_level, which would be backwards incompatible in any kind of way.

[...] if you added an autocommit parameter that's fine, but it should be readable also [...]

+1

[...] which means if I have a regular sqlite3.connect() connection, autocommit should be False, not None. I think syncing autocommit and isolation_level parameters together, for as long as isolation_level is present, would be least confusing, that is, if I set isolation_level to None, that implies connection.autocommit now returns True.

How could we introduce new behaviour if they are to be sync'ed? autocommit would just be another way to control isolation_level, so we end up keeping the current behaviour, which is buggy.

I think MAL's suggestion to how a hypothetical autocommit attribute could work, sounds good to me. Setting it to None to fall back to pre 3.12 behaviour also sounds good to me.

I wouldn't link the "autocommit" flag to be hardcoded to the "new" behavior, basically, if we are talking about when sqlite3 emits BEGIN, whether it's at the first SQL statement or it's at the first DML (or non DDL, im not sure how that works), that should be a separate argument which selects among different "autobegin" styles. that way you can add "autocommit" and have it work consistently for everyone right up front without it ever changing, and you have the "sqlite3 autobegin behavior" thing separate, where you can eventually change its default.

I'm not sure how the interplay between autocommit and isolation_level should be, yet.

I guess you also need to make another new parameter to allow control of "DEFERRED", "IMMEDIATE", "EXCLUSIVE".

Yeah, we could add such an attribute. For example .transaction_control?

More than that is needed. I think you should make a matrix of all transactional behaviors and work out combinations of parameters for all of them, combining the concepts of "autocommit", "autobegin (when BEGIN is emitted)", "transaction control" (what keyword is included after BEGIN), and I would include "read_committed" as well. then work out an API that cleanly handles all combinations. I dont think SQLite's "BEGIN only on DML/non DDL" feature should be removed because certainly people rely upon it and if you change the default someday, that's fine but people will likely ask for the old behavior.

A bigger issue would be any default behavioral changes, like, files get locked sooner, things like that, we'd get a lot of user complaints blaming us for that sort of thing if it just changed.

Quoting the SQLite locking docs:

Note that the BEGIN command does not acquire any locks on the database. After a BEGIN command, a SHARED lock will be acquired when the first SELECT statement is executed. A RESERVED lock will be acquired when the first INSERT, UPDATE, or DELETE statement is executed. No EXCLUSIVE lock is acquired until either the memory cache fills up and must be spilled to disk or until the transaction commits. In this way, the system delays blocking read access to the file file until the last possible moment.

beyond locking issues (which I thought was the original rationale for this behavior?) note also that DDL with sqlite3 is not transactional right now. so if you change that behavior, all the scripts that connect() and then execute() DDL statements without calling commit(), assuming the DDL is now persistent, will break.

@erlend-aasland
Copy link
Contributor

erlend-aasland commented Jun 13, 2022

I created a draft PR for this:

As of now, it ignores the context manager1. Feel free to play with it.

Footnotes

  1. which has its own problems: The sqlite3 context manager does not work with isolation_level=None #61162

@erlend-aasland
Copy link
Contributor

agree the ship has sailed on isolation_mode and also these constants really dont resemble at all the traditional notion of "transaction isolation"; SQLite's own documentation does not use this term for them (it doesn't seem to have a term, which is unfortunate).

What about transaction_behaviour? It aligns pretty well with the SQLite docs:

Transactions can be DEFERRED, IMMEDIATE, or EXCLUSIVE. The default transaction behavior is DEFERRED.

If it helps typing, we could add constants for each of those.

cx.transaction_behaviour = sqlite3.DEFERRED

@erlend-aasland
Copy link
Contributor

erlend-aasland commented Jun 14, 2022

I suggest we create a topic on Discourse to try to raise more awareness on this change feature. EDIT: Posted here.

Is autocommit=False equals strict PEP 249 compliancy is a hard showstopper for you, @zzzeek?

@zzzeek
Copy link

zzzeek commented Jun 14, 2022

agree the ship has sailed on isolation_mode and also these constants really dont resemble at all the traditional notion of "transaction isolation"; SQLite's own documentation does not use this term for them (it doesn't seem to have a term, which is unfortunate).

What about transaction_behaviour? It aligns pretty well with the SQLite docs:

Transactions can be DEFERRED, IMMEDIATE, or EXCLUSIVE. The default transaction behavior is DEFERRED.

If it helps typing, we could add constants for each of those.

cx.transaction_behaviour = sqlite3.DEFERRED

well you'd have to choose between british and US spelling there, is the only wart I can see ... :)

@zzzeek
Copy link

zzzeek commented Jun 14, 2022

I suggest we create a topic on Discourse to try to raise more awareness on this change feature. EDIT: Posted here.

Is autocommit=False equals strict PEP 249 compliancy is a hard showstopper for you, @zzzeek?

nah. I can work with whatever, I was just suggesting. A sudden change in default behavior, and even with the old behavior no longer available, might be inconvenient on this end, but even then, we are pretty much locking out any patterns where "there's no transaction" unless DBAPI level autocommit is requested, so we could adapt and maybe even without difficulty.

@erlend-aasland
Copy link
Contributor

All right, thanks! I'm following MAL's suggestion for now, then.

I'm not sure how we should handle these things yet:

  • the context manager: do as now, and just commit (or rollback) if there is an open transaction?
  • autocommit=True, followed by execute("BEGIN"), followed by autocommit=False: commit and begin? keep the transaction open? Leaning towards the former.

I don't think we should deny manual transaction control in autocommit=True mode. I'm afraid it will be difficult to enforce, and I don't want to add more code that tries to guess what the user does.

@erlend-aasland erlend-aasland moved this from Backwards compatibility issues to In Progress in sqlite3 issues Jun 14, 2022
@erlend-aasland
Copy link
Contributor

I added docs to my WIP branch, and created a draft PR against the CPython repo. Please try it out, and chime in with comments regarding stuff that I forgot :)

@erlend-aasland
Copy link
Contributor

Although I've marked isolation_level as deprecated in the docs of gh-93823, I think we should create a new PR for adding the deprecation warnings to isolation_level. Possibly also as a separate issue.

@erlend-aasland
Copy link
Contributor

erlend-aasland commented Jun 15, 2022

  • the context manager: do as now, and just commit (or rollback) if there is an open transaction?

For now, the autocommit attribute does not affect how the context manager behaves. Suggesting to add an implicit BEGIN after the implicit COMMIT or ROLLBACK, if autocommit=False.

Looking again at what I've implemented, I see that the context manager now of course adds an implicit BEGIN after the already implicit COMMIT/ROLLBACK, if autocommit=False 😄

This also implies that if autocommit=True, the (current) context manager is a no-op. I guess that is ok.

  • autocommit=True, followed by execute("BEGIN"), followed by autocommit=False: commit and begin? keep the transaction open? Leaning towards the former.

I've implemented the former in the PR.

@erlend-aasland
Copy link
Contributor

Commenting on an older post in this thread, just to clarify one thing:

Option 2: Reserve autocommit=True for the underlying SQLite engine autocommit mode.

The underlying SQLite autocommit mode, which is not equal to the sqlite3 extension module autocommit mode, can be accessed using the in_transaction attribute of connection objects. This read-only attribute is the inverted result of the SQLite C API sqlite3_get_autocommit(). This implies:

  • sqlite3_get_autocommit() returns 0 => in_transaction == True
  • sqlite3_get_autocommit() returns non-zero => in_transaction == False

@erlend-aasland
Copy link
Contributor

Following up this concern:

I guess you also need to make another new parameter to allow control of "DEFERRED", "IMMEDIATE", "EXCLUSIVE". SQLAlchemy doesn't have direct API for these parameters.

I suggest we keep gh-93823 focused on the autocommit attribute only.

It is still very early in the 3.12 dev phase; we have lots of time to figure out how, and if, we should extend the new API.

@malemburg
Copy link
Member

Wow, what a long discussion :-) I must admit that I did not read all of it. Just some notes:

  • I'll kick of a discussion on the DB-SIG to get the connection.autocommit attribute added as a standard extension of PEP 249.
  • The definition will likely be similar to what I wrote on Discourse, and I'm leaning towards making the change behavior described as "usual" mandatory -- it should be possible for most database drivers to implement this and it causes fewer surprises. I'm mentioning this here, since the semantics of changing the attribute does not seem to have been discussed yet.
  • connection.autocommit == False is the default after opening a connection as per PEP 249 (unless the connection constructor has a parameter which allows starting with an autocommit setup right away).
  • In autocommit mode, calling connection.commit() or .rollback() should be no-ops - there is nothing to commit or rollback on the connection. Drivers emulating autocommit must take care that all operations are committed as soon as possible and prevent a call to .rollback() from dropping operations.
  • I'll probably copy-edit PEP 249 to use "autocommit" throughout the text - it currently uses the "auto-commit" spelling, which seems to be phasing out (even Wikipedia uses "autocommit" as name for the mode).

If you have questions, please let me know. Thanks.

@erlend-aasland erlend-aasland added the triaged The issue has been accepted as valid by a triager. label Jul 25, 2022
@erlend-aasland
Copy link
Contributor

Although I've marked isolation_level as deprecated in the docs of gh-93823, I think we should create a new PR for adding the deprecation warnings to isolation_level. Possibly also as a separate issue.

FTR, I've removed the docs deprecation of isolation_level from gh-93823; a slower path towards a new default is probably better for everyone involved. For 3.12, this means:

Then, emit deprecations in 3.14, and finally in 3.16 change the default behaviour to PEP 249-compliant transaction control.

@erlend-aasland erlend-aasland removed the triaged The issue has been accepted as valid by a triager. label Oct 30, 2022
Repository owner moved this from In Progress to Done in sqlite3 issues Nov 12, 2022
erlend-aasland added a commit that referenced this issue Nov 12, 2022
…aviour (#93823)

Introduce the autocommit attribute to Connection and the autocommit
parameter to connect() for PEP 249-compliant transaction handling.

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Géry Ogam <gery.ogam@gmail.com>
@erlend-aasland
Copy link
Contributor

I've merged gh-83638, hopefully in time for the next 3.12 alpha release. IMO, we can handle the possible deprecation of isolation_level and how to change the autocommit default in a follow-up issue.

Please give it a try!

Thanks to everyone involved here and on previous issues (and on db-sig), for helping to carve out this functionality.

@erlend-aasland
Copy link
Contributor

FTR, PEP-249 was updated by python/peps#2887

@coleifer
Copy link

coleifer commented Apr 25, 2023

@erlend-aasland - can you summarize how this new property plays with the previously-supported method of setting conn.isolation_level = None to disable the python driver's transaction management? In peewee we run things in autocommit mode across sqlite3/mysql/postgres, and currently use isolation_level=None to perform this for Sqlite.

Am I correct that it should be sufficient to check sys.version_info for Python 3.12, and in those cases to set autocommit=True rather than use isolation_level? Or is it safe to simply continue using isolation_level=None?

@erlend-aasland
Copy link
Contributor

erlend-aasland commented Apr 25, 2023

@coleifer, can you take a look at the 3.12 docs, where this is (or at least should be) thoroughly explained; if you feel more explanation is needed, could you please let me know?

TLDR: yes, it is sufficient to set autocommit=True; and BTW, isolation_level still works as before.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stdlib Python modules in the Lib dir topic-sqlite3 type-feature A feature request or enhancement
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

4 participants