Add an autocommit property to sqlite3.Connection with a PEP 249 compliant manual commit mode and migrate #83638
Comments
In non-autocommit mode (manual commit mode), the sqlite3 database driver implicitly issues a BEGIN statement before each DML statement (INSERT, UPDATE, DELETE, REPLACE) not already in a database transaction, BUT NOT before DDL statements (CREATE, DROP) nor before DQL statements (SELECT) (cf. https://github.com/python/cpython/blob/master/Modules/_sqlite/cursor.c#L480):
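As an illustration of the behaviour described above, here is a minimal sketch that records `Connection.in_transaction` after each kind of statement. It assumes Python 3.6 or later and the legacy default `isolation_level=""`; the table name `test` and the helper name are only illustrative.

```python
import sqlite3


def legacy_transaction_states():
    """Record whether a transaction is open after DDL, DQL and DML statements."""
    # isolation_level="" is the legacy default; it is passed explicitly for clarity.
    conn = sqlite3.connect(":memory:", isolation_level="")
    states = {}

    conn.execute("CREATE TABLE test (i INTEGER)")      # DDL: no implicit BEGIN
    states["after_ddl"] = conn.in_transaction

    conn.execute("SELECT * FROM test").fetchall()      # DQL: no implicit BEGIN
    states["after_dql"] = conn.in_transaction

    conn.execute("INSERT INTO test (i) VALUES (1)")    # DML: implicit BEGIN
    states["after_dml"] = conn.in_transaction

    conn.rollback()                                    # ends the implicit transaction
    states["after_rollback"] = conn.in_transaction

    conn.close()
    return states


print(legacy_transaction_states())
# {'after_ddl': False, 'after_dql': False, 'after_dml': True, 'after_rollback': False}
```

Only the INSERT leaves a transaction open, so the CREATE TABLE that preceded it is not covered by the rollback.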
Like Mike Bayer explained in issue bpo-9924, this is not what other database drivers do, and this is not PEP-249 compliant (Python Database API Specification v2.0), as its author Marc-André Lemburg explained (cf. https://mail.python.org/pipermail/db-sig/2010-September/005645.html):
Aymeric Augustin said in issue bpo-10740:
So I suggest that we introduce a new autocommit property and use it to enable a truly PEP-249 compliant manual commit mode (that is to say with transactions starting implicitly after connect(), commit() and rollback() calls, allowing transactional DDL and DQL):
I also suggest that we use this new PEP-249 manual commit mode (with transactional DDL and DQL) by default and drop the old manual commit mode (without transactional DDL and DQL). We could use the following migration strategy:
|
Correction:
|
Correction:
|
If this ever gets implemented, "autocommit" would be a terrible name for it. That word has a very specific meaning in SQLite, which is essentially the same as "not in a transaction started with BEGIN ...". At the moment, if you want to explicitly control when transactions start (a good idea considering how confusing the current behaviour is) then you would set isolation_level to None and manually start a transaction with |
Yes if you are talking about SQLite, the database ENGINE: the SQL statements inside BEGIN and COMMIT are said to be in manual commit mode, while SQL statements outside are said to be in autocommit mode. So the autocommit mode is the default mode for database ENGINES. But here I am talking about SQLite3, the Python database DRIVER. You do not issue BEGIN statements with database DRIVERS, they are issued implicitly, so that the manual mode is the default mode for database DRIVERS. Cf. this Stack Overflow answer for more details: https://stackoverflow.com/a/48391535/2326961
No, you do not want that at the database DRIVER level. Because like Mike Bayer explained in issue bpo-9924, this is not what other database DRIVERS do, and this is not PEP-249 compliant (Python Database API Specification v2.0), as its author Marc-André Lemburg explained (cf. https://mail.python.org/pipermail/db-sig/2010-September/005645.html):
|
I sure was! In this comment I will stick to saying either "SQLite engine" or "sqlite3 driver" as appropriate, hopefully that will be clearer.
Yep, I was aware of that. I was trying to say, please don't use the word "autocommit" in the sqlite3 driver when that word has a related but different meaning in the SQLite engine.
This sentence isn't literally true for several reasons (you say "you do not" but I certainly do, your use of "with database drivers" is dubious, and you seem to have causality in the wrong direction). I think there might be a bit of a language barrier here, so I hope you don't mind if I leave this to one side.
I am fully, and painfully, aware of when the sqlite3 driver code will automatically issue BEGIN statements to the engine. I have no need to read StackOverflow answers about it, I have read the C source code to sqlite3 (and pysqlite) directly. I spent more time than I care to admit recently doing that! In fact that happened as a result of reading several confusing StackOverflow answers about transactions (maybe I'll write my own and add to the confusion...)
What that answer doesn't mention is that, even with isolation_level=None, it's perfectly possible to start a transaction, which takes the SQLite engine out of autocommit mode. This is fully and intentionally supported by the sqlite3 driver, and the original author has said so and even recommended it. For example, let's look at this code:
import sqlite3
conn = sqlite3.connect(path, isolation_level=None)
conn.execute("INSERT INTO test (i) VALUES (?)", (1,)) # stmt 1
foo = conn.execute("SELECT * FROM test").fetchall() # stmt 2
conn.execute("BEGIN") # stmt 3
conn.execute("INSERT INTO test (i) VALUES (?)", (4,)) # stmt 4
bar = conn.execute("SELECT * FROM test").fetchall() # stmt 5
conn.execute("COMMIT") # stmt 6
Statement 1 and statement 2 execute using the SQLite engine's autocommit mode. Statements 3 through to 5 execute in a single transaction and do *not* use the SQLite engine's autocommit mode. (Technically statement 6 actually does use autocommit because COMMIT uses the autocommit mechanism under the hood ... but let's forget about that!)
Under your proposal, the first line would be changed to say "autocommit=True", even though not all the code below is in autocommit mode (according to the SQLite engine's definition). What's more, I could insert this line of code between statements 3 and 6:
print("Autocommit mode?", conn.autocommit)
And it would print True even though autocommit mode is off!
Now, maybe your reaction is that "autocommit mode *in the driver*" can have a different meaning from "autocommit mode *in the engine*". Yes, it can, but that doesn't mean it should! Please, just pick a different name! For example, say "manual mode" (instead of autocommit=True) or "auto-start-transaction mode" (instead of autocommit=False).
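The engine-level state that the snippet above talks about can be observed from Python. The following is a minimal sketch, assuming an in-memory database and an illustrative `test` table; it uses `Connection.in_transaction`, which is True exactly when the SQLite engine is not in its own autocommit mode.

```python
import sqlite3


def engine_states():
    """Show the engine leaving and re-entering autocommit mode under isolation_level=None."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE test (i INTEGER)")
    states = []

    conn.execute("INSERT INTO test (i) VALUES (?)", (1,))
    states.append(conn.in_transaction)   # False: committed immediately by the engine

    conn.execute("BEGIN")
    states.append(conn.in_transaction)   # True: explicit transaction is open

    conn.execute("INSERT INTO test (i) VALUES (?)", (4,))
    conn.execute("COMMIT")
    states.append(conn.in_transaction)   # False: back in engine autocommit mode

    rows = conn.execute("SELECT i FROM test ORDER BY i").fetchall()
    conn.close()
    return states, rows


print(engine_states())
# ([False, True, False], [(1,), (4,)])
```

The driver-level setting (`isolation_level=None`) never changes during this run, while the engine-level state does, which is the distinction the naming argument is about.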
The "that" you are referring to here was when I said that I prefer to set isolation_level = None, like the above code snippet. Do not tell me that it is not what I want; it certainly IS what I want! I do not want the sqlite3 driver getting in the way between me and the SQLite engine. Many future users of the sqlite3 driver are likely to feel the same way, and the API should allow that to happen clearly. |
I think there's a bit of a misunderstanding here. When relying on The DB-API wants drivers to have connections default to transactional It also suggests that a method may be used to set the transactional Now, historically, many drivers have not always used methods for Aside: This is a bit unfortunate, since users would not expect I guess the SQLite driver does not start a new transaction for For the same reason, removing the SELECT "optimization" may cause |
@james.oldfield
Exactly, so since autocommit=True is equivalent to isolation_level=None, I do not see why the name ‘autocommit’ would be a problem. As you said, when you issue BEGIN, you leave autocommit mode.
What is the difference with isolation_level=None, which also means autocommit mode?
No, because the autocommit property would be automatically updated to False at conn.execute("BEGIN"), which is the standard behaviour as @lemburg explained.
Nor for DDL statements (CREATE, DROP).
Since DQL statements (SELECT) are read-only, maybe we could keep the optimization and start transactions implicitly only for DDL statements (CREATE, DROP)? |
> print("Autocommit mode?", conn.autocommit) As Marc-Andre indicated, this is in fact how "autocommit" behaves on other drivers, including:
With all of the above drivers, one can emit "BEGIN" and "COMMIT" using As Marc mentions, we're not supposed to be emitting "BEGIN" and "COMMIT" on Here's an example using psycopg2, where the timestamp now() will freeze
>>> import psycopg2
>>> conn = psycopg2.connect(user="scott", password="tiger", host="localhost", database="test")
>>> conn.autocommit = True
>>> cursor = conn.cursor()
>>> cursor.execute("SELECT 1")
>>> cursor.execute("select now() = statement_timestamp()")
>>> cursor.fetchall()
[(True,)]
>>> cursor.execute("BEGIN")
>>> cursor.execute("select now() = statement_timestamp();")
>>> cursor.fetchall()
[(False,)] # we're in a transaction
>>> conn.autocommit # still in driver-level autocommit
True
>>> cursor.execute("COMMIT")
>>> cursor.execute("select now() = statement_timestamp();")
>>> cursor.fetchall()
[(True,)]
For SQLAlchemy we already support pysqlite's "isolation_level=None" to implement "autocommit" so this issue does not affect us much, but the meaning of the term "autocommit" at the driver level shouldn't be controversial at this point as there's a lot of precedent. "connection.autocommit" does not refer to the current transactional state of the database, just the current preference set upon the driver itself. |
On 05.01.2021 19:04, Géry wrote:
Those are definitely changing the database and AFAIK SQLite Looking at the _sqlite code, the module does indeed only start https://github.com/python/cpython/blob/3.9/Modules/_sqlite/cursor.c#L489 This is also documented: https://docs.python.org/3/library/sqlite3.html#controlling-transactions I wonder why the module does not implement this properly, but I also I guess what could be done is to add a connection.autocommit, If this is set to False, the module could then implement the a) start a new transaction when the connection is opened The code could even check for "BEGIN", "ROLLBACK" and "COMMIT" When set to True, the module would set the SQLite autocommit
See https://sqlite.org/c3ref/stmt_readonly.html. SELECT are usually read-only, but not always. Since SQLite does |
There's some confusion here over what autocommit=True would do. I believe the last three comments give three different interpretations! Géry said conn.autocommit would change to False when I start a transaction with execute("BEGIN"), Mike said it wouldn't (because it represents the driver's state, not the engine's, by analogy with other DB API drivers), and Marc-Andre says execute("BEGIN") wouldn't be allowed in the first place (or at least it would issue a warning).
To reiterate, the ability to control transactions manually is already supported in the sqlite3 driver, in the form of isolation_level=None. My first request is simply that **this ability continues to exist**. This functionality was implemented deliberately - the original author of pysqlite recommended this usage, and care has been taken over the years not to break it. Please do not point out that this is not DB API compliant; I know that, and I just don't care! So long as DB API compliant usage is _also_ supported, even the default, that doesn't prevent this other mode from existing too. Many others are using the mode, even if they are not commenters here, so I don't believe it is feasible to break or remove this functionality, even if you're not a fan.
My second request was: feel free to rename this option from "isolation_level=None" to something else if you wish, but please don't call it "autocommit=True" because that's just too confusing. I feel like the confusion in the comments above justifies this point of view.
As I see it, that leaves two options:
Option 1: Suck it up and use autocommit=True as the option name. It's confusing, but there's so much precedent that it has to be so. This is Mike Bayer's suggestion (except he didn't say it was confusing, that's just my commentary). I think that this option is only feasible if conn.autocommit only refers to the driver's state, not the underlying engine's state, confusing though that is, i.e. once set to true it would *always* be true, even if a transaction is started.
Option 2: Reserve autocommit=True for the underlying SQLite engine autocommit mode. That means detecting when there's an attempt to use execute("BEGIN") or similar, and then issuing a warning or error. It also means supplying some other, third, option for what I'm asking (like today's isolation_level=None).
Although option 2 is closer to what I originally requested, I do worry it means that the non-DBAPI mode will appear unsupported and fall into neglect. If the API for accessing it is to set autocommit=None, to mean legacy behaviour, and then also isolation_level=None to mean the type of legacy behaviour, then it doesn't look like the most recommended API ever. And yet, for those that don't care about DB API (which I imagine is most users of the sqlite3 driver), this is probably the best API to use.
So I reluctantly agree that option 1, using autocommit=True, is actually best overall. I would ask that there is at least a note in the documentation so that it's clear this is allowed to work. Something like this:
[1] https://sqlite.org/lang_transaction.html#implicit_versus_explicit_transactions
Side note: When I started down this rabbit hole several weeks ago, I repeatedly came across the extremely confusing phrase "SQLite operates in autocommit mode by default". It took me a while to realise that autocommit is not a flag that it is possible to turn off on a connection *when you open it*. The text I used above, "The underlying SQLite database engine operates in autocommit mode whenever no transactions are active" was carefully chosen and I consider it to be much clearer, regardless of whatever else ends up happening. |
I think this issue just discusses the naming of an attribute called ".autocommit". for the discussion for SQLite's unusual starting of transactions, that's all in two other issues: https://bugs.python.org/issue9924 https://bugs.python.org/issue10740 so I would encourage folks to read those discussions. at issue is the limitation of SQLite that it locks the whole file for transactions, which is the main rationale for why SQLite is hesitant to begin a transaction. however, without configurability, this means it's not compatible with SAVEPOINT or serializable isolation levels. when users want to use those two features we have them set isolation_level=None and emit "BEGIN" on the connection directly. the connection.commit() and connection.rollback() methods continue to be functional |
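A minimal sketch of the pattern described above (set isolation_level=None, emit "BEGIN" directly, use SAVEPOINT, then finish with connection.commit()). It assumes an in-memory database; the table name `test` and savepoint name `sp1` are illustrative, and this is not SQLAlchemy's actual code.

```python
import sqlite3


def savepoint_demo():
    """Use SAVEPOINT inside a manually started transaction."""
    # isolation_level=None stops the driver from issuing its own BEGIN statements.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE test (i INTEGER)")

    conn.execute("BEGIN")                              # explicit transaction start
    conn.execute("INSERT INTO test (i) VALUES (1)")
    conn.execute("SAVEPOINT sp1")
    conn.execute("INSERT INTO test (i) VALUES (2)")
    conn.execute("ROLLBACK TO SAVEPOINT sp1")          # undoes only the second insert
    conn.commit()                                      # commit() still works in this mode

    rows = conn.execute("SELECT i FROM test").fetchall()
    conn.close()
    return rows


print(savepoint_demo())
# [(1,)]
```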
FTR, I've changed the formatting of #83638 (comment) so it does not eat up multiple screenfuls using heading formatting for every line. |
@zzzeek, would the solution outlined by MAL in #83638 (comment) solve SQLAlchemy's problems? If there is any confusion regarding an attribute named We could add the new attribute and introduce deprecation warnings for This issue should be focused only on adding the new property, and not on the deprecation of isolation_level. Resolving it should be possible using a single PR. |
@erlend-aasland that comment looks like it wants to solve multiple things at once, which is, adding an autocommit attribute, but also changing the BEGIN behavior that I noted in #54133 when it states "start a new transaction when the connection is opened" (Which I assume applies to the case that "autocommit" would be passed to the sqlite3 connect() method). I haven't read the whole set of comments above but if you added an autocommit parameter that's fine, but it should be readable also, which means if I have a regular I guess you also need to make another new parameter to allow control of "DEFERRED", "IMMEDIATE", "EXCLUSIVE". SQLAlchemy doesn't have direct API for these parameters. our API right now does what you see below, so it would not be difficult to change it for an .autocommit parameter. A bigger issue would be any default behavioral changes, like, files get locked sooner, things like that, we'd get a lot of user complaints blaming us for that sort of thing if it just changed.
# sqlalchemy's sqlite3 isolation level pseudocode
_isolation_lookup = {"READ UNCOMMITTED": 1, "SERIALIZABLE": 0}
def set_isolation_level(self, dbapi_connection, level):
    """set the isolation level for a sqlite3 connection"""
    if level == "AUTOCOMMIT":
        dbapi_connection.isolation_level = None
    else:
        # i believe this is equivalent to DEFERRED
        dbapi_connection.isolation_level = ""
        # convert from "READ UNCOMMITTED" or "SERIALIZABLE" to an int
        # read_uncommitted level
        int_level = _isolation_lookup[level]
        cursor = dbapi_connection.cursor()
        cursor.execute("PRAGMA read_uncommitted = %d" % int_level)
        cursor.close()
|
Yes, that is also how I interpret MAL's comment. As I understand it, the only way to make the sqlite3 extension module behave as expected (PEP 249 compliant), is to control behaviour via a new attribute. So, yes, it would solve multiple things at once. The alternative is to change the sqlite3 extension module's behaviour wrt.
+1
How could we introduce new behaviour if they are to be sync'ed? I think MAL's suggestion to how a hypothetical I'm not sure how the interplay between
Yeah, we could add such an attribute. For example
Quoting the SQLite locking docs:
Also:
But yeah, we need to keep such behavioural changes in mind. |
I wouldn't link the "autocommit" flag to be hardcoded to the "new" behavior, basically, if we are talking about when sqlite3 emits BEGIN, whether it's at the first SQL statement or it's at the first DML (or non DDL, im not sure how that works), that should be a separate argument which selects among different "autobegin" styles. that way you can add "autocommit" and have it work consistently for everyone right up front without it ever changing, and you have the "sqlite3 autobegin behavior" thing separate, where you can eventually change its default.
More than that is needed. I think you should make a matrix of all transactional behaviors and work out combinations of parameters for all of them, combining the concepts of "autocommit", "autobegin (when BEGIN is emitted)", "transaction control" (what keyword is included after BEGIN), and I would include "read_committed" as well. then work out an API that cleanly handles all combinations. I dont think SQLite's "BEGIN only on DML/non DDL" feature should be removed because certainly people rely upon it and if you change the default someday, that's fine but people will likely ask for the old behavior.
beyond locking issues (which I thought was the original rationale for this behavior?) note also that DDL with sqlite3 is not transactional right now. so if you change that behavior, all the scripts that connect() and then execute() DDL statements without calling commit(), assuming the DDL is now persistent, will break. |
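To make the compatibility concern above concrete, here is a minimal sketch that checks whether a CREATE TABLE survives when the connection is closed without commit(). It assumes a temporary database file; the helper name and table name are illustrative. The `autocommit=False` call assumes the parameter that was eventually added in Python 3.12, so it is guarded by a version check.

```python
import os
import sqlite3
import sys
import tempfile


def ddl_persists_without_commit(**connect_kwargs):
    """Return True if a table created without commit() exists after reconnecting."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        conn = sqlite3.connect(path, **connect_kwargs)
        conn.execute("CREATE TABLE test (i INTEGER)")
        conn.close()                                   # deliberately no commit()

        check = sqlite3.connect(path)
        names = check.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
        check.close()
        return names == [("test",)]
    finally:
        os.remove(path)


# Legacy mode: DDL is not wrapped in an implicit transaction, so it persists.
print(ddl_persists_without_commit(isolation_level=""))     # True

if sys.version_info >= (3, 12):
    # PEP 249 mode: the DDL sits in an open transaction and is lost on close().
    print(ddl_persists_without_commit(autocommit=False))   # False
```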
I created a As of now, it ignores the context manager1. Feel free to play with it. Footnotes
|
What about
If it helps typing, we could add constants for each of those. cx.transaction_behaviour = sqlite3.DEFERRED |
I suggest we create a topic on Discourse to try to raise more awareness on this Is |
well you'd have to choose between british and US spelling there, is the only wart I can see ... :) |
nah. I can work with whatever, I was just suggesting. A sudden change in default behavior, and even with the old behavior no longer available, might be inconvenient on this end, but even then, we are pretty much locking out any patterns where "there's no transaction" unless DBAPI level autocommit is requested, so we could adapt and maybe even without difficulty. |
All right, thanks! I'm following MAL's suggestion for now, then. I'm not sure how we should handle these things yet:
I don't think we should deny manual transaction control in |
I added docs to my WIP branch, and created a draft PR against the CPython repo. Please try it out, and chime in with comments regarding stuff that I forgot :) |
Although I've marked |
Looking again at what I've implemented, I see that the context manager now of course adds an implicit BEGIN after the already implicit COMMIT/ROLLBACK, if This also implies that if
I've implemented the former in the PR. |
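A minimal sketch of the context manager behaviour discussed above, assuming the `autocommit` parameter as it shipped in Python 3.12 (the helper returns None on older versions). The helper and table names are illustrative.

```python
import sqlite3
import sys


def context_manager_states():
    """Check that leaving `with conn:` commits and then reopens a transaction."""
    if sys.version_info < (3, 12):
        return None                      # autocommit parameter not available

    conn = sqlite3.connect(":memory:", autocommit=False)
    with conn:
        conn.execute("CREATE TABLE test (i INTEGER)")
        conn.execute("INSERT INTO test (i) VALUES (1)")

    after_with = conn.in_transaction     # a new transaction is already open
    conn.rollback()                      # nothing to undo: the block was committed
    rows = conn.execute("SELECT i FROM test").fetchall()
    conn.close()
    return after_with, rows


print(context_manager_states())
# On Python 3.12+: (True, [(1,)])
```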
Commenting on an older post in this thread, just to clarify one thing:
The underlying SQLite autocommit mode, which is not equal to the sqlite3 extension module autocommit mode, can be accessed using the
|
Following up this concern:
I suggest we keep gh-93823 focused on the autocommit attribute only. It is still very early in the 3.12 dev phase; we have lots of time to figure out how, and if, we should extend the new API. |
Wow, what a long discussion :-) I must admit that I did not read all of it. Just some notes:
If you have questions, please let me know. Thanks. |
FTR, I've removed the docs deprecation of
Then, emit deprecations in 3.14, and finally in 3.16 change the default behaviour to PEP 249-compliant transaction control. |
…aviour (#93823) Introduce the autocommit attribute to Connection and the autocommit parameter to connect() for PEP 249-compliant transaction handling. Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com> Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM> Co-authored-by: Géry Ogam <gery.ogam@gmail.com>
I've merged gh-83638, hopefully in time for the next 3.12 alpha release. IMO, we can handle the possible deprecation of Please give it a try! Thanks to everyone involved here and on previous issues (and on db-sig), for helping to carve out this functionality. |
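For anyone trying it out, a minimal sketch of the PEP 249-style mode, assuming Python 3.12 or later (the helper returns None otherwise). It shows that a transaction is open right after connect() and that DDL is rolled back like any other statement; the table name is illustrative.

```python
import sqlite3
import sys


def pep249_states():
    """Exercise autocommit=False: implicit transactions and transactional DDL."""
    if sys.version_info < (3, 12):
        return None                      # autocommit parameter not available

    conn = sqlite3.connect(":memory:", autocommit=False)
    states = {"after_connect": conn.in_transaction}

    conn.execute("CREATE TABLE test (i INTEGER)")
    conn.rollback()                      # rolls back the DDL, then opens a new transaction

    states["tables_after_rollback"] = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    states["after_rollback"] = conn.in_transaction

    conn.close()
    return states


print(pep249_states())
# On Python 3.12+:
# {'after_connect': True, 'tables_after_rollback': [], 'after_rollback': True}
```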
FTR, PEP-249 was updated by python/peps#2887 |
@erlend-aasland - can you summarize how this new property plays with the previously-supported method of setting Am I correct that it should be sufficient to check |