- 
                Notifications
    You must be signed in to change notification settings 
- Fork 434
Batch executemany #295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Batch executemany #295
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than the transaction status check, LGTM! Thanks!
| Update - I've met a racing condition issue during fixing tests. Trying to find time to fix that. | 
| Hey @fantix! Are you still working on this? | 
| Hi! Yes I’m trying to find some time to fix the racing issue, sorry for the delay! | 
| Working on this now - rebased and I'll review the changes, and deal with the racing issue. | 
f62a347    to
    06b412b      
    Compare
  
    | Sorry for the delay! This PR is updated now. Benchmark shows the improvement is consistent. I've also added a diagram to explain the details. | 
46ecc07    to
    6c206e9      
    Compare
  
    | @fantix The  | 
| @elprans any chance this can get merged in after the conflict in asyncpg/protocol/protocol.pyx is fixed? | 
| Checking in to see if this PR is still planning on being merged | 
| @fantix, are you interested in getting this in shape to be merged? | 
| bumping this PR. it would be awesome to see it finally merged!! lets goooo ^^ | 
| Let me try this weekend! | 
4be5198    to
    a3bfcff      
    Compare
  
    | Only rebased - I do want to revisit this to make sure everything is still alright. | 
a3bfcff    to
    557f515      
    Compare
  
    Now `Bind` and `Execute` pairs are batched into 4 x 32KB buffers to take advantage of `writelines()`. A single `Sync` is sent at last, so that all args live in the same transaction. Closes: MagicStack#289
557f515    to
    b137184      
    Compare
  
    b137184    to
    39dfbb4      
    Compare
  
    | I've rewritten the core part to be more readable, and included @elprans 's commit to fix the timeout, and added one more test. If the Travis CI passes, I think it'll be good for another review. | 
| Looks like all tests passed! Can we get this merged in? | 
| LGTM and merged. @fantix, thank you very much for working on this! | 
A new asyncpg release is here, just in time for Christmas. Notable additions include Python 3.9 support, support for recently added PostgreSQL types like `jsonpath`, and last but not least, vastly improved `executemany()` performance. Importantly, `executemany()` is also now _atomic_, which means that either all iterations succeed, or none at all, whereas previously partial results would have remained in place, unless `executemany()` was called in a transaction. There is also the usual assortment of improvements and bugfixes, see the details below. This is the last release of asyncpg that supports Python 3.5, which is has reached EOL in September. Improvements ------------ * Vastly speedup executemany by batching protocol messages (#295) (by @fantix in 690048d for #295) * Allow using custom Record class (by @elprans in db4f1a6 for #577) * Add Python 3.9 support (#610) (by @elprans in c05d726 for #610) * Prefer SSL connections by default (#660) (by @elprans in 16183aa for #660) * Add codecs for a bunch of new builtin types (#665) (by @elprans in b53f038 for #665) * Expose Pool as asyncpg.Pool (#669) (by @rugleb in 0e0eb8d for #669) Fixes ----- * Add a workaround for bpo-37658 (by @elprans in 2bac166 for #21894) * Fix wrong default transaction isolation level (#622) (by @fantix in 4a627d5 for #622) * Fix set_type_codec() to accept standard SQL type names (#619) (by @elprans in 68b40cb for #619) * Ignore custom data codec for internal introspection (#618) (by @fantix in e064f59 for #618) * Fix null/NULL quoting in array text encoder (#627) (by @fantix in 92aa806 for #627) * Fix link in connect docstring (#653) (by @samuelcolvin in 8b313bd for #653) * Make asyncpg work with pyinstaller (#651) (by @Atem18 in 5ddabb1 for #651) * Fix possible AttributeError exception in `ConnectionSettings` (#632) (by @petriborg in 0d23182 for #632) * Prohibit custom codecs on domains (by @elprans in 50f964f for #457) * Raise proper error on anonymous composite input (tuple arguments) (#664) (by @elprans in 7252dbe for #664) * Fix incorrect application of custom codecs in some cases (#662) (by @elprans in 50f65fb for #662)
A new asyncpg release is here, just in time for Christmas. Notable additions include Python 3.9 support, support for recently added PostgreSQL types like `jsonpath`, and last but not least, vastly improved `executemany()` performance. Importantly, `executemany()` is also now _atomic_, which means that either all iterations succeed, or none at all, whereas previously partial results would have remained in place, unless `executemany()` was called in a transaction. There is also the usual assortment of improvements and bugfixes, see the details below. This is the last release of asyncpg that supports Python 3.5, which is has reached EOL in September. Improvements ------------ * Vastly speedup executemany by batching protocol messages (#295) (by @fantix in 690048d for #295) * Allow using custom `Record` class (by @elprans in db4f1a6 for #577) * Add Python 3.9 support (#610) (by @elprans in c05d726 for #610) * Prefer SSL connections by default (#660) (by @elprans in 16183aa for #660) * Add codecs for a bunch of new builtin types (#665) (by @elprans in b53f038 for #665) * Expose Pool as `asyncpg.Pool` (#669) (by @rugleb in 0e0eb8d for #669) Fixes ----- * Add a workaround for bpo-37658 (by @elprans in 2bac166 for #21894) * Fix wrong default transaction isolation level (#622) (by @fantix in 4a627d5 for #622) * Fix `set_type_codec()` to accept standard SQL type names (#619) (by @elprans in 68b40cb for #619) * Ignore custom data codec for internal introspection (#618) (by @fantix in e064f59 for #618) * Fix null/NULL quoting in array text encoder (#627) (by @fantix in 92aa806 for #627) * Fix link in connect docstring (#653) (by @samuelcolvin in 8b313bd for #653) * Make asyncpg work with pyinstaller (#651) (by @Atem18 in 5ddabb1 for #651) * Fix possible `AttributeError` exception in `ConnectionSettings` (#632) (by @petriborg in 0d23182 for #632) * Prohibit custom codecs on domains (by @elprans in 50f964f for #457) * Raise proper error on anonymous composite input (tuple arguments) (#664) (by @elprans in 7252dbe for #664) * Fix incorrect application of custom codecs in some cases (#662) (by @elprans in 50f65fb for #662)
A new asyncpg release is here, just in time for Christmas. Notable additions include Python 3.9 support, support for recently added PostgreSQL types like `jsonpath`, and last but not least, vastly improved `executemany()` performance. Importantly, `executemany()` is also now _atomic_, which means that either all iterations succeed, or none at all, whereas previously partial results would have remained in place, unless `executemany()` was called in a transaction. There is also the usual assortment of improvements and bugfixes, see the details below. This is the last release of asyncpg that supports Python 3.5, which has reached EOL in September. Improvements ------------ * Vastly speedup executemany by batching protocol messages (#295) (by @fantix in 690048d for #295) * Allow using custom `Record` class (by @elprans in db4f1a6 for #577) * Add Python 3.9 support (#610) (by @elprans in c05d726 for #610) * Prefer SSL connections by default (#660) (by @elprans in 16183aa for #660) * Add codecs for a bunch of new builtin types (#665) (by @elprans in b53f038 for #665) * Expose Pool as `asyncpg.Pool` (#669) (by @rugleb in 0e0eb8d for #669) Fixes ----- * Add a workaround for bpo-37658 (by @elprans in 2bac166 for #21894) * Fix wrong default transaction isolation level (#622) (by @fantix in 4a627d5 for #622) * Fix `set_type_codec()` to accept standard SQL type names (#619) (by @elprans in 68b40cb for #619) * Ignore custom data codec for internal introspection (#618) (by @fantix in e064f59 for #618) * Fix null/NULL quoting in array text encoder (#627) (by @fantix in 92aa806 for #627) * Fix link in connect docstring (#653) (by @samuelcolvin in 8b313bd for #653) * Make asyncpg work with pyinstaller (#651) (by @Atem18 in 5ddabb1 for #651) * Fix possible `AttributeError` exception in `ConnectionSettings` (#632) (by @petriborg in 0d23182 for #632) * Prohibit custom codecs on domains (by @elprans in 50f964f for #457) * Raise proper error on anonymous composite input (tuple arguments) (#664) (by @elprans in 7252dbe for #664) * Fix incorrect application of custom codecs in some cases (#662) (by @elprans in 50f65fb for #662)
A new asyncpg release is here, just in time for Christmas. Notable additions include Python 3.9 support, support for recently added PostgreSQL types like `jsonpath`, and last but not least, vastly improved `executemany()` performance. Importantly, `executemany()` is also now _atomic_, which means that either all iterations succeed, or none at all, whereas previously partial results would have remained in place, unless `executemany()` was called in a transaction. There is also the usual assortment of improvements and bugfixes, see the details below. This is the last release of asyncpg that supports Python 3.5, which has reached EOL in September. Improvements ------------ * Vastly speedup executemany by batching protocol messages (#295) (by @fantix in 690048d for #295) * Allow using custom `Record` class (by @elprans in db4f1a6 for #577) * Add Python 3.9 support (#610) (by @elprans in c05d726 for #610) * Prefer SSL connections by default (#660) (by @elprans in 16183aa for #660) * Add codecs for a bunch of new builtin types (#665) (by @elprans in b53f038 for #665) * Expose Pool as `asyncpg.Pool` (#669) (by @rugleb in 0e0eb8d for #669) Fixes ----- * Add a workaround for bpo-37658 (by @elprans in 2bac166 for #21894) * Fix wrong default transaction isolation level (#622) (by @fantix in 4a627d5 for #622) * Fix `set_type_codec()` to accept standard SQL type names (#619) (by @elprans in 68b40cb for #619) * Ignore custom data codec for internal introspection (#618) (by @fantix in e064f59 for #618) * Fix null/NULL quoting in array text encoder (#627) (by @fantix in 92aa806 for #627) * Fix link in connect docstring (#653) (by @samuelcolvin in 8b313bd for #653) * Make asyncpg work with pyinstaller (#651) (by @Atem18 in 5ddabb1 for #651) * Fix possible `AttributeError` exception in `ConnectionSettings` (#632) (by @petriborg in 0d23182 for #632) * Prohibit custom codecs on domains (by @elprans in 50f964f for #457) * Raise proper error on anonymous composite input (tuple arguments) (#664) (by @elprans in 7252dbe for #664) * Fix incorrect application of custom codecs in some cases (#662) (by @elprans in 50f65fb for #662)
A new asyncpg release is here. Notable additions include Python 3.9 support, support for recently added PostgreSQL types like `jsonpath`, and last but not least, vastly improved `executemany()` performance. Importantly, `executemany()` is also now _atomic_, which means that either all iterations succeed, or none at all, whereas previously partial results would have remained in place, unless `executemany()` was called in a transaction. There is also the usual assortment of improvements and bugfixes, see the details below. This is the last release of asyncpg that supports Python 3.5, which has reached EOL last September. Improvements ------------ * Vastly speedup executemany by batching protocol messages (#295) (by @fantix in 690048d for #295) * Allow using custom `Record` class (by @elprans in db4f1a6 for #577) * Add Python 3.9 support (#610) (by @elprans in c05d726 for #610) * Prefer SSL connections by default (#660) (by @elprans in 16183aa for #660) * Add codecs for a bunch of new builtin types (#665) (by @elprans in b53f038 for #665) * Expose Pool as `asyncpg.Pool` (#669) (by @rugleb in 0e0eb8d for #669) * Avoid unnecessary overhead during connection reset (#648) (by @kitogo in ff5da5f for #648) Fixes ----- * Add a workaround for bpo-37658 (by @elprans in 2bac166 for #21894) * Fix wrong default transaction isolation level (#622) (by @fantix in 4a627d5 for #622) * Fix `set_type_codec()` to accept standard SQL type names (#619) (by @elprans in 68b40cb for #619) * Ignore custom data codec for internal introspection (#618) (by @fantix in e064f59 for #618) * Fix null/NULL quoting in array text encoder (#627) (by @fantix in 92aa806 for #627) * Fix link in connect docstring (#653) (by @samuelcolvin in 8b313bd for #653) * Make asyncpg work with pyinstaller (#651) (by @Atem18 in 5ddabb1 for #651) * Fix possible `AttributeError` exception in `ConnectionSettings` (#632) (by @petriborg in 0d23182 for #632) * Prohibit custom codecs on domains (by @elprans in 50f964f for #457) * Raise proper error on anonymous composite input (tuple arguments) (#664) (by @elprans in 7252dbe for #664) * Fix incorrect application of custom codecs in some cases (#662) (by @elprans in 50f65fb for #662)
A new asyncpg release is here. Notable additions include Python 3.9 support, support for recently added PostgreSQL types like `jsonpath`, and last but not least, vastly improved `executemany()` performance. Importantly, `executemany()` is also now _atomic_, which means that either all iterations succeed, or none at all, whereas previously partial results would have remained in place, unless `executemany()` was called in a transaction. There is also the usual assortment of improvements and bugfixes, see the details below. This is the last release of asyncpg that supports Python 3.5, which has reached EOL last September. Improvements ------------ * Vastly speedup executemany by batching protocol messages (MagicStack#295) (by @fantix in 690048d for MagicStack#295) * Allow using custom `Record` class (by @elprans in db4f1a6 for MagicStack#577) * Add Python 3.9 support (MagicStack#610) (by @elprans in c05d726 for MagicStack#610) * Prefer SSL connections by default (MagicStack#660) (by @elprans in 16183aa for MagicStack#660) * Add codecs for a bunch of new builtin types (MagicStack#665) (by @elprans in b53f038 for MagicStack#665) * Expose Pool as `asyncpg.Pool` (MagicStack#669) (by @rugleb in 0e0eb8d for MagicStack#669) * Avoid unnecessary overhead during connection reset (MagicStack#648) (by @kitogo in ff5da5f for MagicStack#648) Fixes ----- * Add a workaround for bpo-37658 (by @elprans in 2bac166 for #21894) * Fix wrong default transaction isolation level (MagicStack#622) (by @fantix in 4a627d5 for MagicStack#622) * Fix `set_type_codec()` to accept standard SQL type names (MagicStack#619) (by @elprans in 68b40cb for MagicStack#619) * Ignore custom data codec for internal introspection (MagicStack#618) (by @fantix in e064f59 for MagicStack#618) * Fix null/NULL quoting in array text encoder (MagicStack#627) (by @fantix in 92aa806 for MagicStack#627) * Fix link in connect docstring (MagicStack#653) (by @samuelcolvin in 8b313bd for MagicStack#653) * Make asyncpg work with pyinstaller (MagicStack#651) (by @Atem18 in 5ddabb1 for MagicStack#651) * Fix possible `AttributeError` exception in `ConnectionSettings` (MagicStack#632) (by @petriborg in 0d23182 for MagicStack#632) * Prohibit custom codecs on domains (by @elprans in 50f964f for MagicStack#457) * Raise proper error on anonymous composite input (tuple arguments) (MagicStack#664) (by @elprans in 7252dbe for MagicStack#664) * Fix incorrect application of custom codecs in some cases (MagicStack#662) (by @elprans in 50f65fb for MagicStack#662)
Refs #289, this is a rewrite of
bind_execute_many():writelines()once flow control turns green.SYNCat the very last - making use of the implicit transaction for the whole call toexecutemany().Support async iterable.executemany()to prepared statement.test_server_failure_during_writesis failing on Windowstest_timeout (test_execute.TestExecuteMany)This is a breaking change due to the implicit transaction.
The 3-level infinite loop keeps filling buffers and send them under flow control, the only exit is raising
StopIterationwith 3 possible values:Truemeans there was data sent in the last loopFalsemeans the last loop sent nothingexceptionmeans the data source raised an exceptionPlease note, during the loop, if server-side error is detected before exhausting the data source, it is also considered as
StopIteration(True), ending the process early with aSYNC_MESSAGE.(I wrote a post about the details of implicit transaction.)
UPDATED - pgbench results of inserting 1000 rows per query with
executemany()on Python 3.6 of 2.2GHz 2015 MacBook Air (best out of 5 runs):asyncpg 0.18.2
This PR:
Most branches are covered by tests, overall coverage 85% (1352 miss) -> 85% (1356 miss).
UPDATED - Python 3.8 of 2.3GHz 2020 MacBook Pro: