-
-
Notifications
You must be signed in to change notification settings - Fork 31.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-24905: Support BLOB incremental I/O in sqlite module #271
Conversation
The APSW doc for reference is at https://rogerbinns.github.io/apsw/blob.html Does having len make sense? Files don't have that method. It is also confusing - should len return the size from the current seek offset? The documentation should make clearer that you cannot change the size of a blob, and mention zeroblob as the means to make a blob in a query without having to fill it in. It may be worth mentioning that another approach is to store large data in a file, and only store the filename in the database. (This comes up on the sqlite-users mailing list quite a lot.) |
I cannot remember I ever was in need to read/write a part of a BLOB; it was always "all or nothing" for me. So I never used BLOB APIs; instead I always SELECT/INSERT/UPDATE BLOB columns; in Postgres they are not even BLOB columns — I always use BYTEA type. So I'm -0 on exposing BLOB API for SQLite. |
@phdru SQLite is the same with regular queries: you can only read or write blobs in their entirety. That for example means that if you store a 25MB blob then you must read or write 25MB at once. SQLite has the "incremental blob" API for accessing just portions of blobs. The motivation comes from "Lite" in the name - developers use SQLite because it is lighter weight (amongst other reasons). DBAPI doesn't specify incremental blob I/O so only developers intending to use SQLite directly and not another database would use it. Should they be able to? |
-0 from me means: I don't care and if there will be such an API I'm not gonna use it. That's all. |
Thanks for the input @rogerbinns.
What is the difference between implementing |
@palaviv there is no difference between the value returned by len and length or similar methods. It is however very uncommon to have a len method on file like objects - I couldn't find an example of any! For example StringIO is closest and has no len. Hence my recommendation to avoid len in favour of another method name. |
The |
@serhiy-storchaka good example. They don't document it though, and there is a size() method although it is returning something slightly different. There also seems to be a correlation between types that have len and those that can you can array access. In any event my recommendation is to avoid breaking new ground with a len method since that seems not to be normal practise for this kind of thing that provides a file like interface. |
I actually think that we should use |
Pull request conversation is purposed for discussing the code. It would be better to continue the design discussion on the bug tracker or mailing list. |
@serhiy-storchaka I have implemented the sequence protocol but I have a few questions:
|
4373ce8
to
24ed220
Compare
I think that the |
Doc/library/sqlite3.rst
Outdated
The BLOB size cannot be changed using the :class:`Blob` class. Use | ||
``zeroblob`` to create the blob in the wanted size in advance. | ||
|
||
.. versionadded:: 3.7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3.8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Doc/library/sqlite3.rst
Outdated
Blob Objects | ||
------------ | ||
|
||
.. versionadded:: 3.7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3.8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can retarget this for py 3.9
Hi @palaviv Would you be interested to upgrade your PR to the last master? Thank you |
efac873
to
765545e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Retarget for python 3.9
Doc/library/sqlite3.rst
Outdated
Blob Objects | ||
------------ | ||
|
||
.. versionadded:: 3.8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3.9
Hi @palaviv There is some plan with this PR? |
Hi @eamanu, |
I'd love to see this land in Python. I think there's a strong case for it: SQLite lets you store up to 2GB of data in a BLOB, and reading an entire 2GB value into memory at once isn't nearly as pleasant as reading it incrementally, which is what this would let us do. |
Thanks for the review @berkerpeksag. I have made the requested changes; please review again. |
Thanks for making the requested changes! @berkerpeksag: please review the changes made to this pull request. |
Other than rebasing (due new conflicts arising over time), is there anything that can be done to help move this PR along? (@palaviv do you want to do the rebase? if you'd like or are too busy I can do the rebase, though I'd need to open a new PR since I don't think I can modify yours) |
@nightlark, @palaviv: Here' a short list from the top-of-my head of what is needed to rebase this onto
If you want to try to land this, Ryan, please give Aviv a week or so to respond before opening a new PR :) |
@erlend-aasland Okay — I think I understand how to use argument clinic. Is there a guide to what iso. static types (or heap types)? Is the iso. a prefix for the types or an abbreviation? If it’s an abbreviation maybe that’s the missing a search term I should be using to find relevant resources. |
Great! AC is nice once you get into it. Feel free to ask if you get stuck :)
There's some info in the docs, but you can also check the PR's that converted the existing types:
Don't hesitate to ask if you need more pointers.
Sorry, it's an abbreviation: instead of :) I have the bad habit of using it too much. |
Regarding heap types, take a look at Victor's blog: https://vstinner.github.io/isolate-subinterpreters.html |
@nightlark Are you still working on this PR? |
It's 2022 now woooo |
+1 |
This PR is stale because it has been open for 30 days with no activity. |
I just merged #30680, a simplified version of this PR. Blobs will be in Python 3.11. |
This PR adds support in BLOB incremental I/O at the sqlite module. As asked by @serhiy-storchaka and @berkerpeksag I will try to get some more developers to give their input on the wanted API. I am tagging some people that are active in the ghaering/pysqlite and rogerbinns/apsw.
@ghaering, @rianhunter, @rogerbinns, @phdru. Please look at the PR and give your notes.
https://bugs.python.org/issue24905