Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sqlite3.create_collation() doesn't get called for custom types #92742

Closed
AdrianIssott opened this issue May 12, 2022 · 4 comments
Closed

sqlite3.create_collation() doesn't get called for custom types #92742

AdrianIssott opened this issue May 12, 2022 · 4 comments
Labels
extension-modules C modules in the Modules dir topic-sqlite3

Comments

@AdrianIssott
Copy link

Bug report

The following example demonstrates that sqlite3.create_collation() works for a basic type such as int but doesn't for a custom type. It used to work in python 2.7.18 but hasn't worked since v3.0.0 on Windows 10 x64.

import sqlite3
import pickle

def adapter_func(obj):
    return pickle.dumps(obj, protocol=0)

def converter_func(data):
    return pickle.loads(data)

class MyObj(object):
    def __init__(self, arg):
        self.arg = arg
    def __eq__(self, other):
        return self.arg == other.arg
    def __lt__(self, other):
        return self.arg < other.arg
    def __le__(self, other):
        return self < other or self == other

sqlite3.register_adapter(MyObj, adapter_func)
sqlite3.register_converter("MyObj", converter_func)

def reverse_collation_func(a, b):
    print('reverse_collation_func: %r, %r' % (a, b))
    if a < b:
        return 1
    if a > b:
        return -1
    return 0

with sqlite3.connect(':memory:') as conn:
    conn.create_collation('unpickle', reverse_collation_func)

    conn.execute("""
        create table obj (
            data TEXT collate unpickle
        )
    """)
    conn.execute("""
        create table basic (
            data TEXT collate unpickle
        )
    """)

    conn.executemany('insert into obj (data) values (?)',[(MyObj(x),) for x in range(1, 5)])
    conn.executemany('insert into basic (data) values (?)',[(x,) for x in range(1, 5)])

    print('Querying:')
    for obj in conn.execute("select data from obj order by data"):
        print("obj: %r" % obj)
    for obj in conn.execute("select data from basic order by data"):
        print("basic: %r" % obj)

Your environment

  • CPython versions tested on:

Python 2.7.18 (v2.7.18:8d21aa21f2, Apr 20 2020, 13:25:05) [MSC v.1500 64 bit (AMD64)] on win32 produces the following output:

Querying:
reverse_collation_func: "ccopy_reg\n_reconstructor\np0\n(c__main__\nMyObj\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'arg'\np6\nI3\nsb.", "ccopy_reg\n_reconstructor\np0\n(c__main__\nMyObj\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'arg'\np6\nI4\nsb."
reverse_collation_func: "ccopy_reg\n_reconstructor\np0\n(c__main__\nMyObj\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'arg'\np6\nI1\nsb.", "ccopy_reg\n_reconstructor\np0\n(c__main__\nMyObj\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'arg'\np6\nI2\nsb."
reverse_collation_func: "ccopy_reg\n_reconstructor\np0\n(c__main__\nMyObj\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'arg'\np6\nI2\nsb.", "ccopy_reg\n_reconstructor\np0\n(c__main__\nMyObj\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'arg'\np6\nI4\nsb."
reverse_collation_func: "ccopy_reg\n_reconstructor\np0\n(c__main__\nMyObj\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'arg'\np6\nI2\nsb.", "ccopy_reg\n_reconstructor\np0\n(c__main__\nMyObj\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'arg'\np6\nI3\nsb."
obj: u"ccopy_reg\n_reconstructor\np0\n(c__main__\nMyObj\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'arg'\np6\nI4\nsb."
obj: u"ccopy_reg\n_reconstructor\np0\n(c__main__\nMyObj\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'arg'\np6\nI3\nsb."
obj: u"ccopy_reg\n_reconstructor\np0\n(c__main__\nMyObj\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'arg'\np6\nI2\nsb."
obj: u"ccopy_reg\n_reconstructor\np0\n(c__main__\nMyObj\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'arg'\np6\nI1\nsb."
reverse_collation_func: '3', '4'
reverse_collation_func: '1', '2'
reverse_collation_func: '2', '4'
reverse_collation_func: '2', '3'
basic: u'4'
basic: u'3'
basic: u'2'
basic: u'1'

Python 3.0 (r30:67507, Dec 3 2008, 19:44:23) [MSC v.1500 64 bit (AMD64)] on win32 through to Python 3.10.4 (tags/v3.10.4:9d38120, Mar 23 2022, 23:13:41) [MSC v.1929 64 bit (AMD64)] on win32 produce:

Querying:
obj: b'ccopyreg\n_reconstructor\np0\n(c__main__\nMyObj\np1\ncbuiltins\nobject\np2\nNtp3\nRp4\n(dp5\nVarg\np6\nL1\nsb.'
obj: b'ccopyreg\n_reconstructor\np0\n(c__main__\nMyObj\np1\ncbuiltins\nobject\np2\nNtp3\nRp4\n(dp5\nVarg\np6\nL2\nsb.'
obj: b'ccopyreg\n_reconstructor\np0\n(c__main__\nMyObj\np1\ncbuiltins\nobject\np2\nNtp3\nRp4\n(dp5\nVarg\np6\nL3\nsb.'
obj: b'ccopyreg\n_reconstructor\np0\n(c__main__\nMyObj\np1\ncbuiltins\nobject\np2\nNtp3\nRp4\n(dp5\nVarg\np6\nL4\nsb.'
reverse_collation_func: '1', '2'
reverse_collation_func: '2', '3'
reverse_collation_func: '2', '4'
reverse_collation_func: '3', '4'
basic: '4'
basic: '3'
basic: '2'
basic: '1'

Note:

  • The lack of any prints from reverse_collation_func for the MyObj objects
  • The order of the obj values is not reversed as it was for python v2.7

Operating system and architecture:

OS Name:                   Microsoft Windows 10 Pro
OS Version:                10.0.19044 N/A Build 19044
OS Manufacturer:           Microsoft Corporation
OS Configuration:          Member Workstation
OS Build Type:             Multiprocessor Free
Product ID:                00330-52608-19715-AAOEM
Original Install Date:     23/02/2022, 07:15:14
System Boot Time:          12/05/2022, 09:12:53
System Manufacturer:       HP
System Model:              HP ZBook 15 G6
System Type:               x64-based PC
Processor(s):              1 Processor(s) Installed.
                           [01]: Intel64 Family 6 Model 158 Stepping 13 GenuineIntel ~2304 Mhz
BIOS Version:              HP R92 Ver. 01.10.01, 29/07/2021
Windows Directory:         C:\WINDOWS
System Directory:          C:\WINDOWS\system32
Boot Device:               \Device\HarddiskVolume1
System Locale:             en-gb;English (United Kingdom)
Input Locale:              en-gb;English (United Kingdom)
Time Zone:                 (UTC+00:00) Dublin, Edinburgh, Lisbon, London
Total Physical Memory:     32,574 MB
Available Physical Memory: 17,971 MB
Virtual Memory: Max Size:  37,438 MB
Virtual Memory: Available: 16,725 MB
Virtual Memory: In Use:    20,713 MB
Page File Location(s):     C:\pagefile.sys
Hotfix(s):                 9 Hotfix(s) Installed.
                           [01]: KB2693643
                           [02]: KB5013624
                           [03]: KB5003791
                           [04]: KB5007401
                           [05]: KB5013942
                           [06]: KB5011352
                           [07]: KB5011651
                           [08]: KB5014032
                           [09]: KB5005699
@AdrianIssott AdrianIssott added the type-bug An unexpected behavior, bug, or error label May 12, 2022
@erlend-aasland erlend-aasland added the extension-modules C modules in the Modules dir label May 12, 2022
@erlend-aasland
Copy link
Contributor

erlend-aasland commented May 13, 2022

TL;DR: The adapter function returns a bytes object, which means the data is stored with the SQLite type BLOB; collations only work with strings. Fixed by forcing the adapter to store the data as a string.

The lack of any prints from reverse_collation_func for the MyObj objects

This is because pickle.dumps returns a bytes object, which translates into the SQLite BLOB type.
In Python 2, a str object is returned:

$ python2

WARNING: Python 2.7 is not recommended. 
This version is included in macOS for compatibility with legacy software. 
Future versions of macOS will not include Python 2.7. 
Instead, it is recommended that you transition to using 'python3' from within Terminal.

Python 2.7.18 (default, Jan  4 2022, 17:47:56) 
[GCC Apple LLVM 13.0.0 (clang-1300.0.29.10) [+internal-os, ptrauth-isa=deployme on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle
>>> type(pickle.dumps(1))
<type 'str'>
>>> ^D
$ python3
Python 3.11.0b1 (v3.11.0b1:8d32a5c8c4, May  6 2022, 22:45:29) [Clang 13.0.0 (clang-1300.0.29.30)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle
>>> type(pickle.dumps(1))
<class 'bytes'>

Collations only work on strings. Quoting the SQLite docs:

Collating functions only matter when comparing string values. Numeric values are always compared numerically, and BLOBs are always compared byte-by-byte using memcmp().

If you force pickle.dumps to return a string, for example using the str built-in, you'll see that your collation function works as expected.

The order of the obj values is not reversed as it was for python v2.7

This is probably because the collation function was not run.

Suggesting closing this as not-a-bug.

@AdrianIssott
Copy link
Author

Ah ha! Indeed using the following

def adapter_func(obj):
    return pickle.dumps(obj, protocol=0).decode('utf-8')

def converter_func(data):
    return pickle.loads(data.encode('utf-8'))

does indeed work.

Thanks!

Suggesting closing this as not-a-bug.

Agreed

@AdrianIssott
Copy link
Author

@erlend-aasland, on a related note the https://docs.python.org/3/library/sqlite3.html documentation for create_collation says:

the callable will get its parameters as Python bytestrings

However, you can see from the output above that reverse_collation_func is actually receiving strings given that the output is:

reverse_collation_func: '1', '2'

and not:

reverse_collation_func: b'1', b'2'

Do you agree the the documentation is incorrect? If so I'll raise a new issue for that

@erlend-aasland
Copy link
Contributor

Do you agree the the documentation is incorrect? If so I'll raise a new issue for that

Yeah, I believe the docs can be improved; please open a new issue. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extension-modules C modules in the Modules dir topic-sqlite3
Projects
None yet
Development

No branches or pull requests

2 participants