Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CorruptError if SQLITE_CONFIG_MMAP_SIZE used #156

Closed
rogerbinns opened this issue May 28, 2024 · 18 comments
Closed

CorruptError if SQLITE_CONFIG_MMAP_SIZE used #156

rogerbinns opened this issue May 28, 2024 · 18 comments

Comments

@rogerbinns
Copy link

The following program causes CorruptError:

import apsw

# calls sqlite3_config
apsw.config(apsw.SQLITE_CONFIG_MMAP_SIZE,  8448, - 1)

db = apsw.Connection("testdb")
db.pragma("hexkey", b"hello world".hex())
db.execute("create table x(y); insert into x values(zeroblob(78000))")
db.execute("select * from x").get

CorruptError is returned executing the select. Leaving out the hexkey also does not result in error.

If the mmap value is 8447 or less then the error does not occur. I was setting it to 2 **63 -1.

@utelle
Copy link
Owner

utelle commented May 28, 2024

I have to admit that I have never experimented with the SQLITE_CONFIG_MMAP_SIZE config parameter.

CorruptError is returned executing the select. Leaving out the hexkey also does not result in error.

Leaving out calling PRAGMA hexkey results in an unencrypted database file. That is, the encryption code is not called.

What happens, if you use PRAGMA key instead? Same error? Most likely yes.

At the moment I have no idea, what may cause the error. I'll have to try to reproduce the error in my development environment to find out what is going wrong.

@utelle
Copy link
Owner

utelle commented May 28, 2024

In the meantime I made a few experiments using the SQLite shell coming with the SQLite3 Multiple Ciphers releases.

The parameter MMAP_SIZE can also be set via PRAGMA mmap_size.

First, I used an older version of the shell I had at hand from other tests. If I used a value of 8192 I got the error "database disk image is malformed". With smaller values the error went away, although I did not reopen the database, but just reissued the SELECT command.

Then I used the shell from the latest release. But I was unable to reproduce the error. I could set mmap_size to any value.

This makes it extremely difficult to track down the cause of the error.

@rogerbinns
Copy link
Author

It doesn't matter whether key or hexkey is used. If I catch the CorruptError, turn mmap back off, and do a read again then all is fine. I also double checked the database is encrypted, and that calling the vfs xRead method does give the right results.

The issue in your code is almost certainly how VFS xFetch is handled.

I put a break point where I can get the sqlite3_io_methods and it does look correct:

(gdb) p *fp.pMethods 
$6 = {iVersion = 3, xClose = 0x7fffe9833de0 <mcIoClose>, 
  xRead = 0x7fffe9845070 <mcIoRead>, 
  xWrite = 0x7fffe9844dc0 <mcIoWrite>, 
  xTruncate = 0x7fffe9826bb0 <mcIoTruncate>, 
  xSync = 0x7fffe9826bc0 <mcIoSync>, 
  xFileSize = 0x7fffe9826bd0 <mcIoFileSize>, 
  xLock = 0x7fffe9826be0 <mcIoLock>, 
  xUnlock = 0x7fffe9826bf0 <mcIoUnlock>, 
  xCheckReservedLock = 0x7fffe9826c00 <mcIoCheckReservedLock>, 
  xFileControl = 0x7fffe98c0970 <mcIoFileControl>, 
  xSectorSize = 0x7fffe9826c10 <mcIoSectorSize>, 
  xDeviceCharacteristics = 0x7fffe9826c40 <mcIoDeviceCharacteristics>, 
  xShmMap = 0x7fffe9826c50 <mcIoShmMap>, 
  xShmLock = 0x7fffe9826c60 <mcIoShmLock>, 
  xShmBarrier = 0x7fffe9826c70 <mcIoShmBarrier>, 
  xShmUnmap = 0x7fffe9826c80 <mcIoShmUnmap>, 
  xFetch = 0x7fffe9826ca0 <mcIoFetch>, 
  xUnfetch = 0x7fffe9826cc0 <mcIoUnfetch>}

@rogerbinns
Copy link
Author

The changes APSW makes to SQLite defaults are in src/apsw.c and only if statically compiling in the amalgamation which I am doing in this case.

#define SQLITE_OMIT_DEPRECATED

#ifndef SQLITE_MAX_ATTACHED
#define SQLITE_MAX_ATTACHED 125
#endif

#ifndef SQLITE_MAX_MMAP_SIZE
#define SQLITE_MAX_MMAP_SIZE 0x1000000000000LL
#endif

#ifndef APSW_NO_NDEBUG
#define SQLITE_API static
#define SQLITE_EXTERN static
#endif

#ifdef SQLITE_DEBUG
#define SQLITE_ENABLE_API_ARMOR 1
#endif

Builds for PyPI also add SQLITE_ENABLE_COLUMN_METADATA configured in tools/setup-pypi.cfg

@utelle
Copy link
Owner

utelle commented May 28, 2024

It doesn't matter whether key or hexkey is used.

Any other answer would have surprised me.

If I catch the CorruptError, turn mmap back off, and do a read again then all is fine. I also double checked the database is encrypted, and that calling the vfs xRead method does give the right results.

Maybe the SQLite compile time options used to compile APSW differ from those I typically use for the SQLite shell.

The issue in your code is almost certainly how VFS xFetch is handled.

xFetch is simply forwarded to the xFetch method of the underlying real VFS.

@rogerbinns
Copy link
Author

xFetch is simply forwarded to the xFetch method of the underlying real VFS.

That can't be right for mcIoFetch. mcIoRead calls mcReadMainDb which does decryption. So mcIoFetch also needs to do decryption as does mcIoUnfetch need to do encryption. I'll claim the result of fetch being used because of memory mapping means that the raw encrypted data is coming back, hence the corruption error.

A simple fix is to make your interposer sqlite3_io_methods cap iVersion at 2.

@utelle
Copy link
Owner

utelle commented May 28, 2024

xFetch is simply forwarded to the xFetch method of the underlying real VFS.

That can't be right for mcIoFetch. mcIoRead calls mcReadMainDb which does decryption. So mcIoFetch also needs to do decryption as does mcIoUnfetch need to do encryption. I'll claim the result of fetch being used because of memory mapping means that the raw encrypted data is coming back, hence the corruption error.

I will further analyze what SQLite does when using xFetch/xUnFetch. It is a bit irritating that writing encrypted content seems to work properly, in contrast to reading.

A simple fix is to make your interposer sqlite3_io_methods cap iVersion at 2.

As a quick fix that would probably work.

@rogerbinns
Copy link
Author

I did try setting the xFetch entry in mcIoMethodsGlobal3 to 0 but then there is a SEGV in sqlite3OsFetch which isn't checking for a non-NULL pointer.

Setting iVersion to 2 in the same structure did make the issue go away.

@utelle
Copy link
Owner

utelle commented May 28, 2024

I did try setting the xFetch entry in mcIoMethodsGlobal3 to 0 but then there is a SEGV in sqlite3OsFetch which isn't checking for a non-NULL pointer.

Setting method pointer to 0 is not a good idea. SQLite doesn't check for null pointer at some critical places (as I found out when I tried to add SQLite3 Multiple Ciphers to the WASM module).

Setting iVersion to 2 in the same structure did make the issue go away.

Setting iVersion to 2 will prevent that xFetch and xUnfetch will be called.

That is, your analysis is absolutely right that those 2 methods do not what they should, if encryption is enabled.

@utelle
Copy link
Owner

utelle commented May 29, 2024

Ok, in the meantime I inspected the SQLite code. IMHO the methods xFetch/xUnfetch must not be called, if the database is encrypted. xFetch returns pointers to memory managed by the OS, and this means the content has to be the same as it is on disk. That is, unencrypted for a plain database or encrypted for an encrypted database. Decrypting the page content in xFetch would most likely cause database corruption.

The real question is, why method xFetch is called at all? The internal SQLite function setGetterMethod has been patched to check whether the database is encrypted or not. If memory mapping is supported and the database is not encrypted, the getter method getPageMMap will be selected; if the database is encrypted, the getter method getPageNormal will be selected. Thus, memory mapping should not be used, and xFetch should not be called. Why does it happen in the APSW environment?

@rogerbinns: Could you please find out which page getter method is set and whether it is correctly detected that the database is encrypted? The call stack when xFetch is entered could possibly be useful, too.

@rogerbinns
Copy link
Author

There are wiki instructions which is all I am doing to run the code at the top. This is with the updated sqlite3.c in that repo putting a breakpoint ion mcIoFetch

#0  mcIoFetch (pFile=0xc94b20, iOfst=4096, iAmt=4096, pp=0x7fffffffcc60)
    at /space/mc/sqlite3/sqlite3.c:307830
#1  0x00007fffe9949a7d in sqlite3OsFetch (pp=0x7fffffffcc60, 
    iAmt=<optimised out>, iOff=<optimised out>, id=<optimised out>)
    at /space/mc/sqlite3/sqlite3.c:26372
#2  getPageMMap (pPager=0xc94998, pgno=2, ppPage=0x7fffffffccb0, flags=2)
    at /space/mc/sqlite3/sqlite3.c:62913
#3  0x00007fffe9925225 in sqlite3PagerGet (flags=2, 
    ppPage=0x7fffffffccb0, pgno=2, pPager=<optimised out>)
    at /space/mc/sqlite3/sqlite3.c:62975
#4  getAndInitPage (pBt=0xc0fa08, pgno=2, ppPage=ppPage@entry=0xc9e978, 
    bReadOnly=2) at /space/mc/sqlite3/sqlite3.c:73078
#5  0x00007fffe9925a1c in moveToRoot (pCur=pCur@entry=0xc9e8f0)
    at /space/mc/sqlite3/sqlite3.c:76207
#6  0x00007fffe9925d5f in sqlite3BtreeFirst (pCur=0xc9e8f0, 
    pRes=0x7fffffffce60) at /space/mc/sqlite3/sqlite3.c:76314
#7  0x00007fffe99bb67b in sqlite3VdbeExec (p=p@entry=0xc9f678)
    at /space/mc/sqlite3/sqlite3.c:99562
#8  0x00007fffe99cd440 in sqlite3Step (p=<optimised out>)
    at /space/mc/sqlite3/sqlite3.c:91371
#9  sqlite3_step (pStmt=<optimised out>)
    at /space/mc/sqlite3/sqlite3.c:91432
#10 sqlite3_step (pStmt=<optimised out>)
    at /space/mc/sqlite3/sqlite3.c:91421
#11 0x00007fffe99ddf94 in APSWCursor_step (
    self=self@entry=0x7fffe9d85bc0) at src/cursor.c:736

Hopefully this is enough.

@rogerbinns
Copy link
Author

pPager in frame 2

(gdb) p *pPager
$3 = {pVfs = 0xc77738, exclusiveMode = 0 '\000', journalMode = 0 '\000', 
  useJournal = 1 '\001', noSync = 0 '\000', fullSync = 1 '\001', 
  extraSync = 0 '\000', syncFlags = 2 '\002', walSyncFlags = 10 '\n', 
  tempFile = 0 '\000', noLock = 0 '\000', readOnly = 0 '\000', 
  memDb = 0 '\000', memVfs = 0 '\000', eState = 1 '\001', 
  eLock = 1 '\001', changeCountDone = 0 '\000', setSuper = 0 '\000', 
  doNotSpill = 0 '\000', subjInMemory = 0 '\000', bUseFetch = 1 '\001', 
  hasHeldSharedLock = 1 '\001', dbSize = 21, dbOrigSize = 2, 
  dbFileSize = 21, dbHintSize = 21, errCode = 0, nRec = 0, 
  cksumInit = 4228306223, nSubRec = 0, pInJournal = 0x0, fd = 0xc94b20, 
  jfd = 0xc94cb0, sjfd = 0xc94be8, journalOff = 0, journalHdr = 0, 
  pBackup = 0x0, aSavepoint = 0x0, nSavepoint = 0, iDataVersion = 4, 
  dbFileVers = "\000\000\000\002\000\000\000\025\000\000\000\000\000\000\000", nMmapOut = 0, szMmap = 281474976710656, pMmapFreelist = 0x0, 
  nExtra = 136, nReserve = 32, vfsFlags = 33554694, sectorSize = 512, 
  mxPgno = 4294967294, lckPgno = 262145, pageSize = 4096, 
  journalSizeLimit = -1, zFilename = 0xc94d84 "/space/mc/testdb", 
  zJournal = 0xc94d96 "/space/mc/testdb-journal", 
  xBusyHandler = 0x7fffe987d680 <btreeInvokeBusyHandler>, 
  pBusyHandlerArg = 0xc0fa08, aStat = {9, 0, 23, 0}, 
  xReiniter = 0x7fffe9925130 <pageReinit>, 
  xGet = 0x7fffe99498a0 <getPageMMap>, pTmpSpace = 0xc94dd8 "", 
  pPCache = 0xc94ad0, pWal = 0x0, zWal = 0xc94daf "/space/mc/testdb-wal"}
(gdb) p *pPager->pVfs
$5 = {iVersion = 3, szOsFile = 200, mxPathname = 512, 
  pNext = 0x7fffe9ae61a0 <aVfs.2599>, 
  zName = 0xc777f0 "multipleciphers-unix", 
  pAppData = 0x7fffe9ae61a0 <aVfs.2599>, 
  xOpen = 0x7fffe984ee50 <mcVfsOpen>, 
  xDelete = 0x7fffe982b190 <mcVfsDelete>, 
  xAccess = 0x7fffe982b1a0 <mcVfsAccess>, 
  xFullPathname = 0x7fffe982b1b0 <mcVfsFullPathname>, 
  xDlOpen = 0x7fffe982b1c0 <mcVfsDlOpen>, 
  xDlError = 0x7fffe982b1d0 <mcVfsDlError>, 
  xDlSym = 0x7fffe982b1e0 <mcVfsDlSym>, 
  xDlClose = 0x7fffe982b1f0 <mcVfsDlClose>, 
  xRandomness = 0x7fffe982b200 <mcVfsRandomness>, 
  xSleep = 0x7fffe982b210 <mcVfsSleep>, 
  xCurrentTime = 0x7fffe982b220 <mcVfsCurrentTime>, 
  xGetLastError = 0x7fffe982b230 <mcVfsGetLastError>, 
  xCurrentTimeInt64 = 0x7fffe982b240 <mcVfsCurrentTimeInt64>, 
  xSetSystemCall = 0x7fffe982b250 <mcVfsSetSystemCall>, 
  xGetSystemCall = 0x7fffe982b260 <mcVfsGetSystemCall>, 
  xNextSystemCall = 0x7fffe982b270 <mcVfsNextSystemCall>}

@utelle
Copy link
Owner

utelle commented May 29, 2024

There are wiki instructions which is all I am doing to run the code at the top.

I will set up the environment in my Linux environment and will try to further investigate the issue.

This is with the updated sqlite3.c in that repo putting a breakpoint ion mcIoFetch

#0  mcIoFetch (pFile=0xc94b20, iOfst=4096, iAmt=4096, pp=0x7fffffffcc60)
    at /space/mc/sqlite3/sqlite3.c:307830
#1  0x00007fffe9949a7d in sqlite3OsFetch (pp=0x7fffffffcc60, 
    iAmt=<optimised out>, iOff=<optimised out>, id=<optimised out>)
    at /space/mc/sqlite3/sqlite3.c:26372
#2  getPageMMap (pPager=0xc94998, pgno=2, ppPage=0x7fffffffccb0, flags=2)
    at /space/mc/sqlite3/sqlite3.c:62913
#3  0x00007fffe9925225 in sqlite3PagerGet (flags=2, 
    ppPage=0x7fffffffccb0, pgno=2, pPager=<optimised out>)
    at /space/mc/sqlite3/sqlite3.c:62975
#4  getAndInitPage (pBt=0xc0fa08, pgno=2, ppPage=ppPage@entry=0xc9e978, 
    bReadOnly=2) at /space/mc/sqlite3/sqlite3.c:73078
#5  0x00007fffe9925a1c in moveToRoot (pCur=pCur@entry=0xc9e8f0)
    at /space/mc/sqlite3/sqlite3.c:76207
#6  0x00007fffe9925d5f in sqlite3BtreeFirst (pCur=0xc9e8f0, 
    pRes=0x7fffffffce60) at /space/mc/sqlite3/sqlite3.c:76314
#7  0x00007fffe99bb67b in sqlite3VdbeExec (p=p@entry=0xc9f678)
    at /space/mc/sqlite3/sqlite3.c:99562
#8  0x00007fffe99cd440 in sqlite3Step (p=<optimised out>)
    at /space/mc/sqlite3/sqlite3.c:91371
#9  sqlite3_step (pStmt=<optimised out>)
    at /space/mc/sqlite3/sqlite3.c:91432
#10 sqlite3_step (pStmt=<optimised out>)
    at /space/mc/sqlite3/sqlite3.c:91421
#11 0x00007fffe99ddf94 in APSWCursor_step (
    self=self@entry=0x7fffe9d85bc0) at src/cursor.c:736

As can be seen the page getter method getPageMMap is called (# 3). This should not happen for an encrypted database. No idea, why it happens nevertheless.

@utelle
Copy link
Owner

utelle commented May 29, 2024

Ok, I tested under Linux based on the wiki instructions. Here are the test results:

APSW debug build: missing sys.apsw_fault_inject_control
                Python  /home/ulrich/Development/GitHub/apsw-sqlite3mc/.venv/bin/python3 sys.version_info(major=3, minor=10, micro=12, releaselevel='final', serial=0) 64bit ELF
Testing with APSW file  /home/ulrich/Development/GitHub/apsw-sqlite3mc/apsw/__init__.cpython-310-x86_64-linux-gnu.so
          APSW version  3.46.0.0
    SQLite lib version  3.46.0
SQLite headers version  3046000
    Using amalgamation  True
.............................................................A message due to RecursionError is possible, and what is being tested
.....................................................................
----------------------------------------------------------------------
Ran 130 tests in 31.732s

OK

No CorruptError. Hm???

@rogerbinns
Copy link
Author

You only need the 6 lines of code in this issue report at the top to reproduce. The wiki has instructions at the bottom on how to set a breakpoint in code of your choice.

In commit utelle/apsw-sqlite3mc@2c623e8 I temporarily worked around the issue which is why it doesn't show up now. That was to allow a build for test pypi to complete.

I've now removed the workaround and added another test that explicitly checks memory mapping.

rogerbinns added a commit to utelle/apsw-sqlite3mc that referenced this issue May 29, 2024
This remove the temporary workaround from
2c623e8 and ensures a test
using memory mapping is always run.

Refs utelle/SQLite3MultipleCiphers#156
@utelle
Copy link
Owner

utelle commented May 30, 2024

You only need the 6 lines of code in this issue report at the top to reproduce. The wiki has instructions at the bottom on how to set a breakpoint in code of your choice.

Thanks. In the meantime I have set up the project in my Linux environment, so that I can continue to test there.

In commit utelle/apsw-sqlite3mc@2c623e8 I temporarily worked around the issue which is why it doesn't show up now. That was to allow a build for test pypi to complete.

I've now removed the workaround and added another test that explicitly checks memory mapping.

I see now the CorruptError and I will try to track down, what's causing the problem. This may take some time.

utelle added a commit that referenced this issue May 30, 2024
... and clean up some more trailing whitespace
@utelle
Copy link
Owner

utelle commented May 30, 2024

Actually, it was only a small glitch. The codec pointer was set in the SQLite file structure a bit too late, so that the SQLite pager got the impression that no codec is in place and selected the page getter method for memory-mapped database files.

Commit c95bd21 should fix the issue.

I committed an updated amalgamation to apsw-sqlite3mc.

@utelle
Copy link
Owner

utelle commented May 31, 2024

I could run all tests without problems. Therefore closing this issue. Reopen if necessary.

@utelle utelle closed this as completed May 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants