rekey results in invalid database #165

Closed
rogerbinns opened this issue Jun 2, 2024 · 18 comments

Comments

@rogerbinns

import apsw

con = apsw.Connection("testdb")
con.pragma("cipher", "rc4")
con.pragma("key", "hello")
con.execute("create table x(y); insert into x values(randomblob(656536))")

con.pragma("cipher", "ascon128")
con.pragma("rekey", "world")
con.execute("insert into x values(randomblob(656536))")

When the final execute runs, I get SQLError: attached databases must use the same text encoding as main database, which is not right.

@rogerbinns rogerbinns changed the title from "ascon128 rekey results in invalid database" to "rekey results in invalid database" on Jun 2, 2024
@rogerbinns
Author

The same problem happens going from aes128cbc to chacha20

rogerbinns added a commit to utelle/apsw-sqlite3mc that referenced this issue Jun 2, 2024
@rogerbinns
Author

Various other combinations produce other errors, such as NotADBError: file is not a database. I'm guessing the header is corrupt.

I've committed a script that does this random testing. Run it with python3 -m apsw.mcall

@utelle
Owner

utelle commented Jun 2, 2024

Typically, PRAGMA rekey should be used to change the passphrase only, not the cipher scheme.

The rekey function uses a modified vacuum procedure internally if the cipher scheme is changed at the same time. I have to analyze what goes wrong for the various cipher combinations. It could well be that certain combinations don't work at all due to some SQLite restriction.

SQLCipher for example doesn't support PRAGMA rekey at all for encrypting a plain database or for decrypting an encrypted database. Only changing the passphrase is supported.

Of course, I will investigate this issue further, but it will not have the highest priority. Just changing the passphrase should always work, and for changing the cipher scheme there is the alternative of vacuum into.
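
As a rough illustration of the vacuum into alternative, here is a minimal sketch using Python's standard sqlite3 module on an unencrypted database (SQLite 3.27 or later is assumed). With SQLite3 Multiple Ciphers the target would additionally need its cipher and key configured; how that is specified is not covered in this thread, so it should be taken from the SQLite3 Multiple Ciphers documentation. The function name is illustrative only.

```python
import sqlite3


def copy_with_vacuum_into(source_path, target_path):
    """Write a compacted copy of source_path to target_path via VACUUM INTO."""
    # isolation_level=None keeps the connection in autocommit mode;
    # VACUUM cannot run inside an open transaction.
    con = sqlite3.connect(source_path, isolation_level=None)
    try:
        # The target file must not already exist (or must be empty).
        con.execute("VACUUM INTO ?", (target_path,))
    finally:
        con.close()
```

The source database is left untouched, so a failed or unsupported conversion does not put the original data at risk.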

@rogerbinns
Author

SQLCipher for example doesn't support PRAGMA rekey at all for encrypting a plain database

I refer to the docs:

The PRAGMA rekey resp PRAGMA hexrekey statement has 3 use cases:

  1. Encrypt an existing unencrypted database
  2. Change the encryption key of an existing encrypted database
  3. Remove encryption from an existing encrypted database

Unless an error is returned, it means the action is supported! Things that are not supported must error, and I am fine with that.

For the record the vast majority of rekeying does work.

@rogerbinns
Author

You can run python3 -m apsw.mcall to run through randomized combinations. Here are some:

unencrypted to sqlcipher legacy=3 via rekey

SQLError: Rekeying failed. Pagesize cannot be changed for an encrypted database. (good response)

However memory is leaked.

unencrypted to sqlcipher via rekey

No error, database is ok

rc4 to sqlcipher via rekey

Rekey goes fine, accessing the database gives CorruptError (rekey should either refuse, or succeed and not corrupt)

rc4 to sqlcipher legacy=3 via rekey

Rekey gives SQLITE_ERROR without setting sqlite3_errmsg.
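
One way to tell the "database is ok" outcomes above from the corrupt ones is to run an integrity check after each rekey. This is a minimal sketch using Python's standard sqlite3 module; it is not taken from apsw.mcall, the helper name is hypothetical, and with APSW the exception classes would differ.

```python
import sqlite3


def database_is_ok(con):
    """Return True if pragma integrity_check reports a healthy database."""
    try:
        rows = con.execute("pragma integrity_check").fetchall()
    except sqlite3.DatabaseError:
        # Raised, for example, when the file is not recognised as a database.
        return False
    # A healthy database yields exactly one row containing the text "ok".
    return rows == [("ok",)]
```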

@utelle
Owner

utelle commented Jun 3, 2024

sqlcipher legacy=3 uses a page size of 1024 in contrast to the default page size of 4096. I'm not sure why the change in page size is sometimes detected and sometimes not. I will check the code.

@rogerbinns
Author

I find the regular page_size differing from legacy_page_size confusing. If both are set, which takes priority? Why does legacy_page_size even exist?

Also, although the documentation says legacy_page_size must be a power of 2, the pragma happily accepts any value from 1 through 65536 inclusive, such as 5432.

BTW my original assumption was the cryptography was using its own pages within the SQLite pages - ie a cryptography block size that could be independent of the SQLite page size, which in turn is independent of the filesystem block size.

@utelle
Owner

utelle commented Jun 3, 2024

I find the regular page_size differing from legacy_page_size confusing.

Yes, it is confusing, although I tried hard to explain it in the documentation.

The problem is that the SQLite database header contains information about the page size of the database file. This information is read by SQLite before the encryption extension has a chance to initialize the required cipher scheme. Therefore the official SQLite Encryption Extension (SEE) leaves exactly 8 bytes of the database header unencrypted, so that the page size can be determined before SEE is initialized. Typically a cipher scheme needs to know the page size to be able to locate the reserved bytes per page.

However, the original versions of SQLCipher, sqleet, and (very early) versions of wxSQLite3 encrypt the complete header. This prevents SQLite from determining the correct page size, so typically the default page size (currently 4096) is used. In most cases this works, because the database actually has the default page size, but earlier versions of SQLCipher, for example, used a page size of 1024.
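
To make the header dependency concrete, here is a minimal sketch that reads the page size and reserved-bytes fields from an unencrypted database file. The offsets come from the SQLite file format documentation, not from this thread; that the eight unencrypted bytes are bytes 16 through 23 is an assumption. The function name is illustrative.

```python
import struct


def read_header_page_info(path):
    """Return (page_size, reserved_bytes) from the 100-byte SQLite file header."""
    with open(path, "rb") as f:
        header = f.read(100)
    if len(header) < 100:
        raise ValueError("file is too short to contain a SQLite header")
    # Bytes 16-17: page size as a big-endian 16-bit integer; 1 means 65536.
    (raw_page_size,) = struct.unpack(">H", header[16:18])
    page_size = 65536 if raw_page_size == 1 else raw_page_size
    # Byte 20: number of bytes reserved at the end of every page.
    reserved = header[20]
    return page_size, reserved
```

If the whole header is encrypted, these fields read as noise, which is why a legacy scheme has to be told the page size by other means.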

If both are set, which takes priority?

First, the legacy page size will be set, but it can be overwritten by a pragma page_size.

Why does legacy_page_size even exist?

Different legacy cipher schemes use different page sizes. Specifying the default legacy page size spares the user from explicitly issuing a pragma page_size. If the database does not use the default, using pragma page_size will still be required, unless the legacy page size was adjusted accordingly.

Also, although the documentation says legacy_page_size must be a power of 2, the pragma happily accepts any value from 1 through 65536 inclusive, such as 5432.

You are right. This should be corrected. SQLite's pragma page_size does not issue an error message if an invalid page size is specified, but it does not change the value in that case.
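
The silent behaviour of SQLite's own pragma page_size can be observed with a minimal sketch like the following, which uses Python's standard sqlite3 module on a fresh, unencrypted database. The helper name is hypothetical.

```python
import os
import sqlite3
import tempfile


def page_size_after_request(requested):
    """Request a page size on a fresh database and return (before, after)."""
    with tempfile.TemporaryDirectory() as tmp:
        con = sqlite3.connect(os.path.join(tmp, "fresh.db"))
        try:
            before = con.execute("pragma page_size").fetchone()[0]
            # No exception is raised here, even when the value is invalid.
            con.execute(f"pragma page_size = {int(requested)}")
            after = con.execute("pragma page_size").fetchone()[0]
        finally:
            con.close()
    return before, after
```

Calling page_size_after_request(5432) should return the same value twice, while a valid request such as 8192 should be reflected in the second value.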

BTW my original assumption was the cryptography was using its own pages within the SQLite pages - ie a cryptography block size that could be independent of the SQLite page size, which in turn is independent of the filesystem block size.

No, the actual page size must be a power of 2. Cipher schemes (or VFSes, to be more generic) are allowed to reserve up to 255 bytes per page for their own use (for example, for nonce, HMACs or other data). SQLite will then use (page size - reserved bytes) bytes of each page for storing database content.
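
The arithmetic can be sketched as follows. The 512 to 65536 range is taken from the SQLite documentation rather than from this thread, the function names are hypothetical, and the 32-byte reserve in the usage note is an arbitrary example rather than the figure for any particular cipher.

```python
def is_valid_page_size(n):
    """True if n is a power of two between 512 and 65536 inclusive."""
    return 512 <= n <= 65536 and (n & (n - 1)) == 0


def usable_bytes(page_size, reserved):
    """Bytes per page left for database content after the reserved area."""
    if not is_valid_page_size(page_size):
        raise ValueError(f"invalid page size: {page_size}")
    if not 0 <= reserved <= 255:
        raise ValueError(f"reserved bytes out of range: {reserved}")
    return page_size - reserved
```

For example, usable_bytes(4096, 32) gives 4064, while is_valid_page_size(5432) is False.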

@rogerbinns
Author

You've explained why the page size needs to be known before first access in many cipher configurations. But not why legacy_page_size exists. I don't see how having it spares you from a page_size pragma - if you need to set the page size in advance, why are there two different pragmas that have the same effect?

@utelle
Owner

utelle commented Jun 4, 2024

You've explained why the page size needs to be known before first access in many cipher configurations.

Only for legacy cipher schemes. Otherwise SQLite knows the page size from the database header.

But not why legacy_page_size exists.

It is required for setting the page size of a database which is encrypted with a legacy cipher scheme.

I don't see how having it spares you from a page_size pragma

Sorry, my explanation regarding pragma page_size was not correct. pragma page_size can be used to set the page size for a new database or to change the page size of an existing database. For databases encrypted with legacy cipher schemes, setting the page size of an existing database is required, and that can be done only with pragma legacy_page_size.

The parameter legacy_page_size has a default value for each legacy cipher scheme, for example 1024 for SQLCipher up to version 3. Usually you don't have to set legacy_page_size explicitly, but a project may have decided to use a different page size, say 16384. In that case pragma legacy_page_size lets you change the default legacy page size.

if you need to set the page size in advance, why are there two different pragmas that have the same effect?

pragma page_size is handled by SQLite, pragma legacy_page_size is handled by SQLite3MC. And their effect is not the same. The main purpose of pragma page_size is to change the page size of a database (or to query the page size). pragma legacy_page_size sets the page size of an existing legacy database, so that SQLite then knows the correct page size.

Pragma statements can currently only be used to change the configuration of the currently selected cipher. However, the default configuration of any supported cipher scheme can be adjusted via SQL functions. This can be useful if an application has to deal with many databases attached to one database connection.

@utelle
Owner

utelle commented Jun 4, 2024

Also, although the documentation says legacy_page_size must be a power of 2, the pragma happily accepts any value from 1 through 65536 inclusive, such as 5432.

This has been fixed in commit 56ac1e2. Now only valid page sizes are accepted. Otherwise the value is not changed.

utelle added a commit that referenced this issue Jun 6, 2024
For some configurations the rekey function did not enforce the page size and the number of reserved bytes of the database after finishing the rekeying operation. This could lead to corrupted databases.
@utelle
Owner

utelle commented Jun 6, 2024

Commit efdb694 should fix the issue.

If there are still cases for which rekeying results in corrupted databases, please reopen.

@utelle utelle closed this as completed Jun 6, 2024
@rogerbinns
Author

Running python3 -m apsw.mcall I no longer see corrupt databases. I do still see #164 where rekey pragma returns SQLITE_ERROR and does not set the error string.

@utelle
Owner

utelle commented Jun 6, 2024

Running python3 -m apsw.mcall I no longer see corrupt databases.

Good.

I do still see #164 where rekey pragma returns SQLITE_ERROR and does not set the error string.

Yes, I haven't tracked down this issue yet.

Just a few minutes ago I tested on Linux. Here is the result I see on screen:

APSW debug build: missing sys.apsw_fault_inject_control
{'cipher': 'sqlcipher', 'hmac_use': 1, 'plaintext_header_size': 62, 'legacy': 2}
SQLError: Rekeying failed. Pagesize cannot be changed for an encrypted database.
{'cipher': 'rc4', 'legacy_page_size': 1024}
SQLError: Rekeying failed. Pagesize cannot be changed for an encrypted database.
{'cipher': 'rc4'}
SQLError: Rekeying failed. Pagesize cannot be changed for an encrypted database.
{'cipher': 'ascon128'}
{'cipher': 'aes128cbc'}
{'cipher': 'rc4'}
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/ulrich/Development/GitHub/apsw-sqlite3mc/apsw/mcall.py", line 154, in <module>
    run()
  File "/home/ulrich/Development/GitHub/apsw-sqlite3mc/apsw/mcall.py", line 125, in run
    con.pragma("hexrekey", newkey)
  File "src/cursor.c", line 959, in APSWCursor_execute.sqlite3_prepare_v3
    AddTraceBackHere(__FILE__, __LINE__, "APSWCursor_execute.sqlite3_prepare_v3", "{s: O, s: O}",
apsw.SQLError: SQLError: SQL logic error

Obviously, there still is at least one bug somewhere, but at the moment I don't know why there is a crash.

The second run produced the following error:

python3: /home/ulrich/Development/GitHub/apsw-sqlite3mc/sqlite3/sqlite3.c:60597: pagerOpenWalIfPresent: Assertion `pPager->eState==PAGER_OPEN' failed.

No idea why SQLite tries to open a WAL pager.

@rogerbinns
Author

Found another corrupt database, which also causes an assertion failure in debug build

import apsw

con = apsw.Connection("testdb")
con.execute("create table x(y); insert into x values(randomblob(65536))")
con.pragma("cipher", "sqlcipher")
con.pragma("plaintext_header_size", 33)
con.pragma("fast_kdf_iter", 63)
con.pragma("hmac_algorithm", 1)
con.pragma("hexrekey", "aabbccdd")
con.execute("insert into x select * from x")

Note that plaintext_header_size is documented as being a multiple of 32 but any value is accepted. Setting it to 32 makes no difference in this failure.

@rogerbinns
Author

The second run produced the following error:

You can run it under gdb so that it automatically breaks on those assertion failures:

gdb -ex=run --args python3 -m apsw.mcall

@utelle
Owner

utelle commented Jun 6, 2024

Found another corrupt database, which also causes an assertion failure in debug build

import apsw

con = apsw.Connection("testdb")
con.execute("create table x(y); insert into x values(randomblob(65536))")
con.pragma("cipher", "sqlcipher")
con.pragma("plaintext_header_size", 33)
con.pragma("fast_kdf_iter", 63)
con.pragma("hmac_algorithm", 1)
con.pragma("hexrekey", "aabbccdd")
con.execute("insert into x select * from x")

Note that plaintext_header_size is documented as being a multiple of 32 but any value is accepted. Setting it to 32 makes no difference in this failure.

Yes, just as for legacy_page_size, it will be necessary to check that plaintext_header_size takes only valid values. SQLite3 Multiple Ciphers will happily accept any value, because the underlying AES algorithm uses ciphertext stealing if the buffer length is not a multiple of 16. However, the resulting database will no longer be compatible with the original SQLCipher implementation.

Thanks for the sample. This will help to analyze what is going wrong and where.

For the situation where the SQL logic error is thrown I have found out that the rekey operation fails with an error code (but without a message from rekey). The text SQL logic error is the default text for error code SQLITE_ERROR.

I'm quite confident that I will manage to resolve this issue within the next couple of days.

Thanks for the hint on how to use gdb to catch assertions.

@utelle
Owner

utelle commented Jun 7, 2024

I finally found the cause for the pager assertion. The HMAC size for HMAC algorithm SHA256 of SQLCipher was reported incorrectly. Thus, the HMAC was written partially outside of the page buffer bounds.

Commit 02b69ad fixes this. At least in my development environment, the test sample mcall.py no longer throws exceptions.
