-
Notifications
You must be signed in to change notification settings - Fork 212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Creating state-sync snapshot can fail with SQLITE_BUSY #8523
Labels
Comments
mhofman
added
bug
Something isn't working
cosmic-swingset
package: cosmic-swingset
swing-store
labels
Nov 10, 2023
Sources of info on how the WAL mode works and locking: |
warner
added a commit
that referenced
this issue
Dec 5, 2023
Add a new API to the exporter, `exporter.getHostKV(key)`, so that the cosmic-swingset state-sync exporter process can query `host.height` without accidentally creating a read-write transaction too. refs #8523
warner
added a commit
that referenced
this issue
Dec 5, 2023
Exporting a state-sync snapshot is a read-only operation, and is designed to run "in the background", i.e. in parallel with normal mutating operations. It accomplishes this by opening a read-only transaction right away, effectively capturing a snapshot of the SQLite database state, to insulate the export process from ongoing writes by the execution host. The cosmic-swingset exporter starts with a query of `host.height`, to confirm that the database has not already advanced to a new block before this snapshot/read-transaction can be taken. Previously, this query worked by using `openSwingStore`, and then calling `hostStorage.hostKVStore.get('host.height')`. This had two problems: * TOCTTOU: the `hostKVStore.get` used a different DB connection (and different txn) than the exporter, so it might return a different height, negating the accuracy of the consistency check * read-write txn: `openSwingStore` creates a read-*write* txn, even when merely opening the DB (because it might need to create the initial tables). This txn is closed right away, before `openSwingStore()` returns, so it did not present a threat to ongoing operations. But if the exporter was created while the ongoing execution side already had its own read-write txn open (e.g. while `controller.run()` was running), then it would fail, and `makeSwingStoreExporter` would fail with `SQLITE_BUSY` Instead, we take advantage of the new `swingStoreExporter.getHostKV()` API, and use *it* to fetch `host.height`. Unlike the normal swingstore, the swingstore-exporter refrains from creating read-write transactions entirely. So the cosmic-swingset export code can safely query the height without fear of getting the wrong value or failing because of an ongoing write transaction. We this this should fix the SQLITE_BUSY errors. refs #8523
warner
added a commit
that referenced
this issue
Dec 5, 2023
Exporting a state-sync snapshot is a read-only operation, and is designed to run "in the background", i.e. in parallel with normal mutating operations. It accomplishes this by opening a read-only transaction right away, effectively capturing a snapshot of the SQLite database state, to insulate the export process from ongoing writes by the execution host. The cosmic-swingset exporter starts with a query of `host.height`, to confirm that the database has not already advanced to a new block before this snapshot/read-transaction can be taken. Previously, this query worked by using `openSwingStore`, and then calling `hostStorage.hostKVStore.get('host.height')`. This had two problems: * TOCTTOU: the `hostKVStore.get` used a different DB connection (and different txn) than the exporter, so it might return a different height, negating the accuracy of the consistency check * read-write txn: `openSwingStore` creates a read-*write* txn, even when merely opening the DB (because it might need to create the initial tables). This txn is closed right away, before `openSwingStore()` returns, so it did not present a threat to ongoing operations. But if the exporter was created while the ongoing execution side already had its own read-write txn open (e.g. while `controller.run()` was running), then it would fail, and `makeSwingStoreExporter` would fail with `SQLITE_BUSY` Instead, we take advantage of the new `swingStoreExporter.getHostKV()` API, and use *it* to fetch `host.height`. Unlike the normal swingstore, the swingstore-exporter refrains from creating read-write transactions entirely. So the cosmic-swingset export code can safely query the height without fear of getting the wrong value or failing because of an ongoing write transaction. We think this should fix the SQLITE_BUSY errors. refs #8523
mergify bot
pushed a commit
that referenced
this issue
Dec 6, 2023
Add a new API to the exporter, `exporter.getHostKV(key)`, so that the cosmic-swingset state-sync exporter process can query `host.height` without accidentally creating a read-write transaction too. refs #8523
mhofman
pushed a commit
that referenced
this issue
Dec 6, 2023
Add a new API to the exporter, `exporter.getHostKV(key)`, so that the cosmic-swingset state-sync exporter process can query `host.height` without accidentally creating a read-write transaction too. refs #8523
mhofman
pushed a commit
that referenced
this issue
Dec 6, 2023
Exporting a state-sync snapshot is a read-only operation, and is designed to run "in the background", i.e. in parallel with normal mutating operations. It accomplishes this by opening a read-only transaction right away, effectively capturing a snapshot of the SQLite database state, to insulate the export process from ongoing writes by the execution host. The cosmic-swingset exporter starts with a query of `host.height`, to confirm that the database has not already advanced to a new block before this snapshot/read-transaction can be taken. Previously, this query worked by using `openSwingStore`, and then calling `hostStorage.hostKVStore.get('host.height')`. This had two problems: * TOCTTOU: the `hostKVStore.get` used a different DB connection (and different txn) than the exporter, so it might return a different height, negating the accuracy of the consistency check * read-write txn: `openSwingStore` creates a read-*write* txn, even when merely opening the DB (because it might need to create the initial tables). This txn is closed right away, before `openSwingStore()` returns, so it did not present a threat to ongoing operations. But if the exporter was created while the ongoing execution side already had its own read-write txn open (e.g. while `controller.run()` was running), then it would fail, and `makeSwingStoreExporter` would fail with `SQLITE_BUSY` Instead, we take advantage of the new `swingStoreExporter.getHostKV()` API, and use *it* to fetch `host.height`. Unlike the normal swingstore, the swingstore-exporter refrains from creating read-write transactions entirely. So the cosmic-swingset export code can safely query the height without fear of getting the wrong value or failing because of an ongoing write transaction. We think this should fix the SQLITE_BUSY errors. refs #8523
For posterity, the problem was the openDB to get the block height, which started an immediate tx, and executed some statements that tried to kick the tx into write mode. |
mhofman
pushed a commit
that referenced
this issue
Dec 6, 2023
Add a new API to the exporter, `exporter.getHostKV(key)`, so that the cosmic-swingset state-sync exporter process can query `host.height` without accidentally creating a read-write transaction too. refs #8523
mhofman
pushed a commit
that referenced
this issue
Dec 6, 2023
Exporting a state-sync snapshot is a read-only operation, and is designed to run "in the background", i.e. in parallel with normal mutating operations. It accomplishes this by opening a read-only transaction right away, effectively capturing a snapshot of the SQLite database state, to insulate the export process from ongoing writes by the execution host. The cosmic-swingset exporter starts with a query of `host.height`, to confirm that the database has not already advanced to a new block before this snapshot/read-transaction can be taken. Previously, this query worked by using `openSwingStore`, and then calling `hostStorage.hostKVStore.get('host.height')`. This had two problems: * TOCTTOU: the `hostKVStore.get` used a different DB connection (and different txn) than the exporter, so it might return a different height, negating the accuracy of the consistency check * read-write txn: `openSwingStore` creates a read-*write* txn, even when merely opening the DB (because it might need to create the initial tables). This txn is closed right away, before `openSwingStore()` returns, so it did not present a threat to ongoing operations. But if the exporter was created while the ongoing execution side already had its own read-write txn open (e.g. while `controller.run()` was running), then it would fail, and `makeSwingStoreExporter` would fail with `SQLITE_BUSY` Instead, we take advantage of the new `swingStoreExporter.getHostKV()` API, and use *it* to fetch `host.height`. Unlike the normal swingstore, the swingstore-exporter refrains from creating read-write transactions entirely. So the cosmic-swingset export code can safely query the height without fear of getting the wrong value or failing because of an ongoing write transaction. We think this should fix the SQLITE_BUSY errors. refs #8523
mhofman
pushed a commit
that referenced
this issue
Dec 6, 2023
Add a new API to the exporter, `exporter.getHostKV(key)`, so that the cosmic-swingset state-sync exporter process can query `host.height` without accidentally creating a read-write transaction too. refs #8523
mhofman
pushed a commit
that referenced
this issue
Dec 6, 2023
Exporting a state-sync snapshot is a read-only operation, and is designed to run "in the background", i.e. in parallel with normal mutating operations. It accomplishes this by opening a read-only transaction right away, effectively capturing a snapshot of the SQLite database state, to insulate the export process from ongoing writes by the execution host. The cosmic-swingset exporter starts with a query of `host.height`, to confirm that the database has not already advanced to a new block before this snapshot/read-transaction can be taken. Previously, this query worked by using `openSwingStore`, and then calling `hostStorage.hostKVStore.get('host.height')`. This had two problems: * TOCTTOU: the `hostKVStore.get` used a different DB connection (and different txn) than the exporter, so it might return a different height, negating the accuracy of the consistency check * read-write txn: `openSwingStore` creates a read-*write* txn, even when merely opening the DB (because it might need to create the initial tables). This txn is closed right away, before `openSwingStore()` returns, so it did not present a threat to ongoing operations. But if the exporter was created while the ongoing execution side already had its own read-write txn open (e.g. while `controller.run()` was running), then it would fail, and `makeSwingStoreExporter` would fail with `SQLITE_BUSY` Instead, we take advantage of the new `swingStoreExporter.getHostKV()` API, and use *it* to fetch `host.height`. Unlike the normal swingstore, the swingstore-exporter refrains from creating read-write transactions entirely. So the cosmic-swingset export code can safely query the height without fear of getting the wrong value or failing because of an ongoing write transaction. We think this should fix the SQLITE_BUSY errors. refs #8523
mhofman
pushed a commit
that referenced
this issue
Dec 6, 2023
Add a new API to the exporter, `exporter.getHostKV(key)`, so that the cosmic-swingset state-sync exporter process can query `host.height` without accidentally creating a read-write transaction too. refs #8523
mhofman
pushed a commit
that referenced
this issue
Dec 6, 2023
Exporting a state-sync snapshot is a read-only operation, and is designed to run "in the background", i.e. in parallel with normal mutating operations. It accomplishes this by opening a read-only transaction right away, effectively capturing a snapshot of the SQLite database state, to insulate the export process from ongoing writes by the execution host. The cosmic-swingset exporter starts with a query of `host.height`, to confirm that the database has not already advanced to a new block before this snapshot/read-transaction can be taken. Previously, this query worked by using `openSwingStore`, and then calling `hostStorage.hostKVStore.get('host.height')`. This had two problems: * TOCTTOU: the `hostKVStore.get` used a different DB connection (and different txn) than the exporter, so it might return a different height, negating the accuracy of the consistency check * read-write txn: `openSwingStore` creates a read-*write* txn, even when merely opening the DB (because it might need to create the initial tables). This txn is closed right away, before `openSwingStore()` returns, so it did not present a threat to ongoing operations. But if the exporter was created while the ongoing execution side already had its own read-write txn open (e.g. while `controller.run()` was running), then it would fail, and `makeSwingStoreExporter` would fail with `SQLITE_BUSY` Instead, we take advantage of the new `swingStoreExporter.getHostKV()` API, and use *it* to fetch `host.height`. Unlike the normal swingstore, the swingstore-exporter refrains from creating read-write transactions entirely. So the cosmic-swingset export code can safely query the height without fear of getting the wrong value or failing because of an ongoing write transaction. We think this should fix the SQLITE_BUSY errors. refs #8523
anilhelvaci
pushed a commit
to Jorge-Lopes/agoric-sdk
that referenced
this issue
Feb 16, 2024
Add a new API to the exporter, `exporter.getHostKV(key)`, so that the cosmic-swingset state-sync exporter process can query `host.height` without accidentally creating a read-write transaction too. refs Agoric#8523
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Describe the bug
During blocks with a lot of activity, or while catching-up a node, taking a state-sync snapshot can often fail with an error code
SQLITE_BUSY
. Since the export is starting a read-only transaction, and the DB is in WAL mode, there is no reason it shouldn't be able to start the transaction.To Reproduce
Enable state-sync snapshots on a node. Start from state-sync or another old cosmos snapshot.
Expected behavior
No errors starting a snapshot
Platform Environment
N/A
Additional context
SQLite likely needs to acquire a lock on the DB to indicate it's starting a read transaction, avoiding the flushing of the WAL file. It's possibly that lock acquiring that fails in some circumstances
Screenshots
The text was updated successfully, but these errors were encountered: