Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make RpcBlockchain import more efficient #740

Closed

Conversation

evanlinjin
Copy link
Member

@evanlinjin evanlinjin commented Sep 1, 2022

Fixes #715

Description

Introduces RpcImportParams that keeps track of next derivation index to start import from for next sync. This avoids re-importing scripts/descriptors into Bitcoin Core for every sync cycle. However, the user will be responsible for persisting the RpcImportParams structure; the RpcBlockchain::get_import_parameters and RpcBlockchain::replace_import_parameters functions can be used for this purpose.

Also add pagination for importing scripts/descriptors.

Also had to make clippy happy by deriving Eq for various structures.

Notes to the reviewers

Refer to: #715 (comment)

Changelog notice

  • Introduce RpcImportParams to make sync more efficient
  • Introduce RpcSyncParams::page_size to restrict request/response array size of RPC calls

Checklists

All Submissions:

  • I've signed all my commits
  • I followed the contribution guidelines
  • I ran cargo fmt and cargo clippy before committing

New Features:

  • I've added tests for the new feature
  • I've added docs for the new feature

Bugfixes:

* [ ] This pull request breaks the existing API
* [ ] I've added tests to reproduce the issue which are now passing (This is a performance bug fix)

  • I'm linking the issue being fixed by this PR

`RpcImportParams` keeps track of the scriptPubKey derivation index to
start from for the next call to `importdescripts`/`importmulti`, thus
avoiding re-importing into Bitcoin Core.
@evanlinjin evanlinjin force-pushed the improve_rpc_bc_import branch from 4dce1e6 to 5319816 Compare September 1, 2022 07:57
@evanlinjin evanlinjin marked this pull request as ready for review September 1, 2022 09:47
@evanlinjin evanlinjin mentioned this pull request Sep 1, 2022
4 tasks
@notmandatory
Copy link
Member

The idea of paging makes good sense, but is there a reason to not save the next derivation index in the existing wallet DB as we do with other sync related data?

@evanlinjin
Copy link
Member Author

evanlinjin commented Sep 7, 2022

The idea of paging makes good sense, but is there a reason to not save the next derivation index in the existing wallet DB as we do with other sync related data?

Thanks for the comment @notmandatory .

The "next derivation index" in this context means the index after the last imported into Bitcoin Core. Ideally, we should be able to obtain this information easily from Bitcoin Core itself, but the RPC interface does not allow it.

This is different to how "next derivation index" should be interpreted in Database, which is "the next index in which we get a fresh address".

The ideal would be to have the two "next derivation index" concepts in sync. This would be possible if "user-get-new-address" and "import-new-address-into-bitcoin-core" could be performed atomically. Another solution would be to mark "imported-into-bitcoin-core" addresses in the Database. Both these solutions are non-viable (due to the restrictions of the "model" of BDK).

So in essence, I did this hack because: restrictive RPC interface in Core + restrictive "bounds" of the current BDK "model".

src/blockchain/rpc.rs Outdated Show resolved Hide resolved
src/blockchain/rpc.rs Show resolved Hide resolved
src/blockchain/rpc.rs Show resolved Hide resolved
@@ -66,6 +68,8 @@ pub struct RpcBlockchain {
capabilities: HashSet<Capability>,
/// Sync parameters.
sync_params: RpcSyncParams,
/// Import parameters.
import_params: RefCell<RpcImportParams>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be safe I would replace this with an Arc<RpcImportParams>. Then you can use Arc::make_mut() whenever you want to replace it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work as the WalletSync is NOT &mut self. We can do Arc<Mutex<RpcImportParams>>? But I think this is too much. I feel like the user can just handle thread-safety elsewhere?

@afilini
Copy link
Member

afilini commented Sep 12, 2022

I think #372 would fix the same issue this PR tries to tackle, because it would allow us to save in the database which addresses have already been imported. It's been blocked all this time because we haven't figured out a good way to do database migrations, but maybe it's time we get back to it.

Also, a suggestion: maybe you don't remember, but the old rpc implementation used the label of a fixed address as a "storage section" to save essentially the RpcImportParams. What do you think about moving back to that model? Keeping your current architecture, but instead of asking the user to persist that structure, we just do it automatically with that hack. It would also have the added benefit of automatically resetting if the user switches to a clean node, otherwise that requires manual intervention (manually marking the node as "clean" so that bdk reimports everything)

@notmandatory
Copy link
Member

I'm putting this on the list to discuss tomorrow, but since it looks like it requires more discussion will probably have to remove it from the 0.23.0 feature freeze.

* Add `RpcSyncParams::page_size` that restricts req/resp array count
  for various RPC calls.
* Add `pagenated_import` function.
@evanlinjin evanlinjin force-pushed the improve_rpc_bc_import branch from ff398ec to 83edbf0 Compare September 27, 2022 12:13
@evanlinjin
Copy link
Member Author

Also, a suggestion: maybe you don't remember, but the old rpc implementation used the label of a fixed address as a "storage section" to save essentially the RpcImportParams. What do you think about moving back to that model?

@afilini I am currently unconvinced that the previous model is a good idea. It will cause problems if a different Wallet with the same descriptors attempts to sync with the same Bitcoin Core (scriptPubKeys will be skipped). Also it is not elegant.

Even with what I have in the PR, and if the user does not persist RpcSyncParams, only the first sync of the RpcBlockchain lifetime will be inefficient. I think this in itself is decent enough.

I think #372 would fix the same issue this PR tries to tackle, because it would allow us to save in the database which addresses have already been imported. It's been blocked all this time because we haven't figured out a good way to do database migrations, but maybe it's time we get back to it.

It does sound like this would fix the issue, but I think #372 would be more elegant with "birthdays" of scriptPubKeys. So database will store birthdays of scriptPubKeys and "last-sync" for each different "blockchain". This way we can compare which scriptPubKeys we need to import, without having to do a map of which scriptPubKey has been synced to which "blockchains".

@notmandatory notmandatory removed this from the Release 0.23.0 Feature Freeze milestone Sep 27, 2022
@evanlinjin
Copy link
Member Author

BDK Chain will replace the need for this

@evanlinjin evanlinjin closed this Jan 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Large importdescriptors call slow, possible to reduce number of watched addresses?
4 participants