Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Impact of lookahead variable on users of bdk_wallet #1540

Open
thunderbiscuit opened this issue Aug 7, 2024 · 3 comments
Open

Impact of lookahead variable on users of bdk_wallet #1540

thunderbiscuit opened this issue Aug 7, 2024 · 3 comments
Labels
documentation Improvements or additions to documentation

Comments

@thunderbiscuit
Copy link
Member

Hello folks. After reviewing the new builder pattern for the wallet type and speaking to people in Nashville, I've come to the conclusion that the idea behind the lookahead and its impact on users of bdk_wallet is not well understood. Moreover, setting the lookahead value is part of the public API of the Wallet (through the builder) but is not (I think?) part of persistence, so understanding it well is important for users.

This issue is meant to clear that up for everyone, and arrive at either:

  • Bubbling up to the bdk_wallet API docs what setting this value entails and how it will impact the behaviour of the wallet, including tradeoffs and dangers (or not) of setting different lookahead values between sessions.
  • Remove the setter from the Wallet public API entirely if it's better left as an internal construct.

CC'ing the two people I have not yet asked about this in person or over the dev call yesterday: @evanlinjin @LLFourn.


Related API Docs

Just so they're easy to find for everyone, here are the API docs on this:


Previous Discussions

See my Q2 on #1533:

I see that the lookahead is a setter, but even after reading the docs I'm not sure what it's setting exactly because it just sounds like a sort of stop gap but not a stop gap. See the docs here on lookahead and the more explicit lookahead in KeychainTxOutIndex.

When an user first recovers a wallet (i.e. from a recovery phrase and/or descriptor), we will NOT have knowledge of which script pubkeys are revealed. So when we index a transaction or txout (using index_tx/index_txout) we scan the txouts against script pubkeys derived above the last revealed index. These additionally-derived script pubkeys are called the lookahead.

Don't we scan up to the stop gap? I know this from experience when looking at the Esplora logs on full scan and syncs (for example a stop gap of 3 with a lookahead of 100 will query the Esplora instance for only 4 addresses), so not sure if maybe it refers to something else?

The KeychainTxOutIndex is constructed with the lookahead and cannot be altered.

Wait so is it persisted then? Or by that do we mean it can't be altered per session but altered on every load of the wallet?

I know (think?) the lookahead is not persisted, but is there a danger in having two different lookahead between sessions? What if I set a lookahead of 3 when I create a wallet, and 300 the next time I load the wallet? Or the opposite?

And my follow up here:

Does lookahead impact the sync/full_scan workflows? I have not been able to find this to be the case locally, so I'm still unclear about what this lookahead is doing, and because it's part of the public API of the Wallet type I want to make sure I really get it. Here's what I have tried on regtest, and looking at the logs for my esplora instance:

  • Stop gap of 3, lookahead of 100. The wallet syncs 6 addresses total (3 per keychain). The lookahead does not impact the spks the wallet attempts to sync.
  • Stop gap of 25, lookahead of 7. The wallet syncs 50 addresses, (lookahead didn't seem to be a problem, again not impacting the syncing process)
  • Send coins to address 19, set the lookahead to 7 and the stop gap to 25. The coins are seen and wallet shows correct balance. Lookahead below the stop gap did not cause any problems.

More notes on this: the docs here say:

Without a lookahead the index will miss outputs you own when processing transactions whose output script pubkeys lie beyond the last revealed index. In certain situations, such as when performing an initial scan of the blockchain during wallet import, it may be uncertain or unknown what the index of the last revealed script pubkey actually is.

Again this makes it sound close to the concept of the stop gap, but I assume it's different somehow or a more generalized version of it that can be used in the deeper logic of the KeychainIndex? That's as far as the API docs go.

And the response from @ValuedMammal:

The lookahead doesn't appear to have a role during full scan for esplora because we get the SPKs to sync by calling all_unbounded_spk_iters

For sync it's also not relevant because we only sync SPKs that are already revealed.

In short, the lookahead defines a number of scripts to derive ahead of time without revealing them. If you do a full scan with the default lookahead of 25 and persist the wallet, then on subsequent loads those SPKs are still stored in the index and there shouldn't be any harm in setting a new lookahead.

Edit: Nope that's also wrong. we don't persist derived SPKs, only the value of last_revealed. So yeah we probably need more testing.

One way to illustrate the effect of the lookahead is to create a wallet with the default value of 25. Then peek at address index 25. The method Wallet::is_mine will return false because internally the address is not yet derived (remembering that SPKs are 0-indexed).

For context one of the reasons we chose to make the lookahead configurable is because for a short time we had it set to 1000 and users reported unbearably long load times. Beyond that, I don't think it's something users should generally care about other than to know it exists

Further questions

Following up on the comment from VM,

  • Is the lookahead important for people that would add custom transactions to track in their local TxGraph, which contain addresses that have not yet been revealed, and for methods like is_mine() to work as intended?
  • Quote: we chose to make the lookahead configurable is because for a short time we had it set to 1000 and users reported unbearably long load times. Here do we mean creating the wallet itself, i.e. the operation of deriving the 1000 addresses in the lookahead?
@thunderbiscuit thunderbiscuit added the documentation Improvements or additions to documentation label Aug 7, 2024
@ValuedMammal
Copy link
Contributor

Thanks for collecting all this information in one place. My current thinking is to keep the lookahead option on LoadParams and expand and clarify the documentation. But at the same time since it wasn't an option that was available until now, it's questionable whether it's truly needed for bdk_wallet or if sticking with a sane default is sufficient. Since the lookahead only impacts the in-memory representation of KeychainTxOutIndex I don't think there are grave consequences of changing it across sessions, but that's only a hunch. Curious to hear other opinions

@ValuedMammal
Copy link
Contributor

I suppose the other option is to do away with the lookahead method in bdk_wallet and assume wallet users will be satisfied with a sane default like 25

@LLFourn
Copy link
Contributor

LLFourn commented Sep 10, 2024

The lookahead is there for situations where you are not using SPK based syncing (electrum and esplora). There you use stop_gap when you are first recovering a wallet and then this value doesn't matter after that since it's presumed that you are the only system revealing script pubkeys. If that's not the case then you just keep using stop_gap full scanning.

How are we meant to find script pubkeys when you do apply_block_relevant. That's what lookahead is for.

I think in the future lookahead and stop_gap should be fixed together and that there should be some mechanism to stop you accidentally revealing more spks than lookahead from the wallet unless you really want to.

Also I see there is a method called peek_address on wallet. I think it should be deleted. If you have some strange reason to see addresses without revealing them you should just use descriptor.

ValuedMammal added a commit that referenced this issue Sep 16, 2024
028f687 doc(wallet): Add docs to explain the lookahead (valued mammal)

Pull request description:

  Adds clarifying language to `CreateParams` and `LoadParams` regarding the `lookahead` parameter. Commit 028f687 also includes some minor documentation fixes.

  If anyone is aware of any more documentation flaws that need attention I'm happy to add them here.

  cc #1540

  ### Checklists

  #### All Submissions:

  * [x] I've signed all my commits
  * [x] I followed the [contribution guidelines](https://github.com/bitcoindevkit/bdk/blob/master/CONTRIBUTING.md)
  * [x] I ran `cargo fmt` and `cargo clippy` before committing

ACKs for top commit:
  LagginTimes:
    ACK 028f687
  oleonardolima:
    ACK 028f687
  evanlinjin:
    ACK 028f687

Tree-SHA512: 971d09652948ed2e2dc86d255cfd18607d96b0806aa0e990190cd1d7035c6660ea2ac1092ef2c6c209e61920c0d4ff9d8c0a900bcc74a8662546d284fec3218f
@notmandatory notmandatory added this to the 1.0.0-beta milestone Sep 17, 2024
@notmandatory notmandatory removed this from the 1.0.0-beta milestone Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
Status: Discussion
Development

No branches or pull requests

4 participants