-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unified OUTPOINT_TO_UTXO_ENTRY table #3915
Conversation
As far as I know, no one has ever used this, and it would take extra work to support it with the new OUTPOINT_TO_UTXO_ENTRY table (which greatly improves performance). Just remove the option for now.
This code is unused, now that we have the newer (and much nicer) pointer-handling code on lines 190-194.
is_output_spent() was assuming that OUTPOINT_TO_TXOUT contains all utxos when --index-addresses is specified, but that's not correct; only --index-sats guarantees that. This was causing is_output_spent() to return incorrect values for outputs that were spent before block 767430, when using --index-addresses but not --index-sats.
Replace the OUTPOINT_TO_TXOUT, OUTPOINT_TO_SAT_RANGES, and SATPOINT_TO_SEQUENCE_NUMBER tables with a new OUTPOINT_TO_UTXO_ENTRY table. This saves space by avoiding redundant key storage, and makes updates quicker since we can write all of an outpoint's data at once. It also simplifies caching. We used to have utxo_cache for OUTPOINT_TO_TXOUT and range_cache for OUTPOINT_TO_SAT_RANGES, while SATPOINT_TO_SEQUENCE_NUMBER was uncached; now, there's a single unified utxo_cache which covers all three. Additionally, since SCRIPT_PUBKEY_TO_OUTPOINT and SEQUENCE_NUMBER_TO_SATPOINT are storing data which is already included in the cache (just with the keys and values swapped), we get to cache those tables too for free. The combined effect is that an index with --index-sats and --index-addresses is now about 54 GB smaller and twice as fast to build.
With the unified OUTPOINT_TO_UTXO_ENTRY table, these stats aren't useful; they're either identical to the stats in the regular "Committing at block height ..." message (if you're using --index-sats), or zero (if you're not).
I think no reason not to, since the schema version is changing anyways.
I think millis is probably too much precision unless doing performance work.
We have an `Entry` trait which uses "value" for the low-level value in redb, and `entry` for the high-level value parsed from the low-level value. Use that for UtxoEntry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made small cosmetic changes to this until I convinced myself that it's fine.
Nice! These changes look good to me. The new usage of "entry"/"value" doesn't quite seem consistent, though. A UtxoEntry isn't an entry in the sense of the Entry trait; it's an entry in the sense of InscriptionEntry or RuneEntry, whose low-level types are (quite logically) called InscriptionEntryValue and RuneEntryValue. So if we want to be consistent, I think the types should be UtxoEntry, UtxoEntryValue, and UtxoEntryValueBuf. Or for shorter names, we could call this a Utxo rather than a UtxoEntry, and then the names would be Utxo, UtxoValue, and UtxoValueBuf. I think that might help with variable naming, too -- all the "utxo_entry" variables I had which are now "value" would instead become "utxo", which is a lot more informative. |
Yah, good point. |
Yeah, I agree that Utxo and UtxoValue are ambiguous. Personally I feel like UtxoEntryValue is already kind of heinous, though. I think the "Value" convention works for the existing types mostly because the Value types are so rarely used: the runes code is entirely based on RuneEntry, and you never see RuneEntryValue except in the table definition. So even though RuneEntryValue is a relatively long and uninformative name, it successfully communicates "this is a variant version of RuneEntry", and that's good enough for its very minor usage. But UtxoEntry isn't like that at all. The "high-level" UtxoEntry type isn't even usable by itself -- it just stores references into the low-level value! The low-level type is the one you actually use everywhere. And as we optimize ord, other types will probably start to follow this pattern too, because it's more efficient; converting all the data into a new format on every load and store is pretty expensive in terms of memory allocations and copies. In particular, keeping sat ranges in the low-level format is high on my list of future optimizations. For type names, my general feeling is it's much more important to communicate what data the type contains than it is to describe exactly what format that data is stored in. This is why the "Value" convention works at all: the word "Value" doesn't communicate much, but that's fine, because a RuneEntry and a RuneEntryValue store the same data and we don't really care about the details of the representation. So I think my preference is still to use UtxoEntry and UtxoEntryBuf for the low-level types. In particular, this successfully communicates that these are the types you're supposed to use: when you load a rune entry from the database, you want to store it as a RuneEntry, and when you load a UTXO entry, you want to store it as a UtxoEntry (or UtxoEntryBuf). The other versions of each type are just transient things you convert through but don't keep around. I'd love to find a better name than ParsedUtxoEntry for the high-level type, but also I feel like that name isn't particularly important for exactly the same reason that "RuneEntryValue" isn't that important: it doesn't get used much. |
Oh yeah, another option that might be worth considering: we could get rid of ParsedUtxoEntry entirely, and just have accessors on the low-level UtxoEntry to retrieve the individual fields. I suspect this is a wash in terms of performance -- it's slightly faster in some cases (because we don't have to parse the whole entry to retrieve a single field) and slightly slower in others (because we have to reparse the beginning to access the middle), but I'm guessing the difference is insignificant in both cases. This was actually my original plan during development. It makes field access a little nicer for the user, since they don't have to call .parse(), and it avoids the need for separate high-level and low-level types entirely. I switched from this to ParsedUtxoEntry in order to make the parsing code a bit easier to read (because parsing the whole entry at once is simpler than parsing it incrementally), but I think I might actually be able to make an incremental version that's as easy to read as the current code. Alternatively, if it turns out that the parser really needs to be stateful to be efficient, we could have UtxoEntry, UtxoEntryBuf, and UtxoEntryParser. This avoids the redundant parsing work, and from the user's point of view it would work just like the current code: you call .parse() to get a UtxoEntryParser, and then you can call accessors to get the individual fields. But the naming makes it very clear that the underlying UtxoEntry is the real type, and the parser is just a view into it. |
Yah, I think you're right, I find all that convincing. I pushed a commit going back tot he old names. |
* Change test-bitcoincore-rpc to mockcore in README.md (ordinals#3842) * Commit twice to work around redb off-by-one bug (ordinals#3856) * Release 0.19.1 (ordinals#3864) - Bump version: 0.19.0 → 0.19.1 - Update changelog - Update changelog contributor credits - Update dependencies * Update Portuguese Translation pt.po (ordinals#3837) * Updated Chinese translation (ordinals#3881) * Fix rune links for runes with no symbol (ordinals#3849) * Suppress printing sat_ranges by default (ordinals#3867) * Re-enter beta (ordinals#3884) * Update pointer specification (ordinals#3861) * Clarify that unused runes tags should not be used (ordinals#3885) * Migrate object.rs to snafu error handling (ordinals#3858) * Make index settings harder to misuse (ordinals#3893) * Don't unnecessarily insert into utxo cache when indexing addresses (ordinals#3894) * Remove trailing space from runes specification (ordinals#3896) * Serve responses with cross origin isolation headers (ordinals#3898) * List all Bitcoin Core wallets (ordinals#3902) * Make first first and last sat in range clickable (ordinals#3903) * Migrate Outgoing to SnafuError (ordinals#3854) * Update Bitcoin Core deploy to 27.1 (ordinals#3912) * Add sat_balance to address API (ordinals#3905) Co-authored-by: raph <raphjaph@protonmail.com> * Add Dutch translation to Ordinals Handbook (ordinals#3907) * Migrate chain.rs to snafu error (ordinals#3904) * Bump version to 0.20.0-dev (ordinals#3916) * Revert "Serve responses with cross origin isolation headers" (ordinals#3920) * Remove inscription content type counts from /status page (ordinals#3922) * Unified OUTPOINT_TO_UTXO_ENTRY table (ordinals#3915) - Upgrade `redb` to 2.1.1 - Remove `--index-spent-sats` - Remove redundant pointer handling in `index_inscriptions()` - Fix incorrect `is_output_spent()` results when not using `--index-sats` - Unify UTXO index data in `OUTPOINT_TO_UTXO_ENTRY` table - Read addresses from index when exporting * Add address field to `/r/inscription/:id` (ordinals#3891) * Add inscriptions and runes details to address API endpoint (ordinals#3924) * Release 0.20.0 (ordinals#3928) - Bump ord version: 0.19.1 → 0.20.0 - Bump ordinals version: 0.0.9 → 0.0.10 - Update changelog - Update changelog contributor credits - Update dependencies --------- Co-authored-by: Anchor <49644098+TheHeBoy@users.noreply.github.com> Co-authored-by: Casey Rodarmor <casey@rodarmor.com> Co-authored-by: 0xArtur <metaverseartur@gmail.com> Co-authored-by: Dr.JingLee <95461562+DrJingLee@users.noreply.github.com> Co-authored-by: nine <118634361+cryptoni9n@users.noreply.github.com> Co-authored-by: Bohdan Cryptolions <37701692+ansigroup@users.noreply.github.com> Co-authored-by: raph <raphjaph@protonmail.com> Co-authored-by: Patrick Collins <patrick@collinatorstudios.com> Co-authored-by: Tibebtc <tibedekock@live.nl> Co-authored-by: partialord <178683722+partialord@users.noreply.github.com> Co-authored-by: Eloc <42568538+elocremarc@users.noreply.github.com> Co-authored-by: twosatsmaxi <vashalpesh@gmail.com>
* Change test-bitcoincore-rpc to mockcore in README.md (ordinals#3842) * Commit twice to work around redb off-by-one bug (ordinals#3856) * Release 0.19.1 (ordinals#3864) - Bump version: 0.19.0 → 0.19.1 - Update changelog - Update changelog contributor credits - Update dependencies * Update Portuguese Translation pt.po (ordinals#3837) * Updated Chinese translation (ordinals#3881) * Fix rune links for runes with no symbol (ordinals#3849) * Suppress printing sat_ranges by default (ordinals#3867) * Re-enter beta (ordinals#3884) * Update pointer specification (ordinals#3861) * Clarify that unused runes tags should not be used (ordinals#3885) * Migrate object.rs to snafu error handling (ordinals#3858) * Make index settings harder to misuse (ordinals#3893) * Don't unnecessarily insert into utxo cache when indexing addresses (ordinals#3894) * Remove trailing space from runes specification (ordinals#3896) * Serve responses with cross origin isolation headers (ordinals#3898) * List all Bitcoin Core wallets (ordinals#3902) * Make first first and last sat in range clickable (ordinals#3903) * Migrate Outgoing to SnafuError (ordinals#3854) * Update Bitcoin Core deploy to 27.1 (ordinals#3912) * Add sat_balance to address API (ordinals#3905) Co-authored-by: raph <raphjaph@protonmail.com> * Add Dutch translation to Ordinals Handbook (ordinals#3907) * Migrate chain.rs to snafu error (ordinals#3904) * Bump version to 0.20.0-dev (ordinals#3916) * Revert "Serve responses with cross origin isolation headers" (ordinals#3920) * Remove inscription content type counts from /status page (ordinals#3922) * Unified OUTPOINT_TO_UTXO_ENTRY table (ordinals#3915) - Upgrade `redb` to 2.1.1 - Remove `--index-spent-sats` - Remove redundant pointer handling in `index_inscriptions()` - Fix incorrect `is_output_spent()` results when not using `--index-sats` - Unify UTXO index data in `OUTPOINT_TO_UTXO_ENTRY` table - Read addresses from index when exporting * Add address field to `/r/inscription/:id` (ordinals#3891) * Add inscriptions and runes details to address API endpoint (ordinals#3924) * Release 0.20.0 (ordinals#3928) - Bump ord version: 0.19.1 → 0.20.0 - Bump ordinals version: 0.0.9 → 0.0.10 - Update changelog - Update changelog contributor credits - Update dependencies * Bump version to 0.20.0-dev (ordinals#3929) * Add test to remind us to fix the UtxoEntry redb type name (ordinals#3934) * Put AddressInfo into api module (ordinals#3933) * Fix clippy lint (ordinals#3937) * Add inscription index to /status (ordinals#3938) * Remove unnecessary symbols in docs/src/guides/testing.md (ordinals#3945) * Add inscription examples to handbook (ordinals#3769) * Allow scrolling in iframe (ordinals#3947) * Skip serializing None in batch::File (ordinals#3943) * Fix /output page (ordinals#3948) * Add `/satpoint/<SATPOINT>` endpoint (ordinals#3949) * Don't log RPC connections to bitcoind (ordinals#3952) * Start indexing at correct block height (ordinals#3956) * Fix output API struct (ordinals#3957) * Remove dependency on `ord-bitcoincore-rpc` crate (ordinals#3959) * Keep sat ranges in low-level format (ordinals#3963) * Implement burn for wallet command (ordinals#3437) * Add multi parent support to wallet (ordinals#3228) * Get parents using `as_slice` instead of converting to `Vec` (ordinals#3972) * Rename parents_values -> parent_values (ordinals#3973) * Fix non-existant output lookup (ordinals#3968) * Release 0.20.1 (ordinals#3975) * Refactor burn command (ordinals#3976) * Remove regtest.ordinals.net just recipes (ordinals#3978) * Add `ord verify` (ordinals#3906) * Release 0.21.0 (ordinals#3997) - Bump version: 0.20.1 → 0.21.0 - Update changelog - Update changelog contributor credits - Update dependencies * Remove /runes/balances API endpoint (ordinals#3980) * Update rust-bitcoin in ord (ordinals#3962) * Revert redb to 2.1.3 (ordinals#4003) * Release 0.21.1 (ordinals#4006) * Update Bitcoin Core install script (ordinals#4007) * Remove pre-alpha warning from ord help (ordinals#4011) * Change mint progress to `mints / terms.cap` (ordinals#4012) * Only show rune mint progress during mint (ordinals#4013) * Show if JSON API is enabled on /status (ordinals#4014) * Fix build error --------- Co-authored-by: Anchor <49644098+TheHeBoy@users.noreply.github.com> Co-authored-by: Casey Rodarmor <casey@rodarmor.com> Co-authored-by: 0xArtur <metaverseartur@gmail.com> Co-authored-by: Dr.JingLee <95461562+DrJingLee@users.noreply.github.com> Co-authored-by: nine <118634361+cryptoni9n@users.noreply.github.com> Co-authored-by: Bohdan Cryptolions <37701692+ansigroup@users.noreply.github.com> Co-authored-by: raph <raphjaph@protonmail.com> Co-authored-by: Patrick Collins <patrick@collinatorstudios.com> Co-authored-by: Tibebtc <tibedekock@live.nl> Co-authored-by: partialord <178683722+partialord@users.noreply.github.com> Co-authored-by: Eloc <42568538+elocremarc@users.noreply.github.com> Co-authored-by: twosatsmaxi <vashalpesh@gmail.com> Co-authored-by: tiaoxizhan <tiaoxizhan@outlook.com> Co-authored-by: onchainguy <1436535+onchainguy-btc@users.noreply.github.com> Co-authored-by: lifofifo <134870335+lifofifoX@users.noreply.github.com> Co-authored-by: dcorral <hi@dcorral.com>
Replace the OUTPOINT_TO_TXOUT, OUTPOINT_TO_SAT_RANGES, and SATPOINT_TO_SEQUENCE_NUMBER tables with a new OUTPOINT_TO_UTXO_ENTRY table. This saves space by avoiding redundant key storage, and makes updates quicker since we can write all of an outpoint's data at once.
It also simplifies caching. We used to have utxo_cache for OUTPOINT_TO_TXOUT and range_cache for OUTPOINT_TO_SAT_RANGES, while SATPOINT_TO_SEQUENCE_NUMBER was uncached; now, there's a single unified utxo_cache which covers all three. Additionally, since SCRIPT_PUBKEY_TO_OUTPOINT and SEQUENCE_NUMBER_TO_SATPOINT are storing data which is already included in the cache (just with the keys and values swapped), we get to cache those tables too for free.
The combined effect is that an index with --index-sats and --index-addresses is now about 54 GB smaller and twice as fast to build.