Minor tweaks to the database schema #202
Merged
This PR makes slight tweaks to the database schema:

- Changes the `chain_id` column to an `INTEGER` instead of a `NUMERIC`.

  `NUMERIC` is arbitrary precision, which we don't need in this case, as chain IDs are small integers. This would decrease table size and improve performance, see https://dba.stackexchange.com/a/110882. I would argue that we can also switch to `BIGINT` for the `block_number` column, which is also a `NUMERIC`, though we use that column in queries a lot less.

- Changes the `sales_collection_address_idx`
index to also include the `token_id` besides the lowercased collection address.

  We never do queries based on `token_id` alone. We typically filter on both the lowercased collection address and the token ID. With the current setup we have two indexes: one on the lowercased `collection_address` and one on `token_id`. On query execution, the database engine does one index scan on `collection_address`, one on `token_id`, and then a BitmapAnd to see which rows match both conditions.

  If we instead have an index on `LOWER(collection_address), token_id`, then a single index is used in the query. So the difference is two index scans plus additional operations vs. a single index scan. On a really trivial test I did locally on a table with 7.5M rows, the `SELECT` in the first case takes 0.771 ms of planning and 101.433 ms of execution; the second takes 0.391 ms of planning and 0.138 ms of execution.

  On the other hand, all queries that filter on the collection address alone will still benefit just as much from the index (since collection address is the leftmost column), so we don't lose anything in performance there.
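
For reviewers who want to try this locally, here is a rough PostgreSQL sketch of the two changes described above. It is not the actual migration from this PR. The table name `sales` is an assumption inferred from the index name, and the address and token ID literals are placeholders.

```sql
-- Assumption: the table is called "sales" (inferred from the index name).

-- 1. Narrow chain_id from NUMERIC to INTEGER.
--    This rewrites the table and holds an ACCESS EXCLUSIVE lock while it runs.
ALTER TABLE sales
    ALTER COLUMN chain_id TYPE INTEGER
    USING chain_id::INTEGER;

-- 2. Recreate the index so it covers both columns, with the
--    lowercased collection address as the leftmost column.
DROP INDEX IF EXISTS sales_collection_address_idx;
CREATE INDEX sales_collection_address_idx
    ON sales (LOWER(collection_address), token_id);

-- Query filtering on both columns: the plan should show a single
-- scan on sales_collection_address_idx rather than a BitmapAnd
-- of two separate index scans.
EXPLAIN ANALYZE
SELECT *
FROM sales
WHERE LOWER(collection_address) = '0x0000000000000000000000000000000000000000'
  AND token_id = '1';  -- placeholder value; quoted so it coerces to the column type

-- Query filtering on the collection address alone: it can still use
-- the same index, because that expression is the leftmost column.
EXPLAIN ANALYZE
SELECT *
FROM sales
WHERE LOWER(collection_address) = '0x0000000000000000000000000000000000000000';
```

Running the first `EXPLAIN ANALYZE` before and after recreating the index is one way to compare the planning and execution times quoted above against your own data.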