services/horizon/expingest: Create ledger entry changes cache in front of DB #2003

Closed
bartekn opened this issue Dec 4, 2019 · 1 comment
Labels
horizon ingest New ingestion system

Comments

bartekn commented Dec 4, 2019

Currently, updating ledger entries in a DB can be slow because the DB is updated after processing each LedgerEntryChange. To solve this, we can create an in-memory cache of ledger entry changes that squashes all changes to a given ledger entry into a single change. A simple example is sending multiple payments between two accounts. Instead of updating the DB after each payment, we can squash the changes and update the accounts once with their final balances.

The cache struct should have data integrity checks built in. For example, an attempt to create an account that already exists in the cache should result in an error.

@bartekn bartekn added horizon ingest New ingestion system labels Dec 4, 2019
@bartekn bartekn added this to the Horizon 0.25.0 milestone Dec 4, 2019
@bartekn bartekn mentioned this issue Dec 4, 2019
7 tasks
bartekn added a commit that referenced this issue Dec 12, 2019
…x meta (#2004)

This commit adds `exp/ingest/io.LedgerEntryChangeCache`, which squashes
all the ledger entry changes. It can later be used to decrease the
number of DB queries when applying them. See #2003.

Some ledgers that add a lot of changes connected to a small set of
entries cause performance issues because every ledger entry change is
applied to a DB. `LedgerEntryChangeCache` solves this problem because
it holds only the final version of a ledger entry after all the
changes.

Before this fix, an extreme case in which two accounts send payments to
each other 1000 times in a ledger required 3000 DB updates (2000 account
changes due to payments and 500 fee meta changes per account). After the
fix, it requires just 2 DB updates.
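
The effect of squashing can be illustrated with a minimal Go sketch. It is not the Horizon code: `balanceUpdate`, `paymentUpdates`, and `squash` are hypothetical names, the starting balances are made up, and only the payment-related account changes are modelled (fee meta changes are left out).

```go
package main

import "fmt"

// balanceUpdate is a hypothetical, simplified stand-in for an account
// ledger entry change: it only records the account and its new balance.
type balanceUpdate struct {
	Account string
	Balance int64
}

// paymentUpdates simulates n payments of 1 unit from account "A" to
// account "B". Each payment produces two changes, one per account.
func paymentUpdates(n int, balA, balB int64) []balanceUpdate {
	updates := make([]balanceUpdate, 0, 2*n)
	for i := 0; i < n; i++ {
		balA--
		balB++
		updates = append(updates,
			balanceUpdate{Account: "A", Balance: balA},
			balanceUpdate{Account: "B", Balance: balB},
		)
	}
	return updates
}

// squash keeps only the last balance seen for each account, so the
// number of DB writes equals the number of distinct accounts touched.
func squash(updates []balanceUpdate) map[string]int64 {
	final := make(map[string]int64)
	for _, u := range updates {
		final[u.Account] = u.Balance
	}
	return final
}

func main() {
	updates := paymentUpdates(1000, 5000, 5000)
	final := squash(updates)
	fmt.Println("DB writes without squashing:", len(updates))
	fmt.Println("DB writes with squashing:", len(final))
}
```

In this sketch the 1000 payments yield 2000 individual changes but only two final values, one per account.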

The algorithm used in `LedgerEntryChangeCache` is explained below:

1. If the change is CREATED, it checks whether any change connected to
   the given entry is already in the cache. If not, it adds the CREATED
   change. Otherwise, if the existing change is:
   a. CREATED, it returns an error because we can't add an entry that
      already exists.
   b. UPDATED, it returns an error because we can't add an entry that
      already exists.
   c. REMOVED, it means that due to previous transitions we want to
      remove the entry from a DB, which means that it already exists in
      a DB, so we need to change the type of the change to UPDATED.
2. If the change is UPDATED, it checks whether any change connected to
   the given entry is already in the cache. If not, it adds the UPDATED
   change. Otherwise, if the existing change is:
   a. CREATED, it means that due to previous transitions we want to
      create the entry in a DB, which means that it doesn't exist in a
      DB, so we need to update the entry but keep the CREATED type.
   b. UPDATED, we simply update it with the new value.
   c. REMOVED, it means that at this point in the ledger the entry is
      removed, so updating it returns an error.
3. If the change is REMOVED, it checks whether any change connected to
   the given entry is already in the cache. If not, it adds the REMOVED
   change. Otherwise, if the existing change is:
   a. CREATED, it means that due to previous transitions we want to
      create the entry in a DB, which means that it doesn't exist in a
      DB. If it was created and removed in the same ledger it's a noop,
      so we remove the entry from the cache.
   b. UPDATED, we simply change it to a REMOVED change because the
      UPDATED change means the entry exists in a DB.
   c. REMOVED, it returns an error because we can't remove an entry
      that was already removed.
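
The transitions above can be sketched in Go as a small state machine. This is an illustration under simplifying assumptions, not the actual `LedgerEntryChangeCache` code: `ChangeType`, `Change`, `Cache`, `NewCache`, and `Add` are hypothetical names, entries are keyed by a plain string, and the entry payload is reduced to a single integer.

```go
package main

import (
	"errors"
	"fmt"
)

// ChangeType mirrors the three change kinds named in the description.
type ChangeType int

const (
	Created ChangeType = iota
	Updated
	Removed
)

// Change is a hypothetical, simplified ledger entry change.
type Change struct {
	Type  ChangeType
	Key   string // identifies the ledger entry
	Value int64  // stand-in for the entry contents
}

// Cache holds at most one squashed change per ledger entry.
type Cache struct {
	changes map[string]Change
}

func NewCache() *Cache {
	return &Cache{changes: make(map[string]Change)}
}

// Add squashes ch into the cache following the rules listed above.
func (c *Cache) Add(ch Change) error {
	prev, ok := c.changes[ch.Key]
	if !ok {
		// No earlier change for this entry: store it as is.
		c.changes[ch.Key] = ch
		return nil
	}
	switch ch.Type {
	case Created:
		if prev.Type != Removed {
			return errors.New("cannot create an entry that already exists")
		}
		// Removed then created: the entry exists in the DB, so update it.
		ch.Type = Updated
		c.changes[ch.Key] = ch
	case Updated:
		switch prev.Type {
		case Created:
			// Not in the DB yet: keep CREATED, take the new value.
			ch.Type = Created
			c.changes[ch.Key] = ch
		case Updated:
			c.changes[ch.Key] = ch
		case Removed:
			return errors.New("cannot update an entry that was removed")
		}
	case Removed:
		switch prev.Type {
		case Created:
			// Created and removed in the same ledger: a noop.
			delete(c.changes, ch.Key)
		case Updated:
			c.changes[ch.Key] = ch
		case Removed:
			return errors.New("cannot remove an entry that was already removed")
		}
	}
	return nil
}

func main() {
	c := NewCache()
	_ = c.Add(Change{Type: Created, Key: "account-1", Value: 100})
	_ = c.Add(Change{Type: Updated, Key: "account-1", Value: 90})
	fmt.Println(c.changes["account-1"].Type == Created) // still a create
	err := c.Add(Change{Type: Created, Key: "account-1", Value: 1})
	fmt.Println(err != nil) // integrity check rejects the duplicate create
}
```

Keeping one squashed change per key means the number of DB operations at the end of a ledger is bounded by the number of distinct entries touched, and the error branches double as the data integrity checks requested in the issue.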

bartekn commented Dec 12, 2019

Closed in #2004.

@bartekn bartekn closed this as completed Dec 12, 2019