Memory optimizations for large address queries #392

Merged: 20 commits, Jan 27, 2016

@braydonf braydonf commented Jan 4, 2016

  • Switches to use transform streams for output and input queries for better memory usage
  • Adds configurable limits to queries so that queries running for several minutes are not allowed by default
  • Refactors address history and address summary methods
  • Limits the number of files that leveldb keeps open in cache
  • Cleaner shutdown by removing bitcoin event listeners
  • Closes: Address Service: Resolve memory issues with large queries #372
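As a rough illustration of the transform-stream approach in the first bullet, here is a minimal sketch using Node's built-in `stream` module. The class name `OutputsTransform`, the helper `sumSatoshis`, the record shape, and the error message are all hypothetical and are not taken from this pull request. The point is only that records are processed one at a time rather than buffered into one large array, and that a configured limit can stop the query early.

```javascript
'use strict';
const { Readable, Transform } = require('stream');

// Hypothetical transform: maps raw index records to small output summaries,
// one record at a time, and fails once a configured limit is exceeded.
class OutputsTransform extends Transform {
  constructor(options) {
    super({ objectMode: true });
    this.maxOutputsQueryLength = options.maxOutputsQueryLength;
    this.seen = 0;
  }

  _transform(record, encoding, callback) {
    this.seen++;
    if (this.seen > this.maxOutputsQueryLength) {
      // Passing an error to the callback destroys the stream with that error.
      return callback(new Error('Maximum number of outputs exceeded'));
    }
    callback(null, { txid: record.txid, satoshis: record.satoshis });
  }
}

// Hypothetical helper: sums satoshis without holding all outputs in memory.
function sumSatoshis(records, limit, done) {
  let total = 0;
  let finished = false;
  const finish = (err, value) => {
    if (finished) return;
    finished = true;
    done(err, value);
  };

  const transform = new OutputsTransform({ maxOutputsQueryLength: limit });
  transform.on('data', (output) => { total += output.satoshis; });
  transform.on('error', (err) => finish(err));
  transform.on('end', () => finish(null, total));

  // Readable.from stands in for a database read stream in this sketch.
  Readable.from(records).pipe(transform);
}
```

In a real service the source would be a database read stream instead of `Readable.from`, but the memory profile argument is the same: only the running total and the current record are held at any time.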

New Address Service Options:

  • maxInputsQueryLength (default 50,000) // The maximum number of inputs per query
  • maxOutputsQueryLength (default 50,000) // The maximum number of outputs per query
  • maxHistoryQueryLength (default 100) // The maximum number of transactions per query
  • maxAddressesQuery (default 10,000) // The maximum number of addresses per query

New DB Service Options:

  • maxOpenFiles (default 200) // The maximum number of files that leveldb keeps in cache
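The option names and defaults listed above could be modelled as follows. This is only a sketch: the option names and default values come from the lists above, while the `resolveOptions` helper and the way overrides are validated are assumptions for illustration and do not describe how the services actually read their configuration.

```javascript
'use strict';

// Defaults as listed in this pull request.
const ADDRESS_SERVICE_DEFAULTS = {
  maxInputsQueryLength: 50000,  // maximum number of inputs per query
  maxOutputsQueryLength: 50000, // maximum number of outputs per query
  maxHistoryQueryLength: 100,   // maximum number of transactions per query
  maxAddressesQuery: 10000      // maximum number of addresses per query
};

const DB_SERVICE_DEFAULTS = {
  maxOpenFiles: 200 // maximum number of files that leveldb keeps in cache
};

// Hypothetical helper: merges user overrides over the defaults and rejects
// unknown keys and values that are not positive integers.
function resolveOptions(defaults, overrides) {
  const resolved = Object.assign({}, defaults);
  Object.keys(overrides || {}).forEach((key) => {
    if (!Object.prototype.hasOwnProperty.call(defaults, key)) {
      throw new Error('Unknown option: ' + key);
    }
    const value = overrides[key];
    if (!Number.isInteger(value) || value <= 0) {
      throw new Error('Option ' + key + ' must be a positive integer');
    }
    resolved[key] = value;
  });
  return resolved;
}

// Example: raise the history limit while keeping the other defaults.
const addressOptions = resolveOptions(ADDRESS_SERVICE_DEFAULTS, {
  maxHistoryQueryLength: 500
});
```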

@braydonf braydonf force-pushed the large-queries branch 9 times, most recently from 71e892d to 7b0464d Compare January 13, 2016 22:11
- Refactored getAddressSummary and added several tests
- Fixed bugs revealed from the integration regtests
- Updated many unit tests
Braydon Fuller added 4 commits January 14, 2016 17:17
Querying addresses that have millions of transactions is supported, but it
takes hundreds of seconds to fully calculate the balance. Caching previous
results does not currently work because the `isSpent` query is always based on
the current bitcoind tip. The balance of an output would be included in the
cached result, but would not be removed once the output was spent, because the
output is not checked again when querying for blocks past the last checkpoint.
Including the satoshis in the inputs address index would make it possible to
subtract the spent amount, however this degrades optimizations elsewhere, such
as syncing times or querying for addresses with 10,000 transactions per address.

It may be preferable for an additional address service that handles high-volume
addresses to be opt-in, so that a custom running client could select
high-volume addresses to create optimizations for querying balances and history.
The strategies for creating indexes differ between these use cases.
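The caching problem described in this commit message can be shown with a small worked example. The in-memory model below is entirely hypothetical (the field names, heights, and function names are made up for illustration and are not the service's index format). It shows a checkpointed balance going stale when an output from before the checkpoint is spent afterwards, and how knowing the spent amount on the input side would allow it to be subtracted.

```javascript
'use strict';

// Hypothetical outputs for one address; spentHeight is null while unspent.
const outputs = [
  { satoshis: 5000, height: 100, spentHeight: null },
  { satoshis: 3000, height: 105, spentHeight: 120 }
];

// Spent status is always evaluated against the given tip height.
function isSpent(output, tipHeight) {
  return output.spentHeight !== null && output.spentHeight <= tipHeight;
}

// Full scan: checks every output against the tip.
function balanceFull(list, tipHeight) {
  return list
    .filter((o) => o.height <= tipHeight && !isSpent(o, tipHeight))
    .reduce((sum, o) => sum + o.satoshis, 0);
}

// Naive cache: only outputs created after the checkpoint are examined, so an
// older output that was spent later is never rechecked.
function balanceFromCheckpoint(list, checkpoint, tipHeight) {
  return list
    .filter((o) => o.height > checkpoint.height && o.height <= tipHeight)
    .filter((o) => !isSpent(o, tipHeight))
    .reduce((sum, o) => sum + o.satoshis, checkpoint.balance);
}

// If the spent amount were known per input, spends after the checkpoint
// could be subtracted from the cached balance.
function balanceWithInputAmounts(list, checkpoint, tipHeight) {
  const inRange = (h) => h !== null && h > checkpoint.height && h <= tipHeight;
  const received = list
    .filter((o) => inRange(o.height))
    .reduce((sum, o) => sum + o.satoshis, 0);
  const spent = list
    .filter((o) => inRange(o.spentHeight))
    .reduce((sum, o) => sum + o.satoshis, 0);
  return checkpoint.balance + received - spent;
}

// Checkpoint taken at height 110, when both outputs were still unspent.
const checkpoint = { height: 110, balance: balanceFull(outputs, 110) };

const correct = balanceFull(outputs, 130);                       // full rescan
const stale = balanceFromCheckpoint(outputs, checkpoint, 130);   // naive cache
const adjusted = balanceWithInputAmounts(outputs, checkpoint, 130);
```

With these numbers the full scan and the adjusted calculation both give 5000 satoshis, while the naive cache still reports 8000 because the 3000 satoshi output spent at height 120 is never revisited.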
@braydonf braydonf changed the title WIP: Memory optimizations for large address queries Memory optimizations for large address queries Jan 18, 2016
@braydonf
Contributor Author

Ready for wider testing

Braydon Fuller added 5 commits January 18, 2016 16:03
There was an issue where streams would still be held open if "pause" was
called before "end". As a result, http requests from the insight-api were
not returned with an error status as soon as possible and instead
stayed open.
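One way to avoid this kind of hang is sketched below. It is an assumption for illustration, not the actual fix in these commits. A paused Node readable does not emit "end" until its remaining data is consumed, so a caller that pauses and then waits for "end" can wait indefinitely. The hypothetical `collectWithLimit` helper instead destroys the source and reports the error immediately.

```javascript
'use strict';
const { Readable } = require('stream');

// Hypothetical helper: collects items from a readable stream and fails fast
// when more than `limit` items arrive.
function collectWithLimit(source, limit, callback) {
  const items = [];
  let done = false;

  // Guard so the callback is invoked exactly once.
  function finish(err, result) {
    if (done) return;
    done = true;
    callback(err, result);
  }

  source.on('data', (item) => {
    if (done) return;
    items.push(item);
    if (items.length > limit) {
      // Release the stream and report the failure right away, rather than
      // pausing and waiting for an "end" event that may never arrive.
      source.destroy();
      finish(new Error('Maximum number of items exceeded'));
    }
  });
  source.on('error', (err) => finish(err));
  source.on('end', () => finish(null, items));
}
```

An http handler built on such a helper could send its error status as soon as the callback fires with an error, instead of leaving the request open.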
@kleetus
Contributor

kleetus commented Jan 27, 2016

LGTM

kleetus added a commit that referenced this pull request Jan 27, 2016
Memory optimizations for large address queries
@kleetus kleetus merged commit b0a0f62 into bitpay:master Jan 27, 2016