Speed up burningman and statistics view loads #6697


@stejbac (Contributor) commented May 10, 2023

This PR makes the trade statistics set held by TradeStatisticsManager into a sorted collection (a TreeSet), in order to make possible certain optimisations which speed up the Burning Man view, the Trades Charts view and the BSQ Dashboard view (plus a few other minor areas of the application). The PR also applies quite a lot of miscellaneous optimisations to those three views, plus a simple related optimisation to the BSQ Supply view.

The two main benefits of holding the trade statistics in a sorted set (with chronological ordering) are:

  1. Lightweight caches can be used in timezone, time interval and similar calculations much more easily, since consecutive trade statistics objects tend to be close to each other in time;
  2. It is possible to obtain a date range view of the trade statistics set very cheaply (in O(log n) time), avoiding the need to stream over the entire collection in AveragePriceUtil and elsewhere. For this purpose, a RangeUtils class is provided, which works round the limitation that java.util.NavigableSet only allows searching with keys of the collection type, not mapped keys such as dates.

Note that even though a tree has somewhat slower lookups than a hash table, it doesn't matter here since there are no lookups into the trade statistics set, only streaming & filtering of the elements, plus occasional insertions. (Also I believe TreeSets are a little more memory efficient than HashSets.)

--

The biggest speedup is probably to the burning man view (Under DAO / Proof of Burn / Burningmen). Tabbing to it five times gives the following hotspots from JProfiler before the changes:
[Screenshot: JProfiler hot spots before the changes]
As can be seen, by far the biggest hotspot is AveragePriceUtil.getAveragePriceTuple, which filters the entire trade statistics set for BSQ & USD trades in a given one-month period.

After the changes, tabbing to the view five times gives the following hotspots from JProfiler:
[Screenshot: JProfiler hot spots after the changes]

stejbac added 22 commits May 10, 2023 19:40
Short circuit the exception control flow used in the method
'TradeStatistics3.getPaymentMethodId', which occurs whenever the payment
method code is stored directly in the 'paymentMethod' field instead of
first being converted into a numeric string. This occurs if the method
is unrecognised, as it is not listed in the 'PaymentMethodWrapper' enum.

This fixes an unnecessary slowdown of 'TradeStatistics3.isValid', which
calls the above, when scanning the list of BSQ trade statistics in
AveragePriceUtil and elsewhere, due to the fact that the 'BSQ_SWAP'
payment method has been missed out of the above enum.

Also add a (presently disabled) unit test to prevent any future payment
methods from being missed out of the enum. (Should fix the missing BSQ
swaps issue in a separate PR to make sure that the seed nodes recognise
the new payment method code before anyone else.)
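
A minimal sketch of the short circuit (the 'paymentMethod' field and 'PaymentMethodWrapper' enum follow the commit message; the method body and enum constants are hypothetical):

  final class PaymentMethodIdDemo {
      enum PaymentMethodWrapper { SEPA, ZELLE, BSQ_SWAP }

      private final String paymentMethod;

      PaymentMethodIdDemo(String paymentMethod) {
          this.paymentMethod = paymentMethod;
      }

      String getPaymentMethodId() {
          // A non-digit first character means the raw method code was stored
          // directly (the unrecognised-method case), so return it as-is
          // instead of parsing and catching a NumberFormatException.
          if (paymentMethod.isEmpty() || !Character.isDigit(paymentMethod.charAt(0))) {
              return paymentMethod;
          }
          return PaymentMethodWrapper.values()[Integer.parseInt(paymentMethod)].name();
      }
  }
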
Use a Map to speed up 'PaymentMethod.getPaymentMethod', called from
'isValid', instead of scanning the payment method list every invocation.
Make the list immutable to ensure the map never goes stale, which is OK
since no code modifies it outside PaymentMethod's static initialisation.

Also speed up the global accessor 'TradeLimits.getMaxTradeLimit', by
caching the DAO param value in a volatile field, cleared upon each new
block arrival.

Furthermore, speed up 'TradeLimits.getFirstMonthRiskBasedTradeLimit' by
simplifying the rounding logic to avoid a pass through the (rather slow)
BigDecimal type.
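
A minimal sketch of the map-backed lookup (types and data hypothetical):

  import java.util.List;
  import java.util.Map;
  import java.util.stream.Collectors;

  final class PaymentMethodLookup {
      record PaymentMethod(String id, long maxTradeLimit) {}

      private static final List<PaymentMethod> ALL = List.of(
              new PaymentMethod("SEPA", 100), new PaymentMethod("BSQ_SWAP", 50));

      // Cannot go stale, because ALL is immutable and the map is built in
      // the same static initialisation.
      private static final Map<String, PaymentMethod> BY_ID = ALL.stream()
              .collect(Collectors.toMap(PaymentMethod::id, m -> m));

      static PaymentMethod getPaymentMethod(String id) {
          return BY_ID.get(id);   // O(1) instead of a scan per invocation
      }
  }
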
Use nonstrict bounds when filtering outliers from the provided trade
statistics list, since otherwise it will always remove the outermost two
inliers from the list. This is because the dependent call to
'InlierUtil.findInlierRange' returns the min & max inlier values.
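
A sketch of the fix under an assumed signature — the bounds test must keep the endpoints, since they are themselves inliers:

  import java.util.List;
  import java.util.stream.Collectors;

  final class OutlierFilter {
      static List<Double> removeOutliers(List<Double> yValues,
                                         double minInlier, double maxInlier) {
          return yValues.stream()
                  // >= & <= rather than > & <, which would always drop the
                  // two outermost inliers along with the outliers
                  .filter(y -> y >= minInlier && y <= maxInlier)
                  .collect(Collectors.toList());
      }
  }
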
Use a DoubleStream when streaming over 'List<Double>' method arguments
in InlierUtil, as well as a primitive array sort in 'InlierUtil.trim'
(followed by taking a sublist view), instead of calling 'Stream.sorted'.

To this end, use Guava 'Doubles.asList' to pass lists of Doubles to/from
the InlierUtil methods without incurring any boxing or unboxing costs,
since their spliterators can be simply downcast to Spliterator.OfDouble
(opportunistically), instead of needing to use 'mapToDouble' to unbox.

This was a minor hotspot when called from AveragePriceUtil (used by the
burning man and BSQ dashboard views).
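
A rough sketch of the idea (simplified to a primitive-array signature; per the commit, the real methods keep 'List<Double>' at the boundaries via 'Doubles.asList'):

  import com.google.common.primitives.Doubles;

  import java.util.Arrays;
  import java.util.List;

  final class TrimDemo {
      static List<Double> trim(double percentToTrim, double[] values) {
          double[] sorted = values.clone();
          Arrays.sort(sorted);   // primitive sort: no boxing, no comparator
          int toTrim = (int) (sorted.length * percentToTrim / 100);
          // Doubles.asList wraps the array without copying or eager boxing,
          // and subList is just a view of the middle portion.
          return Doubles.asList(sorted).subList(toTrim, sorted.length - toTrim);
      }
  }
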
Provide a 'RangeUtils' class for computing subsets of a navigable set
with natural element order, with each bound defined by a mathematical
filter (that is, a predicate specifying whether an element is 'big' -
true, or 'small' - false), instead of a specific element. This allows
the subset of all elements which map into a given range to be computed,
provided the mapping function is (strictly or non-strictly) increasing.
Provide a fluent interface for this in RangeUtils (with unit tests).

To support this, provide a Comparable sub-interface, 'ComparableExt', of
elements which may be compared with marks/delimiters instead of just
elements of the same type, to work round the limitation that sorted (&
navigable) Sets/Maps in Java do not support general binary searching
with a filter (predicate) on the keys instead of just a specific key.

This will make it possible to efficiently take subsets of objects within
a given date range, say, without having to scan the entire set, provided
it is sorted (w.r.t. a suitable natural order).
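
An illustrative stand-in for the effect (not the actual RangeUtils API): with a chronologically ordered NavigableSet, a date-range view costs O(log n) via 'subSet', here using sentinel elements that carry only a boundary date.

  import java.util.Comparator;
  import java.util.NavigableSet;
  import java.util.TreeSet;

  final class DateRangeDemo {
      record Trade(long dateMillis, String id) {}

      public static void main(String[] args) {
          NavigableSet<Trade> trades = new TreeSet<>(
                  Comparator.comparingLong(Trade::dateMillis).thenComparing(Trade::id));
          trades.add(new Trade(1_000, "a"));
          trades.add(new Trade(2_000, "b"));
          trades.add(new Trade(3_000, "c"));

          // All trades with 1_500 <= date <= 2_500, found in O(log n):
          NavigableSet<Trade> inRange = trades.subSet(
                  new Trade(1_500, ""), true, new Trade(2_500, ""), true);
          inRange.forEach(t -> System.out.println(t.id()));   // prints: b
      }
  }

Sentinel bounds like these are only exact when no real element shares a boundary date; the marks supported by ComparableExt remove that caveat by letting the bound be a predicate rather than a concrete element.
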
Make TradeStatistics3 implement the previously added ComparableExt
interface and make TradeStatisticsManager hold them as a TreeSet instead
of a HashSet, to support fast retrieval of statistics in any given date
range. (Even though red-black trees are generally slower than hash
tables, this should not matter here since the set is only being iterated
over and infrequently appended, and does not benefit from O(1) lookups/
additions/removals.)

Add a 'TradeStatisticsManager.getNavigableTradeStatisticsSet' accessor,
which returns the backing TreeSet of the current ObservableSet field, so
that callers can access its NavigableSet interface where needed (as
there is no ObservableSortedSet or similar in JavaFX). Use this to
optimise 'AveragePriceUtil.getAveragePriceTuple',
'DisputeAgentSelection.getLeastUsedDisputeAgent' and
'MutableOfferDataModel.getSuggestedSecurityDeposit', to obtain a narrow
date range of trade statistics without streaming over the entire set.

Additionally optimise & simplify the price collation in
'TradeStatisticsManager.onAllServicesInitialised', by exploiting the
fact that the statistics are now sorted in order of date (which is the
presently defined natural order).
Now that the trade statistics are retrieved as a sorted set, it can be
assumed that the USD & BSQ trade lists passed to 'getUSDAverage' are
already sorted. Use this to avoid repeatedly scanning the USD trade list
for the first trade dated after each given BSQ trade, by moving two
cursors in a single pass across the respective lists simultaneously.
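
A minimal sketch of the two-cursor pass (bare {dateMillis, price} arrays stand in for the trade statistics types):

  import java.util.List;

  final class UsdCursorDemo {
      // Each trade is {dateMillis, price}; both lists are sorted by date.
      static double[] firstUsdPriceAfterEachBsqTrade(List<long[]> usdTrades,
                                                     List<long[]> bsqTrades) {
          double[] result = new double[bsqTrades.size()];
          int u = 0;
          for (int b = 0; b < bsqTrades.size(); b++) {
              long bsqDate = bsqTrades.get(b)[0];
              // The cursor only ever moves forwards, so the whole method is
              // O(n + m) instead of rescanning the USD list per BSQ trade.
              while (u < usdTrades.size() && usdTrades.get(u)[0] <= bsqDate) {
                  u++;
              }
              result[b] = u < usdTrades.size() ? usdTrades.get(u)[1] : Double.NaN;
          }
          return result;
      }
  }
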
1) Change statement lambdas to expression lambdas;
2) Replace 'Map.putIfAbsent' then 'Map.get' with 'Map.computeIfAbsent' (sketched after this list);
3) Add missing @VisibleForTesting annotation or make private.
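
Item 2 in isolation:

  import java.util.ArrayList;
  import java.util.HashMap;
  import java.util.List;
  import java.util.Map;

  final class ComputeIfAbsentDemo {
      public static void main(String[] args) {
          Map<String, List<Integer>> byKey = new HashMap<>();
          // Before: two probes, and an ArrayList allocated even when the key exists.
          byKey.putIfAbsent("BTC", new ArrayList<>());
          byKey.get("BTC").add(1);
          // After: one probe, with the list created lazily only for new keys.
          byKey.computeIfAbsent("BTC", k -> new ArrayList<>()).add(2);
      }
  }
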
Avoid scanning all the ticks backwards from 90 to 1 repeatedly, to find
the one with the correct date interval for each item in the
'tradeStatisticsByCurrency' list. Instead, for each item, remember the
last found tick index and move forwards if necessary, then scan
backwards from that point to find the correct tick. As the trade
statistics are now in chronological order, this is much faster (though
it will still work correctly regardless of the order of the list items).

Also, hold 'itemsPerInterval' as a 'List<Pair<..>>' instead of a
'Map<Long, Pair<..>>', since the keys are just tick indices from 0 to 91
inclusive, so it is cleaner and more efficient to use an array than a
hash table.
The test was erroneously passing a candle tick start time (as a long) to
'ChartCalculations.getCandleData', which expects a tick index from 0 to
MAX_TICKS + 1 (91) inclusive. Since this is out of range, the method
skipped an 'itemsPerInterval' map lookup which would have thrown an NPE
prior to the last commit. Fix the test by making 'itemsPerInterval'
nonempty and passing 0 as the tick index. Also check the now correctly
populated 'date' field in the returned candle data.

Additionally, tidy up the class a little and avoid an unnecessary temp
directory creation.
Cache enum arrays 'TickUnit.values()' & 'PaymentMethodWrapper.values()'
as the JVM makes defensive copies of them every time they are returned,
and they are both being used in tight loops. In particular, profiling
suggests this will make 'TradeStatistics3.isValid' about twice as fast.
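
A sketch of the hoisted arrays (enum constants illustrative):

  final class TickUnits {
      enum TickUnit { YEAR, MONTH, WEEK, DAY, HOUR, MINUTE_10 }

      // Enum.values() hands back a fresh defensive copy of the backing array
      // on every call, so hoist one copy into a constant for tight loops.
      private static final TickUnit[] TICK_UNITS = TickUnit.values();

      static TickUnit fromOrdinal(int ordinal) {
          return TICK_UNITS[ordinal];   // no per-call array allocation
      }
  }
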
Now that the trade statistics are encountered in chronological order,
speed up 'roundToTick(LocalDateTime, TickUnit)' by caching the last
calculated LocalDateTime -> Date mapping from the tick start (with one
cache entry per tick unit), as multiple successive trades will tend to
have the same tick start.

This avoids a relatively expensive '.atZone(..).toInstant()' call, which
was slowing down 'ChartCalculations.getUsdAveragePriceMapsPerTickUnit',
as it uses 'roundToTick' in a tight loop (#trades * #tick-units calls).

Also unqualify 'TradesChartsViewModel.TickUnit' references for brevity.
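
A sketch of the per-tick-unit cache (one instance per TickUnit; names hypothetical):

  import java.time.LocalDateTime;
  import java.time.ZoneId;
  import java.util.Date;

  final class TickStartCache {
      private LocalDateTime lastTickStart;
      private Date lastDate;

      // Successive trades mostly share a tick start, so the relatively
      // expensive atZone(..).toInstant() call is skipped on repeats.
      Date toDate(LocalDateTime tickStart) {
          if (!tickStart.equals(lastTickStart)) {
              lastTickStart = tickStart;
              lastDate = Date.from(tickStart.atZone(ZoneId.systemDefault()).toInstant());
          }
          return lastDate;
      }
  }
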
Avoid calculating average prices for ticks that won't ever be part of a
visible chart candle, as only the last 90 ticks can fit on the chart. To
this end, stream the trade statistics in reverse chronological order
(which requires passing them as a NavigableSet), so that once more than
MAX_TICKS ticks have been encountered for a given tick unit, the
relevant map (and all lower granularity maps) can stop being filled up.

Also add a 'PriceAccumulator' static class to save time and memory when
filling up the intermediate maps, by avoiding the addition of each trade
statistics object to (multiple) temporary lists prior to average price
calculation.
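
A simplified sketch of such an accumulator, reduced here to a plain mean:

  final class PriceAccumulator {
      private long sum;
      private int count;

      // Accumulate in place: no temporary per-tick list of trade statistics
      // objects is built before averaging.
      void add(long price) {
          sum += price;
          count++;
      }

      double average() {
          return count == 0 ? 0 : (double) sum / count;
      }
  }
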
Optimise (further) the ChartCalculations methods 'getItemsPerInterval' &
'getCandleData' by replacing HashSets in the former with sorted sets,
which avoids relatively expensive calls to 'TradeStatistics3.hashCode'
and needless subsequent re-sorting by date in 'getCandleData'. (Forming
the trade statistics into an ImmutableSortedSet, OTOH, is cheap since
they are already encountered in chronological order.)

Further optimise the latter by using a primitive array sort of the trade
prices to calculate their median, instead of needlessly boxing them and
using 'Collections.sort'.
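
A sketch of the primitive-array median (assuming the simple upper-median convention for even-length input):

  import java.util.Arrays;

  final class MedianDemo {
      static long medianPrice(long[] prices) {
          long[] sorted = prices.clone();
          Arrays.sort(sorted);   // primitive sort: no boxing, no Collections.sort
          return sorted[sorted.length / 2];
      }
  }
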
Reduce a hotspot sorting the trade statistics table, triggered by the
'sortedList.bind(comparatorProperty)' call upon completion of the
'fillList' future. Profiling shows that repeated invocation of the cell
value factory over the entries of the sorted column is a bottleneck, so
speed this up by caching the returned cell value (given by calling
'new ReadOnlyObjectWrapper<>(listItem)') as an instance field of
TradeStatistics3ListItem.

As a further significant optimisation, stream the trade statistics in
reverse chronological order, when collecting into a list wrapped by
SortedList, as this matches the default display order, reducing the
number of comparisons done by SortedList's internal mergesort to O(n).
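
A sketch of the cached cell value (item class hypothetical):

  import javafx.beans.property.ReadOnlyObjectWrapper;

  final class TradeStatisticsListItemDemo {
      private ReadOnlyObjectWrapper<TradeStatisticsListItemDemo> cellValue;

      // The cell value factory calls this repeatedly while the column is
      // being sorted, so create the wrapper once and reuse it.
      ReadOnlyObjectWrapper<TradeStatisticsListItemDemo> asCellValue() {
          if (cellValue == null) {
              cellValue = new ReadOnlyObjectWrapper<>(this);
          }
          return cellValue;
      }
  }
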
As profiling shows a hotspot mapping the set of trade statistics to a
list of currencies to pass to 'CurrencyList.updateWithCurrencies',
attempt to speed this up with a parallel stream. For this to work
correctly, take care to use the backing set (with unmodifiable wrapper)
in place of 'tradeStatisticsManager.getObservableTradeStatisticsSet()',
as ObservableSetWrapper doesn't delegate calls to its spliterator.
Now that the trade statistics are retrieved in chronological order,
optimise the per-interval BSQ & USD price and volume calculations in
PriceChartDataModel & VolumeChartDataModel, by adding caches to avoid
relatively expensive timezone calculations in TemporalAdjusterModel,
similarly to the cache added for 'ChartCalculations.roundToTick' (as
profiling shows 'TemporalAdjusterModel.toTimeInteval' is a hotspot).

Add a cache to speed up Instant -> LocalDate mappings by storing the
unix time (Instant) range of the last seen day (LocalDate) in a tuple,
then just returning that day if the next Instant falls in range. Also
add a cache of the last temporal adjustment (start of month, week, etc.)
of that day. In this way, successive calls to 'toTimeInteval(Instant)'
with input times on the same day are sped up.

Since TemporalAdjusterModel is used by multiple threads simultaneously,
store the caches in instance fields and add a 'withCache' method which
clones the model and enables the caching, since otherwise the separate
threads keep invalidating one another's caches, making it slower than it
would be without them. (We could use ThreadLocals, but profiling
suggests they are too heavyweight to be very useful here, so instead use
unsynchronised caching with nonfinal fields and benign data races.)

Provide the method 'ChartDataModel.toCachedTimeIntervalFn' which returns
a method reference to a cloned & cache-enabled TemporalAdjusterModel, to
use in place of the delegate method 'ChartDataModel.toTimeInterval' when
the caching is beneficial.
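
A sketch of the day cache at the heart of this (class and field names hypothetical):

  import java.time.Instant;
  import java.time.LocalDate;
  import java.time.ZoneId;

  final class DayCache {
      private static final ZoneId ZONE = ZoneId.systemDefault();

      // Unix-time range [dayStart, dayEnd) of the last seen day. Nonfinal,
      // unsynchronised fields: races stay benign once each thread works on
      // its own clone of the model.
      private long dayStart = Long.MAX_VALUE;
      private long dayEnd = Long.MIN_VALUE;
      private LocalDate cachedDay;

      LocalDate toLocalDate(Instant instant) {
          long millis = instant.toEpochMilli();
          if (millis < dayStart || millis >= dayEnd) {
              cachedDay = instant.atZone(ZONE).toLocalDate();
              dayStart = cachedDay.atStartOfDay(ZONE).toInstant().toEpochMilli();
              dayEnd = cachedDay.plusDays(1).atStartOfDay(ZONE).toInstant().toEpochMilli();
          }
          return cachedDay;   // cache hit: no timezone arithmetic at all
      }
  }
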
Use the previously added 'ChartDataModel.toCachedTimeIntervalFn' to
additionally speed up some of the charts in the BSQ supply view, in
particular the trade fees & total burned BSQ, via the DaoChartDataModel
methods 'getBsqTradeFeeByInterval' & 'getTotalBurnedByInterval'. (The
other changes in the BSQ supply, such as proofs of burn or issuance, are
too infrequent to benefit from the LocalDate caching.)

For this to work, the filtered BSQ txs must be streamed in chronological
order, so provide local methods 'get[Burnt|Trade]FeeTxStream()', to use
in place of the DaoStateService methods 'get[Burnt|Trade]FeeTxs()',
which return unordered HashSets.
Optimise 'BurningManPresentationService.getCandidateBurnTarget' to avoid
the repeated computation of the total accumulated decayed burned amount
for every listed burning man. To this end, cache the total in a nullable
Long field, along with the method 'getAccumulatedDecayedBurnedAmount()'
to lazily initialise it. (This eliminates a minor hotspot in the burning
man view revealed by JProfiler.)
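
A sketch of the lazy-total pattern (types and data hypothetical):

  import java.util.List;

  final class BurnTargetDemo {
      private final List<Long> decayedBurnAmounts;
      private Long accumulatedDecayedBurnedAmount;   // null until first use

      BurnTargetDemo(List<Long> decayedBurnAmounts) {
          this.decayedBurnAmounts = decayedBurnAmounts;
      }

      long getAccumulatedDecayedBurnedAmount() {
          if (accumulatedDecayedBurnedAmount == null) {
              accumulatedDecayedBurnedAmount = decayedBurnAmounts.stream()
                      .mapToLong(Long::longValue)
                      .sum();   // computed once, not once per listed burning man
          }
          return accumulatedDecayedBurnedAmount;
      }
  }
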
Factor out duplicated logic in the 'Stream.map' lambdas to compute the
BSQ value of the BTC of each streamed ReceivedBtcBalanceEntry, returned
as an 'Optional<Long>'. Also simplify the logic slightly and return an
OptionalLong instead for greater efficiency.

(Also replace a statement lambda with an expression lambda.)
Add an 'averagePricesValid' boolean field to avoid needless refilling of
the cached BSQ prices map when calling 'getAverageBsqPriceByMonth()'.
(Also skip a redundant filling of the map with non-historical data upon
startup of the service.) Since the prices are calculated from the
(observable) set of all trade statistics, add a listener to the set to
invalidate the cache whenever it changes.

This significantly speeds up the burning man view, since the getter is
called several times when activating it.
Short circuit the BigInteger arithmetic in 'AltcoinExchangeRate' &
'org.bitcoinj.utils.ExchangeRate' (from which the former is adapted), by
using ordinary long arithmetic when it is guaranteed not to overflow due
to the two quantities to be multiplied fitting in an int. This will be
the case most of the time. Also remove duplicated logic, to ensure that
all conversions of BTC amounts to volumes happen via the 'Price'
instance methods, so that the optimisation always applies.

In particular, this speeds up the BTC -> BSQ conversions in the burning
man view, as well as the USD price calculations for the candles in the
trades charts view via 'TradeStatistics3.getTradeVolume()'.
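
A sketch of the arithmetic short circuit (method and names illustrative): two factors that each fit in an int cannot overflow a long when multiplied, since 2^31 * 2^31 = 2^62 < 2^63.

  import java.math.BigInteger;

  final class ExchangeRateDemo {
      static long multiplyThenDivide(long amount, long rate, long divisor) {
          if (amount == (int) amount && rate == (int) rate) {
              return amount * rate / divisor;   // common case: plain long arithmetic
          }
          return BigInteger.valueOf(amount)     // rare case: fall back to BigInteger
                  .multiply(BigInteger.valueOf(rate))
                  .divide(BigInteger.valueOf(divisor))
                  .longValueExact();
      }
  }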

Additionally, fix the lazy initialisation pattern in TradeStatistics3 to
ensure that it is thread safe (that is, it only has benign data races),
by making it of the form:

  Foo foo = this.foo;
  if (foo == null) {
      this.foo = foo = computeFoo();
  }
  return foo;

This avoids the problem that 'foo' is a nonvolatile field and can
therefore be seen to alternate any number of times between null and
nonnull from the PoV of the thread initialising it (at least when the
initialisation is racy).
@HenrikJannsen (Collaborator) commented

Wow! Great thanks for the impressive improvements!

@HenrikJannsen (Collaborator) left a comment

utACK - Very impressive work!!!

@HenrikJannsen (Collaborator) commented

I tested the branch, and the BM view and BSQ Facts & Figures views are much faster now!

@alejandrogarcia83 (Contributor) left a comment

utACK

Thank you!! This is a massive improvement.

@alejandrogarcia83 alejandrogarcia83 merged commit ac8ad24 into bisq-network:master May 16, 2023
@alejandrogarcia83 alejandrogarcia83 added this to the v1.9.11 milestone May 23, 2023