Jordan/transaction pool study #799

mjfh · 2021-08-13T18:12:03Z

Sneak preview like

preliminary toolkit that might be useful for transaction pool ops
added tx-pool TDD playground: dummy interface + unit test
working towards go reference implementation
replacing async loop in ref implementation by job/commit scheme

why: Flexible framework for constructing sorted lists with data entries that can be transparently related/linked. details: The underlying red-black tree can theoretically replaced by any other sorted table.

why: The DoublyLinkedList data type does not allow for efficient random data access (find() is O(n).)

zah · 2021-08-13T22:28:45Z

It's not customary in Nim to use name prefixes in APIs (these are usually needed in languages with weaker support for namespaces/modules). When designing containers, it's best to use established names such as add, mgetOrPut, init(RedBlackTree[Foo, Bar]), etc in order to facilitate writing generic code.

Besides the suggestion to enhance Fusion's B-Tree implementation, it occurred to me that @cheatfate has worked on another B-Tree implementation in nim-presto. To promote the reuse of these general purpose data-structures, we should really add all such modules to nim-stew.

KonradStaniec · 2021-08-14T08:03:59Z

I have recently came across of comparison of different data structures which may be useful for mempool - https://github.com/lithdew/rheia#mempool, it maybe nice data point to the discussion.

why: was on `todo` status: all black nodes on any chain down the tree count the same lengths

why: No need for module name in methods/functions, see #799 (comment) details: update tests, comments

details: Test setup and some simple stupid data collectors. It will not do much at the moment.

why: Added table [] sematics where appropriate

mjfh · 2021-08-24T12:01:53Z

[..] mempool - https://github.com/lithdew/rheia#mempool, [..]

nice one, thank you

why: Allow deleting current element in a `for` loop details: More unit tests, lots of small fixes

details: Basic structure is a keyed queue, organised by data arrival. Operations are consistent, verified by unit tests.

why: Logic was inspired by go reference implementation. details: Lists and queue pulled out into NIM sub-module source files.

why: Enhances code readability details: Heavy use of return codes `rc` of type Result[] lead to repeated `value` symbols like `rc.value.value`. Now this reads `rc.value.data`.

mratsim · 2021-08-27T11:53:25Z

We might want the input of @c-blake the author of https://github.com/c-blake/adix since he dived deeply in those questions in all Nim table-related PRs.

why: Garbled links after deleting the second entry relative to either queue end.

why: Logic was inspired by go reference implementation.

mjfh · 2021-08-27T12:09:55Z

We might want the input of @c-blake the author of https://github.com/c-blake/adix since he dived deeply in those questions in all Nim table-related PRs.

How smart should Nim sets & tables be? nim-lang/RFCs#201

https://forum.nim-lang.org/t/829

Fix #14971 OrderedTable del and pop are slow nim-lang/Nim#14995

[TODO] pseudorandom probing for hash collision nim-lang/Nim#13418 (comment)

hashing collision: improve performance using linear/pseudorandom probing mix nim-lang/Nim#13440 (comment)

Fix #14971 OrderedTable del and pop are slow nim-lang/Nim#14995 (comment)

Not using ordered tables but the collision stuff might be relevant, thanks.

c-blake · 2021-08-27T12:57:31Z

For search trees I prefer to separate search & mutation which is not done here, having various search APIs populate a "path" object which things like insert/delete/rebalance can use. This allows having search-by-rank or search-by-key or search by structure (e.g. min/max/root) all as simple single API calls populating a path and then sharing a couple edit operations.

Besides being manifestly non-recursive (function calls cost a lot if keys are integers, say), this factoring also makes it easy to both support duplicate keys or not and to have such keys obey a FIFO or LIFO order along duplicate chains. The economies here are not merely "less code" but also conceptual economy for users.

Another such conceptual economy is to not use "explicit" left/right notation but an array of 2 pointers and [0] [1] everywhere. This allows the left-right symmetry of the binary tree to be exposed. The opposite side is the !side (in C). This allows rebalance after delete to be like 30 lines of code. The only author I know who pushes this is Kurt Mehlhorn.

Overall, binary search trees (AVL, red-black or otherwise) are generally quite deep & slow compared to B-trees. To me their main area of application is "intra B-tree node" unless you need the "doubly linked list" like property that a BST with parent links can provide. (I.e., were the extra link can prevent an otherwise slow lookup and can make a node "autonomous"/able to delete itself.)

Inside a B-tree node intranode it allows B-trees to be dynamically updated cheaply even with very wide nodes 4096 bytes say. That might be 512*(4B payload+4B) links objects. The constant memmove-like shifting of elements around to keep a sorted in-node array can get very expensive if edits are common. So, there you want the links to be small 1..2 byte "pointers" (node numbers, really). In this case, you really want a BST over a memory pool, not over virtual memory. (I.e. a memory pool like this (with int -> uint16|uint8):

type Pool*[T] = object
  head: int         # head of free list
  free: seq[int]    # free list
  slot*: seq[T]     # object array
proc len*[T](p: var Pool[T]): int = p.free.len
proc init*[T](p: var Pool[T], size: int) =
  p.head = 0
  p.free = newSeq[int](size)
  p.slot = newSeq[T](size)
  for i in 0 ..< size: p.free[i] = i + 1 # thread `free`
proc alloc*[T](p: var Pool[T], zero=false): int =
  result = p.head           # `p.len` => OUT OF SPACE
  if p.head < p.len:
    p.head = p.free[p.head] # return head & hop over it
    if zero: zeroMem p.slot[result].addr, T.sizeof
proc free*[T](p: var Pool[T], i: int, destroy=false) =
  if destroy: `=destroy`(p.slot[i])
  p.free[i] = p.head
  p.head = i

) not like thread pools/process pools/multi-threaded pools. (FWIW, that may be a better answer to some recent Forum post, but questioner in said post seemed confused..so, maybe not.)

Sorry if this is "too much information", but I feel it was actively solicited. There may be a wrinkle or two I neglected. E.g., there is also this paper about cache effects: https://pdos.csail.mit.edu/papers/masstree:eurosys12.pdf

I notice that this PR references Julienne Walker of eternallyconfuzzled. Does anyone happen to know what became of her? Her site has been down for several years now. She didn't get everything right (who does?), but got less wrong than many.

mjfh · 2021-08-27T13:33:37Z

good point, thanks -- will come back to that when i re-do/replace the underlying tree

c-blake · 2021-08-27T13:33:52Z

Oh, one other off-the-beaten path thing I did in my adix/btree is a bulk loading API which lets you build a minimum height B-tree from already ordered data. Statistically, "aged trees" are only about 2/3 full, meaning about 3/2 taller and 1.5x more work (and more space) than a bulk loaded tree. The @cheatfate b-tree Zah mentioned is very bare bones, as is the one in the Nim compiler itself and the fusion one.

I only do a pool allocator (in the classic 20th century sense as detailed in my last post). This yields a notable limitation that you cannot put GC'd objects in for keys/values in my current version (though I believe otherwise it is algorithmically & code factoring better than most B-trees out there). PRs are welcome to improve it, but I would kind of like to preserve the ability to allocate on disk/memoryfiles as well as in VM. Running right off memory files has always been planned, as I am trying to encourage with https://github.com/c-blake/nio (which takes the "radical" approach that if you want to persist data maybe just..save it to a file? not a pseudo-HDF5 tarball/zipfile or SQL DB). Anyway, spanning both GC'd usage mode and external address/file offset mode seems a good bit of work/re-factoring with UncheckedArray[T]/etc. (For adix/lptabz, too.) PRs along these lines are more than welcome.

why: Ref objects made it easier to shuffle around when trying to find useful primitives (aka methods/objects).

odd: that it was accepted only by amd64 compiler

why: Something like key-queue reflects better what this data structure is for. KQueue would be the preferred name but is too general and used already in the NIM library.

why: Not all generic functions were pulled while unit testing.

why: LRU will be needed for tx-pool. So, keequ has all the primitives needed for setting up a LRU queue. This will eventually replace lru_cache (which is over-egineered and ugly.)

GasInt > AccountNonce and AccountNonce > GasInt.

effectiveGasTip >: local/remote > nonce

zah · 2021-09-16T14:18:35Z

nimbus/utils/tx_pool.nim

+           Result[void,TxPoolError]
+    {.inline,gcsafe,raises: [Defect,CatchableError].} =
+  ## Convenience wrapper
+  var txs = @[tx]


No need for allocating a sequence here, a regular array would be fine.

zah · 2021-09-16T14:28:28Z

nimbus/utils/tx_pool/tx_base.nim

+  ## Walk over item lists grouped by sender addresses.
+  var rc = xp.bySender.first
+  while rc.isOK:
+    let (key, itemList) = (rc.value.key, rc.value.data)


Is it necessary to copy the data in itemList here as opposed to yielding directly rc.value.data?

no need to copy here as this is an internal module; will be important reassign() in the next patch when passing TxItemRef rather than IDs (this one covered in unit tests)

zah · 2021-09-16T14:29:34Z

nimbus/utils/tx_pool/tx_base.nim

+# at your option. This file may not be copied, modified, or distributed except
+# according to those terms.
+
+## Transaction Pool Basig Primitives


zah · 2021-09-16T14:48:55Z

nimbus/utils/ec_recover/ec_helpers.nim

+# Private, debugging ...
+# ------------------------------------------------------------------------------
+
+proc joinXX(s: string): string =


This code seems to be copy/pasted in multiple modules.

.. and is debugging only, should go away later

zah · 2021-09-16T14:51:59Z

nimbus/utils/keequ.nim

+    data: V       ## Some data value, can freely be modified.
+    prv, nxt: K   ## Queue links, read-only.
+
+  KeeQuPair*[K,V] = object ##\


You can add a TODO item to introduce view types for these return values once we upgrade to Nim 1.6

Also, consider using a tuple for this. This would enable the users to use the tuple unpacking syntax.

will try, ok :) -- applies to similar cases

Sorry, I was wrong. Returning a tuple gloriously fails within the ok() part of a Result[], see #817

zah · 2021-09-16T15:55:22Z

nimbus/utils/tx_pool/tx_item.nim

+# Public functions, getters
+# ------------------------------------------------------------------------------
+
+proc id*(item: TxItemRef): auto {.inline.} =


In Nim 1.2.x this will compile to a function that copies the 32 bytes of the id each time it's called.
I tend to use templates for such simple accessors because they avoid the problem and they reduce the work the compiler needs to do (less C code to produce in the codegen pass).

ok, thanks -- has changed anyway (but good to know for other instances)

zah · 2021-09-16T15:57:01Z

nimbus/utils/tx_pool/tx_jobs.nim

+      discard
+    of txJobsInactiveJobsEviction:
+      discard
+    of txJobsJocalTxJournalRotation:


Typo? (Jocal)

zah · 2021-09-16T16:00:00Z

nimbus/utils/tx_pool/tx_jobs.nim

+    return id
+
+
+proc txDelete*(t: var TxJobs; id: TxJobID): Result[TxJobPair,void]


Why do we call deleting a job txDelete? Calling all deletion functions del makes sense, but calling them "transaction delete" not so much

I use it for the 2nd level list/queue drivers as syntactic sugar -- not ment to be seen outside

zah · 2021-09-16T16:01:08Z

nimbus/utils/tx_pool/tx_list.nim

+    txListVfyLeafQueue ## Corrupted leaf list
+    txListVfySize      ## Size count mismatch
+
+  TxListMark* = ##\


If the SLst is keyed by gas what are we going to use the TxListMark for?

a blind value as commented, the dups queue is a keyed list with a value entry (I do not see that I can have a unified module for (key,value) and key only with no value)

should not be exported, renamed type as BlindValue

zah · 2021-09-16T16:03:49Z

nimbus/utils/tx_pool/tx_nonce.nim

+  TxNonceItems* = object ##\
+    ## Item list indexed by `AccountNonce` > `GasPrice`.
+    size: int
+    nonceList: SLst[AccountNonce,TxNoncePriceRef] ##\


Why do we need a sorted list by "account nonce"? i.e. in what context do we compare accounts by their nonces?

this one has been removed already, but the go ref implementation defines the interface returning transaction lists sorted by nonces (so this this list will re-appear as a sub-list somewhere)

main points: + Constructors as init(T: type XXX): T rather than initXXX(): XXX + Bury unused queue values for key-only queues in KeeQu implementation (no TxListMark type used, anymore) + Redefined 'auto' return codes

why: All txDB functionality hidden by tx_tabs (formerly tx_base), table drivers in sub-directory other: The module tx_jobs at the same level as tc_tabs and the fancy naming txInit, txAdd etc. has been corrected to init(), add().

why: Cascading operators as in `bySender.eq(someAddr).eq(local = false))` is is preferred over something like `bySenderEq(someAddr, local = false)`

details: per sender address buckets: all -- all entries in the same bucket local/remote -- all entry buckets grouped by locality queued/lending/included -- all entry buckets grouped by status buckets: are sorted by nonces

why: Concurrent pool access supported details: Currently not very smart, only debugging interface. Provision for + out-of-band/priority jobs (just unshift instead of append) + Can stop/resume execution at any job

details: Using 50k blocks network capture for import/replay. This allows for about 30k blocks initial block chain. The remaining stuff is enough for playing with at least 1500 transactions.

why: Background processor cannot directly respond. So expired, unwanted, or wrong transactions along with meta-data will silently be moved to the waste basket for further investigation and/or deletion.

why: Actions on the queue must be serialised, so they run through the job monitor which allows unique access to the database for the call-back function while running.

details: Incoming transaction are classified and tagged queued or pending, or moved to the waste basket

why: bails out on ci for i386, otherwise

why: Both, tx-pool and clique unit tests can use the same dump

why: Less overhead by using semaphores

arnetheduck · 2021-10-05T09:44:48Z

I'm trying to understand the background for this PR, and the motivation for choosing this approach in particular.

From what I understand, this is a port of the geth implementation, but with changes. What I'm missing is the rationale for doing a massive effort porting an existing implementation where it's unclear whether it solves our problem (geth and nimbus priorities don't necessarily align), whether it solves the problem well, whether it is burdened by historical decisions that no longer are relevant, specially in a post-merge world where block packing often is outsourced, and whether it fits nim to begin with, given tooling, collections, threading styles etc - already, there's enough code for an entire collection library in here which speaks to the fact that it might not be the best fit - specially since transaction packing is a necessary but low-priority feature that we don't want to spend sophistication on - block packing will to a large degree be outsourced post-merge.

mjfh · 2021-10-05T09:50:20Z

It is not a port - just taking some ideas from the go implementation. AFIK there are quite a few pieces missing in our nimbus implementation and comprehensive public documentation is a precious i find hard to come by.

arnetheduck · 2021-10-06T05:41:06Z

Nonetheless, what I'm trying to understand, and am asking for, is information / background about why this particular approach was taken - as a study, it's interesting, but it also brings in a complex algorithm depending on multiple custom collections and indices which both represents a long development time (contrary to our needs/requirements to quickly create a prototype of the merge and a light client implementation) and a significant maintenance burden (that may not be beneficial for nimbus for the long term, given that the landscape of block production, and therefore transaction pooling, changes significantly with the merge and projects like flashbots). When embarking on a month-long quest to produce a transaction pool, these are critical questions to answer, before proceeding to the code stage.

To give an example, a simple hash table bounded to the 1000 most profitable entries without any further indices would be enough to cover our needs for the next 6 months, and would allow us to focus on more important compatibility problems with goerli and the merge, allowing us to perform the tx pool work with a better understanding of the landscape. What is the rationale for not pursuing a strategy like that?

why: jobs might be added concurrently

c-blake · 2021-11-01T15:18:46Z

I decided waiting on me to port the ANSI C stuff to Nim was perhaps less than ideal. So, I just published the C version. To my knowledge there is nothing of similar efficiency & generality (in any programming language) in the open source BST arena { although maybe copies will now start to spring up! :-) }.

This always struck me as The One True Way to implement binary search trees..low node space overhead, low time overhead (no recursion), high flexibility (dups or not, rank/positional or not, etc.) and high generality all in really very compact code. It's only like 4 or 5 main abstraction decisions, but combined they do a lot of reduction. A lot of the times, I mention a bunch of abstract nice properties and people smile & nod and just do their own thing, though.

Also, if someone with some time wants to port to Nim, I'd be happy to include it in adix and maybe weave it into the B-tree node structure there so that B-tree edits can be more scalable.

mjfh added 3 commits August 13, 2021 19:00

Sorted list framework based on red-black tree

3375364

why: Flexible framework for constructing sorted lists with data entries that can be transparently related/linked. details: The underlying red-black tree can theoretically replaced by any other sorted table.

Data queue with keyed random access

8fc3749

why: The DoublyLinkedList data type does not allow for efficient random data access (find() is O(n).)

Activate SLst and RndQu tests

a2ecc76

mjfh added 5 commits August 23, 2021 19:00

complete rb-tree verification:

74b5d4b

why: was on `todo` status: all black nodes on any chain down the tree count the same lengths

Update queue interface

65ff9d4

why: No need for module name in methods/functions, see #799 (comment) details: update tests, comments

Update list interface

89afb61

why: No need for module name in methods/functions, see #799 (comment) details: update tests, comments

tx-pool test framework & play ground

9e0554e

details: Test setup and some simple stupid data collectors. It will not do much at the moment.

Jazzed up API

e34ada2

why: Added table [] sematics where appropriate

mjfh added 5 commits August 25, 2021 15:19

Make queue iterators robust against deletion

07c0005

why: Allow deleting current element in a `for` loop details: More unit tests, lots of small fixes

Make delete() non-discardable

e08f9ea

Update preliminary API and unit tests

a11a7cd

details: Basic structure is a keyed queue, organised by data arrival. Operations are consistent, verified by unit tests.

Re-implemented tx-pool basic queue with local/remote sub-queues

b0253d7

why: Logic was inspired by go reference implementation. details: Lists and queue pulled out into NIM sub-module source files.

Renamed queue/list value fields => data

b30b506

why: Enhances code readability details: Heavy use of return codes `rc` of type Result[] lead to repeated `value` symbols like `rc.value.value`. Now this reads `rc.value.data`.

mjfh added 2 commits August 27, 2021 12:57

Fix deletion functions

0526631

why: Garbled links after deleting the second entry relative to either queue end.

Test deletion of group of transactions based on gas price

b8a6a79

why: Logic was inspired by go reference implementation.

mjfh added 2 commits August 31, 2021 15:09

Replaced ref objects (aka pointers) by var objects

41e1f37

why: Ref objects made it easier to shuffle around when trying to find useful primitives (aka methods/objects).

Typo in object initialiser

92a5cc3

odd: that it was accepted only by amd64 compiler

mjfh force-pushed the jordan/transaction-pool-study branch from 6271944 to 92a5cc3 Compare August 31, 2021 16:19

mjfh added 2 commits August 31, 2021 18:22

Renamed rnd_qu/RndQueue => keequ/KeeQu

74f5b54

why: Something like key-queue reflects better what this data structure is for. KQueue would be the preferred name but is too general and used already in the NIM library.

Fix more typos

f998909

why: Not all generic functions were pulled while unit testing.

mjfh force-pushed the jordan/transaction-pool-study branch from c7dca49 to f998909 Compare September 1, 2021 13:32

Added LRU functionality to keequ

df6edaf

why: LRU will be needed for tx-pool. So, keequ has all the primitives needed for setting up a LRU queue. This will eventually replace lru_cache (which is over-egineered and ugly.)

mjfh added 5 commits September 10, 2021 12:22

Added go API functions: lookup sub-system

1616fdc

Re-arranged/split address groups into local/remote transactions

b3cb7be

Replaced byGasPrice list by two lists indexed

0c8f06d

GasInt > AccountNonce and AccountNonce > GasInt.

Reorganised byGasPrice list with cascaded index

1df3fd2

effectiveGasTip >: local/remote > nonce

Renamed TxGroup => TxSender

afc3491

zah reviewed Sep 16, 2021

View reviewed changes

mjfh added 17 commits September 17, 2021 14:35

Fixes and code improvements (thanks to Zah)

6d5fb38

main points: + Constructors as init(T: type XXX): T rather than initXXX(): XXX + Bury unused queue values for key-only queues in KeeQu implementation (no TxListMark type used, anymore) + Redefined 'auto' return codes

Shuffled source files a bit

02bc5ac

why: All txDB functionality hidden by tx_tabs (formerly tx_base), table drivers in sub-directory other: The module tx_jobs at the same level as tc_tabs and the fancy naming txInit, txAdd etc. has been corrected to init(), add().

Cleaned up list interfaces

de6d4e0

why: Cascading operators as in `bySender.eq(someAddr).eq(local = false))` is is preferred over something like `bySenderEq(someAddr, local = false)`

Renamed tx_tab list sources tx_queue => tx_itemid, tx_gas => tx_tipcap

a5f13ba

Added status management as sum-lists in sender table

a8eabfe

details: per sender address buckets: all -- all entries in the same bucket local/remote -- all entry buckets grouped by locality queued/lending/included -- all entry buckets grouped by status buckets: are sorted by nonces

Serialise jobs

e846cf9

why: Concurrent pool access supported details: Currently not very smart, only debugging interface. Provision for + out-of-band/priority jobs (just unshift instead of append) + Can stop/resume execution at any job

Simplify datatabases

6a49122

Added block chain from main-net for testing/TDD

7214b92

details: Using 50k blocks network capture for import/replay. This allows for about 30k blocks initial block chain. The remaining stuff is enough for playing with at least 1500 transactions.

Always delete via rejection queue (aka waste basket)

8c1deb3

why: Background processor cannot directly respond. So expired, unwanted, or wrong transactions along with meta-data will silently be moved to the waste basket for further investigation and/or deletion.

Replace transaction queue iterator as call-back service

65b9621

why: Actions on the queue must be serialised, so they run through the job monitor which allows unique access to the database for the call-back function while running.

Implemented pending queue/bucket handling

4e62dec

details: Incoming transaction are classified and tagged queued or pending, or moved to the waste basket

Explict u64 baseFee type in unit tests

606ae5c

why: bails out on ci for i386, otherwise

Move block chain dumps to separate directory

f66c2e6

why: Both, tx-pool and clique unit tests can use the same dump

Can re-calculate queued/pending txs on-the-fly

e0b1234

Re-orgaising some source files

22f2179

backup

e649e08

Moved simple jobs out of the batch queue to be executed immediately

9d573e4

why: Less overhead by using semaphores

Using async paradigm from chronos for processing batch queue

92489ff

why: jobs might be added concurrently

mjfh closed this Jan 13, 2022

mjfh deleted the jordan/transaction-pool-study branch January 13, 2022 11:45

		return id


		proc txDelete*(t: var TxJobs; id: TxJobID): Result[TxJobPair,void]

Jordan/transaction pool study #799

Jordan/transaction pool study #799

Conversation

mjfh commented Aug 13, 2021 • edited Loading

zah commented Aug 13, 2021

KonradStaniec commented Aug 14, 2021

mjfh commented Aug 24, 2021 • edited Loading

mratsim commented Aug 27, 2021

mjfh commented Aug 27, 2021

c-blake commented Aug 27, 2021

mjfh commented Aug 27, 2021 • edited Loading

c-blake commented Aug 27, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mjfh Sep 16, 2021 • edited Loading

Choose a reason for hiding this comment

arnetheduck commented Oct 5, 2021

mjfh commented Oct 5, 2021

arnetheduck commented Oct 6, 2021

c-blake commented Nov 1, 2021 • edited Loading

mjfh commented Aug 13, 2021 •

edited

Loading

mjfh commented Aug 24, 2021 •

edited

Loading

mjfh commented Aug 27, 2021 •

edited

Loading

mjfh Sep 16, 2021 •

edited

Loading

c-blake commented Nov 1, 2021 •

edited

Loading