[WIP] Use keyset delta algorithm to speed up GetData requests #3896
Conversation
My 2c, since I was the last one in here. I can't commit to a full review of this even out of draft, but I want to help you avoid the pitfalls I fell into during my time in this layer and steer you towards more beneficial work. How much testing have you done with this? My understanding of the startup path from @chimp1984 is that the capabilities aren't set on these initial requests, so switching on them isn't going to work. Here is the relevant discussion on my proposed changes to these requests a few cycles back: #3610 (comment)
Some other minor notes after a quick review:
This is a risky enough area and a non-trivial amount of work. It would have been good to see a proposal, so the work involved to review, merge, and test it could be analyzed before seeing the code. Everything right now needs a value-add, and I'd like to see data (known bugs, user reports, etc.) about whether the affected users for this performance patch will bring in the revenue to offset the cost. It might be better to target the known cost sinks for Bisq until the DAO is more self-sufficient.
Yes, I was planning to tidy it up a bit and separate it into multiple commits. I didn't want to sink too much more effort into it before getting some feedback. Most of the testing has been of the delta structure, although I did do some manual testing of the startup of the application on mainnet against a single seed node running the custom code on a separate machine (over Tor).
@stejbac @julianknutsen
I like the idea, and it's a good idea to add the capability exchange as well, as it will help in other areas. I agree with @julianknutsen that this will require a fair bit of testing and is not a trivial change. The question is whether it's worth the effort to do it now or to wait. A lot of users are complaining about slow or failed startups, so it's certainly an issue that, when solved or mitigated, would increase usability and with that attract more users. As we don't have a good measure of the number of users, and an even worse measure of the number of users having trouble with this, it's hard to make a well-founded decision. It would be good to at least have an estimate of the amount of work needed to get this to production.
I was largely motivated to look into this by problems I've experienced from time to time starting up the application on my home network (and also on some other poorer networks, like a shared hotel WiFi and a phone WiFi hotspot in a rural area with a weak signal). If the local network is particularly bad, I've found it's basically impossible to start the application, due to repeated timeouts (without, say, manually raising them in the code from 90s to 180s). I think it will be necessary to address the issue fairly soon for scaling purposes, since the ~2MB request will grow over time as the trade statistics and witness data mount up. Also, the ~3MB response (consisting mainly of offers and mailbox entries) will presumably increase in proportion to the trading volume, which will hopefully increase a lot during the next rally (whenever that is). So a sharply increasing proportion of users is likely to be affected over time.
I think we have to separate the different types of messages. Trade statistics and account age witness objects are append-only and ever-growing; together they are about 80 000 objects, as far as I remember. Offers and mailbox messages do not grow linearly with the number of users or trade volume, IMO, as the conversion time from an open offer to a trade will be faster with more traders. Offers and mailbox messages stay in the same range most of the time, as far as I remember; currently it's 383 offers and 455 mailbox messages [1].

Trade statistics and account age witness use a 20-byte hash (other data uses 32 bytes). I think the optimisation of the keys (e.g. a Bloom-filter-like structure) is still great, but it might get lower priority if trade statistics and account age witness data are treated differently. So I would suggest treating the two problems separately and developing a roadmap for how to implement and deploy both.

Regarding deployment: I also suggest using a new API (e.g. new message types) instead of overloading the existing one. It would still require a lot of care and testing, but might not be as difficult to deploy as other new features which require the same versions between regular nodes.

[1] From http://104.248.88.175
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I had this issue in mind for months now (and obviously missed this suggestion). Here is what I would suggest. Please note that this is the result of previous discussions with @chimp1984 and others.

Proposal

Example

By adding the info "I am Bisq v1.2.1" to the list of known objects, we know what objects the client has, namely, the objects shipped with the data stores of v1.2.1.

Discussion

Advantages

Disadvantages

Aftermath

Please keep in mind that this is by no means complete. There are some hidden efforts for sure. But as a general direction, it would line up with the discussions and results of previous months.
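To make the example above concrete, here is a minimal sketch of what the seed-node side of this proposal could look like. The proposal contains no code, so all names here are hypothetical, and plain hex strings stand in for object hashes:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a version token such as "Bisq v1.2.1" stands in for
// every object key shipped with that release's data stores, so clients don't
// have to enumerate those keys in their requests.
class KnownObjectsResolver {
    // Object keys (hex-encoded hashes, for simplicity) bundled with each release.
    private final Map<String, Set<String>> shippedKeysByVersion = new HashMap<>();

    void registerRelease(String versionToken, Set<String> bundledKeys) {
        shippedKeysByVersion.put(versionToken, bundledKeys);
    }

    // The effective excluded-key set is the client's explicit list plus
    // everything implied by its version token.
    Set<String> resolveExcludedKeys(String versionToken, Set<String> explicitKeys) {
        Set<String> excluded = new HashSet<>(explicitKeys);
        excluded.addAll(shippedKeysByVersion.getOrDefault(versionToken, Set.of()));
        return excluded;
    }
}
```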
The bandwidth requirements are not really O(ln(N)) in the keyset size; they are more or less proportional to the size of the symmetric difference of the keysets, that is, the number of keys the seed node has that the peer doesn't (in the case that the peer doesn't somehow have any keys unknown to the seed node). So as the datasets grow, the request shouldn't really grow in size. But if the peer has been offline for a while, the initial request will fail due to the difference in the keysets being too large for the initial default delta data structure to handle. In that case, it will retry with a larger request data structure that's highly likely to be able to accommodate the difference. So I think that's the main cost of this method: there's no guarantee the initial request to the seed nodes will succeed, which adds some extra delay to the startup of nodes that have been offline for a while (say for months, or maybe days/weeks in future if Bisq does 10x).

Perhaps it's better to think of the algorithm as being like that of rsync rather than Bloom filters. (It doesn't use a rolling hash, but like rsync it uses linear coding to detect and locate differences between two datasets.) Is it worth picking this up again soon to complement the approach taken in bisq-network/projects#25, or leaving it a while perhaps?
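To make the retry cost concrete, here is an illustrative sketch of how the requester might pick a delta tier from the responder's size estimate. Only the 8KB/550-key and 128KB/10450-key capacities come from the PR description; the intermediate capacities below are assumptions for illustration only:

```java
// Illustrative sketch of the retry sizing discussed above.
final class DeltaSizing {
    private static final int[] SIZE_KB  = {8,   16,   32,   64,   128};
    // Decodable key capacities: only the first and last values are from the
    // PR description; the middle three are assumed for illustration.
    private static final int[] CAPACITY = {550, 1200, 2500, 5100, 10450};

    // Pick the smallest delta tier that should decode the estimated symmetric
    // difference, or -1 if even the largest tier is too small (in which case a
    // key-range filter would be needed to shrink the difference).
    static int pickDeltaSizeKb(int estimatedDifference) {
        for (int i = 0; i < SIZE_KB.length; i++)
            if (CAPACITY[i] >= estimatedDifference)
                return SIZE_KB[i];
        return -1;
    }
}
```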
Is this still being worked on? I wanted to have a look at it and possibly try it out, but the latest commit is from January.
There is a project proposal in the making for possibly integrating this into Bisq. The current status is that this is not exactly what we need, since our status quo in terms of the number of objects would already overburden the solution by some factor. However, with bisq-network/projects#25 in place, this becomes interesting again. At least it is a strong candidate.
I've rebased onto master, as these changes were made a while ago.
This issue has been automatically closed because of inactivity. Feel free to reopen it if you think it is still relevant. |
When Bisq starts, it sends simultaneous `PreliminaryGetDataRequest` packets to two seed nodes to sync the latest trade statistics, offer book, mailbox and other data, followed by a later `GetUpdatedDataRequest` packet. The requests contain a list of keys of all the data to exclude (as we already have it) and are consequently about 2MB each. The responses to the first two `GetData` requests are typically around 3MB, it seems, usually consisting mostly of offer and mailbox items (as those are not persisted and have high turnover anyway). Thus there is typically about 10MB in total being sent/received in the preliminary `GetData` requests/responses to/from the seed nodes. On slow networks, this appears to frequently result in timeouts and difficulty starting up the application, as the 3MB response arrives after the 90s timeout window and gets rejected.

In an attempt to alleviate the problem, this PR replaces the `excluded_keys` list in the `GetData` requests with a delta structure (a sort of binary checksum/linear code syndrome), which is XORed with a version computed equivalently at the receiving node to obtain a decodable list of keys. This is the symmetric difference of the two sets of keys at each node, which tells the responding node which keys are missing from the peer.

The delta is decodable provided the symmetric difference isn't too large compared to the physical delta size (powers of two from 8KB to 128KB). With the data structure I used, an 8KB delta can reliably decode up to 550 keys and a 128KB delta can reliably decode up to 10450 keys. Since this holds true only when the keys are randomised, they are all hashed to a fixed 64-bit width using a randomly seeded secure PRF (in this case SipHash). 64 bits is enough to ensure a very low probability of collision per request, and deliberately causing collisions is impossible since the random seed/hash-salt is an ephemeral shared secret.

In case decoding fails, the responding peer estimates the size of the undecodable delta and sends that information back, so that the requesting peer can retry with a delta structure big enough to accommodate the expected symmetric difference. The keys can also be restricted to an arbitrary range of 64-bit values, which effectively filters out items at random, ensuring that the symmetric difference will be reliably decoded in the next request. The code will keep retrying until it gets everything.
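The PR description doesn't reproduce the delta structure's code, but an XOR-combinable checksum over randomised 64-bit keys that decodes small symmetric differences matches the shape of an invertible Bloom lookup table. The sketch below uses that structure, which may differ in detail from the PR's actual data structure; all names are hypothetical, and a plain mixing function stands in for the keyed SipHash PRF:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an XOR-combinable keyset delta: each randomised
// 64-bit key is mapped into one cell per hash function; XORing two tables
// cancels shared keys, leaving the symmetric difference, which is recovered
// by repeatedly "peeling" cells that hold exactly one key.
class KeySetDelta {
    private static final int HASHES = 3;  // cell count should be a multiple of this
    private final long[] keyXor;    // XOR of keys mapped into each cell
    private final long[] checkXor;  // XOR of per-key checksums, verifies pure cells
    private final int[] count;      // net number of keys in each cell

    KeySetDelta(int cells) {
        keyXor = new long[cells];
        checkXor = new long[cells];
        count = new int[cells];
    }

    void add(long key) {
        for (int i = 0; i < HASHES; i++) {
            int c = cell(key, i);
            keyXor[c] ^= key;
            checkXor[c] ^= checksum(key);
            count[c]++;
        }
    }

    // XOR with the remote table of the same size; keys both sides know cancel out.
    void subtract(KeySetDelta other) {
        for (int c = 0; c < count.length; c++) {
            keyXor[c] ^= other.keyXor[c];
            checkXor[c] ^= other.checkXor[c];
            count[c] -= other.count[c];
        }
    }

    // Recover the symmetric difference, or null if the delta was too small to decode.
    List<Long> decode() {
        List<Long> diff = new ArrayList<>();
        boolean progress = true;
        while (progress) {
            progress = false;
            for (int c = 0; c < count.length; c++) {
                // A "pure" cell holds exactly one key; the checksum confirms it.
                if (Math.abs(count[c]) == 1 && checkXor[c] == checksum(keyXor[c])) {
                    long key = keyXor[c];
                    int sign = count[c];
                    diff.add(key);
                    for (int i = 0; i < HASHES; i++) {  // peel the key back out
                        int c2 = cell(key, i);
                        keyXor[c2] ^= key;
                        checkXor[c2] ^= checksum(key);
                        count[c2] -= sign;
                    }
                    progress = true;
                }
            }
        }
        for (int c : count)
            if (c != 0)
                return null;  // undecodable: symmetric difference too large
        return diff;
    }

    // One disjoint subtable per hash function guarantees the cells are distinct.
    private int cell(long key, int i) {
        int sub = count.length / HASHES;
        return i * sub + (int) Long.remainderUnsigned(mix(key + i), sub);
    }

    private static long checksum(long key) {
        return mix(key ^ 0x5DEECE66DL);
    }

    private static long mix(long x) {  // cheap stand-in for a keyed hash
        x *= 0x9E3779B97F4A7C15L;
        return x ^ (x >>> 32);
    }
}
```

In this reading, the requester would build a table over its own hashed keys and send it; the responder builds one over its keys, calls `subtract`, and `decode` yields the keys held on exactly one side. A larger table decodes a larger difference, which is where the 8KB-to-128KB power-of-two tiers come in.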
The above requires new fields to be added to the `PreliminaryGetDataRequest`, `GetUpdatedDataRequest` and `GetDataResponse` proto definitions. I've tried to do this in a way that is backwards compatible, by adding a provisional `GET_DATA_KEY_SET_DELTA` capability and only using the new delta structure in requests to peers that flag the capability.

This PR is a draft and still needs some work (e.g. to the unit tests and commit messages). I'm also considering the following possible further enhancements:
In the initial requests to the pair of seed nodes, use the range filter to split the keyset in two and send a different half to each seed node, so each one only sends back approximately 1.5MB, cutting the network load in half again (see the sketch after this list). After getting one response, wait a little while (rather than the usual 90s) to get the other response before retrying.
Now that the requests are cheap, it might make sense to limit the responses to 1MB or less and make repeated requests to sync everything, rather than getting it all in one go. This should make startup more reliable on very slow networks. Also possibly add a corresponding progress bar to the splash screen.
Optimistically persist/cache the offer book and mailbox entries for faster startup next time, in case the application was only shut down for a few hours or days. Since the response now contains a list of unrecognised keys, any items that don't match can be purged from the cache and the rest promoted to the protected store (as the seed node definitely still has them).
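For the first enhancement above, here is a hedged sketch (hypothetical names) of the kind of range filter that could split the hashed 64-bit key space between the two seed nodes:

```java
// Hypothetical sketch of the proposed range split: each request restricts the
// (uniformly hashed) 64-bit key space to one half, so each of the two seed
// nodes syncs a disjoint half of the data. Bounds are inclusive and compared
// as unsigned 64-bit values.
final class KeyRange {
    final long lo;
    final long hi;

    KeyRange(long lo, long hi) {
        this.lo = lo;
        this.hi = hi;
    }

    boolean contains(long hashedKey) {
        return Long.compareUnsigned(hashedKey, lo) >= 0
                && Long.compareUnsigned(hashedKey, hi) <= 0;
    }

    // Since keys are hashed with a random PRF, each half holds ~50% of the items.
    static KeyRange lowerHalf() { return new KeyRange(0L, Long.MAX_VALUE); }   // [0, 2^63 - 1]
    static KeyRange upperHalf() { return new KeyRange(Long.MIN_VALUE, -1L); }  // [2^63, 2^64 - 1]
}
```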
I'm not sure whether it would be best to include the above in this PR or consider them for later PRs.