Reduce initial request size #4233
Conversation
I've read through the core parts of the PR. For now just adding comments to gain a better understanding. This PR makes me concerned that we might lose data during upgrades or otherwise, but I don't have a full understanding yet.
Force-pushed 7169457 to 599110a
Just rebased to master and folded in the first batch of feedback (thanks @sqrrm for your efforts). The build is failing because JDK 12 does not support removing final modifiers through reflection. If I removed the tests that rely on that, everything would be green. Not sure how to handle that. And no, there is no other way to test this except loosening something that clearly is final to being not final in the first place.
Thinking a bit about it, I guess this is conceptually the same model we have now with one store, just split into many.
I am having a hard time getting a good overview of what's going on, but I think there is some muddling of concerns. As I see it, a new client would not have to know about the old format. Only the nodes delivering data (typically seed nodes) need to distinguish the kind of request they're getting and deliver data accordingly.
Requested changes. Mostly related to clean-up, but still significant. I found readability difficult and would appreciate more helpful names and more intermediate variables.
I wrote up a description of the PR and the problem it's solving. I'll include it below. Some of the things there may not be correct; it represents my ongoing learning about Bisq's P2P (and this PR). @freimair, after corrections, if they're necessary, would you consider including some of this information in an appropriate place in the code? I think future devs will appreciate an explanation of why this datastore design was chosen. I would appreciate feedback if I got something wrong.
A datastore is a store of Bisq p2p objects. It is synced peer-to-peer and peer-to-seednode,
and in this way makes up a distributed database.
I presume that a peer's datastore is synced in this way:
- a peer/client starts up;
- it starts a "bulk sync" procedure:
- gathers up UIDs (or hashes) of the objects it already has in its datastore;
- uploads those UIDs to a seed node;
- seed node takes a set difference of the passed UIDs and the UIDs of the objects it has;
- seed node uploads to the peer/client the objects corresponding to that difference;
- peer then enters p2p mode, listening for and publishing updates.
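If my understanding is right, the seed node's side of that procedure boils down to a set difference, roughly like the following sketch (hypothetical names, not Bisq's actual API):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the seed node's side of a bulk sync: the peer sends
// the keys (hashes/UIDs) it already has, and the seed node answers with every
// object whose key is not in that set.
class BulkSyncSketch {
    static Map<String, byte[]> buildResponse(Set<String> keysKnownByPeer,
                                             Map<String, byte[]> seedNodeStore) {
        Map<String, byte[]> response = new HashMap<>();
        seedNodeStore.forEach((key, payload) -> {
            if (!keysKnownByPeer.contains(key))
                response.put(key, payload);
        });
        return response;
    }
}
```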
The problem is that this "bulk sync" procedure includes the peer sending a large number of UIDs
to the seed node, which can time out when network throughput is insufficient, causing the
sync to terminate unrecoverably.
That shows two problems:
- Bisq does not recover from throughput issues;
- the bulk sync is bandwidth-intensive.
This spec addresses the bandwidth-intensity.
Chronological information in the p2p objects would allow querying them by comparing objects to a
timestamp-like value; however, the p2p objects do not contain chronological information. It is
unclear whether introducing a timestamp would have any negative effects, but it would involve
making sensitive changes.
Freimair's proposal is about using chronologically-indexed datastore "checkpoints" that are
distributed with releases
- (and thus identical across clients and seednodes)
- each checkpoint contains objects generated since the last checkpoint
- the first checkpoint contains all prior objects
- this effectively assigns chronological information to objects
- precision of which, for the purposes of bulk sync, corresponds to frequency of releases;
- this way a bulk sync can be initiated by identifying
- the latest checkpoint that the client has
- and the objects it has stored since then (objects not in any checkpoint known to it)
- which reduces the bandwidth requirement
- to the equivalent of uploading only the UIDs that are newer than the release used
- because identifying the checkpoint is negligible bandwidth-wise.
Requirements for the checkpoint datastore:
- initializable from the previous monolithic datastore or from scratch;
- a "running" checkpoint has to be maintained;
- which holds objects generated since the last "release" checkpoint;
- at no point can the datastore be mutated, except via appends;
- "release" checkpoints must be identical across peers and seednodes.
storage.queueUpForSave();
return historicalStore;
}
Why are we copying historical datastore checkpoints/splits out of resources? Because that's what the live-store-only solution we currently use does? For troubleshooting purposes? Seems unnecessary, though I may be missing something. I presume we keep the live store in the working directory, because we can't mutate resource files (is that even true?).
Why are we copying historical datastore checkpoints/splits out of resources? Because that's what the live-store-only solution we currently use does? For troubleshooting purposes? Seems unnecessary, though I may be missing something.
The existing file storage mechanisms are all tailored to working in the working directory. Changing that would be equivalent to replacing the file storage mechanism with something else entirely. Hence, I'm leaving it as is (until bisq-network/projects#29).
I presume we keep the live store in the working directory, because we can't mutate resource files (is that even true?).
Well, obviously, the "live store" is something that is created and maintained by the Bisq app per installation and even per (configurable) app name and path. Placing that in the resources would only allow for a single instance of the data store AND would be overwritten and therefore invalidated on every app upgrade.
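For context, the copy-on-first-start behaviour being discussed amounts to something like the following generic sketch (plain Java NIO, not Bisq's actual file utilities):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Generic sketch: copy a read-only datastore bundled in the app's resources
// into the mutable working directory, but only if it is not already there,
// so that local changes survive upgrades.
class ResourceCopySketch {
    static void copyResourceIfMissing(String resourceName, Path workingDir) throws IOException {
        Path target = workingDir.resolve(resourceName);
        if (Files.exists(target))
            return; // never overwrite an existing (possibly newer) local store
        try (InputStream in = ResourceCopySketch.class.getResourceAsStream("/" + resourceName)) {
            if (in == null)
                throw new IOException("resource not found: " + resourceName);
            Files.createDirectories(workingDir);
            Files.copy(in, target);
        }
    }
}
```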
@dmos62 thanks for your review. Appreciate it. First of all, your PR description is correct. Furthermore, I believe I can address all of your comments in summary. That being said, bisq-network/projects#29 is in the queue. That project will get rid of copying files around (because there will be no more individual files), the API will need a major overhaul, and the filter might get enhanced. Now to your obvious question: why not do it properly? I'd rather not spend time on reworking the same thing twice 😄 because a) I am a lazy bastard, and b) we need to use the resources we have efficiently. And of course, coding styles and our brains differ, so there is no perfect PR anyways.
Force-pushed f5d02d1 to 1d70856
@freimair followed up the above post with a Keybase message and we had a brief exchange there. I'll summarize how I see this situation and what my concerns are:
My concerns:
Idea: Stack a backup system on top of this in such a way that the mutations we perform cannot lose data. Of course we have to be aware that this is addressing complexity with more complexity. Maybe this is an example of how not to solve this situation.
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
A few "high level" comments/thoughts: By splitting up the data into groups we are creating "blocks" (I use that terminology with intention). How to uniquely identify a block in a distributed system is a challenge. The approach used here makes the shortcut via the release maintainer to be the single entity to define what is the right "block". This might not be an issue (or at least does not make things worse as it is) but I am not 100% sure. The checkpoint terminology reveals the centralisation issue. Is there another way to do it? Another approach could be to use a fixed time period window (e.g. 10 days) and fill up all data whose date falls into that period. The GetData request would send the index of the time window and the hash of its content. If the Seed node has a different hash it will send the whole package. For older data that should be very rare and the additional network overhead might be a good trade off with the more simple approach and low processing costs. The hash gets calculated once and is stored with the package. A problem is how to handle the case if a user has more data items than the seed node. The it would request each time again the package. If a seed is missing data it would get dossed by the network.... Another approach might be to use Bloom filters for the data request, but that is likely expensive for the seed node to process. A probably bigger problem than the 2.5 MB request is the load on the seed node to process the response. |
The main problem that we're trying to get around is that p2p objects are meant to be indexed by hash and don't have properties that would allow them to be indexed by something lightweight (like time). Bloom filters are normally the solution for these situations, but their processing is memory-intensive, which opens up Bisq to denial-of-service attacks. It might be worthwhile to look at academic literature on the topic. I can't dig deeper now, but I found some interesting articles following these links. Probabilistic set-membership data structures are actively researched:
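For illustration only, a Bloom-filter-based request could look roughly like this sketch using Guava's BloomFilter (whether the false-positive rate and the processing cost on the seed node are acceptable is exactly the open question):

```java
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.nio.charset.StandardCharsets;

// Sketch only: the peer sends a Bloom filter of the keys it has; the seed node
// returns every object whose key is probably NOT in the filter. False positives
// mean some objects the peer is missing are not delivered in this round.
class BloomFilterRequestSketch {
    static BloomFilter<String> buildRequestFilter(Iterable<String> knownKeys, int expectedKeys) {
        BloomFilter<String> filter = BloomFilter.create(
                Funnels.stringFunnel(StandardCharsets.UTF_8), expectedKeys, 0.01);
        for (String key : knownKeys)
            filter.put(key);
        return filter;
    }

    static boolean seedNodeShouldSend(BloomFilter<String> peerFilter, String objectKey) {
        return !peerFilter.mightContain(objectKey);
    }
}
```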
@dmos62 Thanks for your input! After a discussion with @sqrrm we agreed the best way forward is to limit a first mitigation to trade statistic objects only. This will reduce the bandwidth requirement by 50% as well as the seed node processing load by 50% (as it's about 50% of the hashes). It will also reduce the RAM required by the Bisq app if that data is only loaded on demand. Rough idea: prune existing trade statistics by removing all data older than a cut-off date, e.g. 2 months (TBD). Beside that, handling of trade statistics data stays exactly as it is now, but instead of 4 years of data we only deal with 2 months. I assume that will be about 1-5% of current data. This approach greatly reduces the risks: if anything goes wrong we only screw up trade statistics, which are non-critical data. Backward compatibility: I might work on that myself in a couple of weeks if I find time. Otherwise, if anyone wants to pick it up, please leave a message.
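A minimal sketch of the pruning part of that rough idea (the 2-month cut-off and the map shape are placeholders, not the actual trade statistics API):

```java
import java.time.Duration;
import java.util.Map;

// Hypothetical sketch: drop trade-statistics entries older than a cut-off
// (e.g. 2 months) so only a small, recent window of non-critical data is
// kept, persisted and synced.
class TradeStatsPruningSketch {
    static void pruneOldEntries(Map<String, Long> hashToCreationTimeMillis, long nowMillis) {
        long cutOffMillis = nowMillis - Duration.ofDays(60).toMillis();
        hashToCreationTimeMillis.entrySet().removeIf(e -> e.getValue() < cutOffMillis);
    }
}
```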
Having discussed it and considering the risks, it seems much safer to only focus on the trade stats to begin with. The risks are very high if any of the DaoState data or account witness data is lost. Losing some trade stats data is not critical in the same way, but could still yield results. It would also help to prove the concept.
I started working on it: #4405. Let's move the discussion over there...
Force-pushed 9c43a2d to 9d3d650
A bit late, I know, but I can be online again (local infrastructure is fixed) and I am taking my time to comment on your discussions. Although I would suggest a call sometime, just to get somewhere.
ad rough idea: pruning data does not solve the problem.
ad another way to do it:
ad backwards compatibility:
Up until now, the tests needed to create a version history set for testing. Those changes had to be made to a final static variable at runtime. Now enough versions are in the history set anyway, so we could remove this part of the testing framework.
Force-pushed 3542e27 to 5012aad
Force-pushed 5012aad to 76cc37e
I thought I could get by with using a fixed version string so we can have tests involving future data stores...
NACK
The basic concept is OK, but many implementation elements are not acceptable IMO. Especially the hack of encoding the version into the payload of the hashes instead of adding a new protobuf field to the request message.
I suggest that we use #4519, which is based on this PR but has a lot of changes. It is still WIP and not much tested; it should be ready for review in a few days...
Fixes #2547, fixes #2549, fixes #3763, part of bisq-network/projects#25
This PR reduces the initial request size to O(1). In numbers: only 1 historical key is now sent to the seed nodes. Compared to 1.3.4, that is 1 vs. 100k+ object keys and 2x1kB vs. 2x2500kB at startup.
Details
This is a big step towards a more pleasant user experience, and, due to changes to parts of the data store implementation, it is also high risk.
I see the split data store implementation as a temporary solution until I can proceed to bisq-network/projects#29. Hence, even though there could be more tests, there aren't.
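To illustrate the O(1) claim above, a hedged sketch of the information the initial request now carries (note that the actual PR encodes the version into the payload of the hashes rather than a dedicated field, as criticized in the review above; the names here are made up):

```java
import java.util.Set;

// Sketch only: instead of listing 100k+ object keys, the initial request
// carries a single marker identifying the newest bundled checkpoint plus the
// few keys collected locally since that checkpoint was shipped.
class InitialRequestSketch {
    final String newestCheckpointVersion; // e.g. "1.3.4" - O(1) regardless of history size
    final Set<byte[]> keysSinceCheckpoint; // typically small

    InitialRequestSketch(String newestCheckpointVersion, Set<byte[]> keysSinceCheckpoint) {
        this.newestCheckpointVersion = newestCheckpointVersion;
        this.keysSinceCheckpoint = keysSinceCheckpoint;
    }
}
```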
Testing efforts
I created a number of test classes handling the basics of the migration and update scenarios, everyday use cases, plus a few special cases. Additionally, there is a shell script running a kind of manual integration test, which helped in finding more issues. The "manual" test takes care of setting up a testing testnet and, going through the steps, upgrades pieces of software as well. Take a look at it if you so please. The results of the test have to be observed manually, although there is a
result.log
once it finishes which lists all the request and response sizes (please note that some logs appear twice because that is simply how it is). Please give this a thorough test. If there are any questions, please feel free to contact me on Keybase.
EDIT: As always when trying to test disk I/O, Travis complains and fails the tests. Please try the PR locally anyway. I will deactivate the tests for Travis before the PR is merged.