-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make serialisation in FileManager::saveToFile thread-safe #4025
Make serialisation in FileManager::saveToFile thread-safe #4025
Conversation
Minor change for consistency: narrow the signature of some remaining such methods, which have return type 'PersistableEnvelope'. (This excludes some other cases with return type 'NetworkEnvelope'.)
Add toProtoMessageSynchronized() default method to PersistableEnvelope, which performs (blocking) protobuf serialisation in the user thread, regardless of the calling thread. This should prevent data races like the ConcurrentModificationException observed in bisq-network#3752, under the reasonable assumption that shared persistable objects are only mutated in the user thread. Also add a ThreadedPersistableEnvelope sub-interface overriding the default method above, to let objects which are expensive to serialise (like DaoStateStore) be selectively serialised in the 'save-file-task-X' thread as before, but directly synchronised with each mutating op. As most objects are cheap to serialise, this avoids a noticeable perf drop without having to track down every mutating method for each store. In all cases but one, classes implementing ThreadedPersistableEnvelope are stores like TradeStatistic2Store, with a single ConcurrentHashMap field. These require no further serialisation, since the map entries are immutable, so the only mutating operations are map.put(..) calls which are already synchronised with map reads. (Even if map.values().stream() sees updates @ different keys happen out-of-order, it should be benign.) The remaining case is DaoStateStore, which is only ever reset or modified via a single persist(..) call with a cloned DaoState instance and hash chain from DaoStateSnapshotService, so there is no aliasing risk from the various DAO state mutations done in DaoStateService and elsewhere. This should fix bisq-network#3752.
the part about performance sounds awfully familiar to me. see #4003 |
I would like to get this PR into the next release if possible. It touches core DAO code so @ManfredKarrer needs to review it. I think it looks good from what I've seen so far. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK - but would prefer to see the suggested changes and want to emphasise that it has to be tested very well. Its certainly a high risk PR. But great to see some progress in that area! Thanks @stejbac !
I would suggest renaming toProtoMessageSynchronized
to toPersistableMessage
.
Also would suggest to make it more explicit and flexible by adding the new behaviour in a UserThreadMappedPersistableEnvelope
interface and keep PersistableEnvelope as non-safe-thread but fast interface with the default method:
default Message toPersistableMessage() { return toProtoMessage(); }
If this is applied it has to be carefully checked for each persisted object which interface is most suitable and check with profiling how the changes compare. There are some generic classes like PersistableList
which need to be considered as well (at the moment its all covered as PersistableEnvelope
is used for the new method.
There are several classes implementing PersistableEnvelope
like SignedWitness
but I think that is wrong as they are not persisted directly but wrapped in a storage class. I have not checked more closely but might be worth to clean that up in a separate PR.
I am not sure about the performance impacts when using UserThread mapping. Any delay > 20 ms is visually visible in the UI as lagging.
I would suggest to create 3 categories for all persisted objects:
- Big objects -> performance is critical, use threaded version (DaoState,...)
- Small objects, correctness is critical -> use UserThread mapping (SequenceNrMap,...)
- Other uncritical -> use non-thread safe option in dedicated thread (NavigationPath)
The uncritical ones could be kept as non-threadsafe IMO.
For further improvements the delays between write operations should be checked, there might be some optimisation possible, specially for those which have frequent write operations.
From the DAO perspective I don't see any obvious risk with that change, but it should be tested very well specially with DaoStateHashes and the monitoring for the 3 different hash chains.
For the most problematic objects (DaoState, TradeStatistics, AccountAgeWitness) I suggest to start to look into a different design as suggested in the related issues/PRs (e.g. use a historical data structure which is not handled by the P2PDataStorage and only loaded in memory on demand). For DaoState that approach might not work, but some solution to make it more scalable is required here as well.
|
||
/** | ||
* Interface for the outside envelope object persisted to disk. | ||
*/ | ||
public interface PersistableEnvelope extends Envelope { | ||
|
||
default Message toProtoMessageSynchronized() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the method name a bit confusing as it is only synchronized in ThreadedPersistableEnvelope
but here we use mapping to userThread. I would recommend using toPersistableMessage
as stated above.
|
||
/** | ||
* Interface for the outside envelope object persisted to disk. | ||
*/ | ||
public interface PersistableEnvelope extends Envelope { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe its better to use a specific interface here as well like in ThreadedPersistableEnvelope
? Might make it more explicit that we alter the basic non-thread-safe but fast behaviour. We could use then toPersistableMessage
as method name for all 3 cases and the interface marks which style is actually used. Would make it also a bit more distinct from the generic toProtoMessage method and shows its only relevant for persisting an envelope.
Make PersistableEnvelope method name less confusing, as it is only synchronised in ThreadedPersistableEnvelope.
Make the default toPersistableMessage() method of PersistableEnvelope simply delegate to Proto.toProtoMessage for speed, so that stores can explicitly implement (Threaded|UserThreadMapped)PersistableEnvelope if they actually need concurrency control. As part of this, make PeerList implement PersistableEnvelope directly instead of extending PersistableList, as it is non-critical & cloned on the user thread prior to storage anyway, so doesn't need be thread-safe. In this way, only PaymentAccountList & small DAO-related stores extend PersistableList, so they can all be made user-thread-mapped. After this change, the only concrete store classes not implementing (Threaded|UserThreadMapped)PersistableEnvelope are: AccountAgeWitness, BlindVotePayload, ProposalPayload, SignedWitness, TradeStatistics2, NavigationPath & PeerList The first five appear to erroneously implement PersistableEnvelope and can be cleaned up in a separate commit. The last two are non-critical. (Make NavigationPath.path an immutable list, for slightly better thread safety anyway - that way it will never be observed half-constructed.)
I've just pushed changes to address some of the issues pointed out above. I renamed Serialisation_Benchmarks_with_SequenceNumberMap.txt I think it would be better to make it continue to implement |
@stejbac Thanks for the implementation of the suggested changes! Could we make a list of all potential persisted objects (some are only created in rare cases like FailedTrades) to check for each persisted object if the adequate interface is applied? Would be also great to see a performance comparison before/after change (hope I don't get too annoying with those requests but I think as its a bit of a risky change its better to be extra careful here). I just found another issue (no actual problem but potential risk for future changes): |
Thanks for the list. Also the Thanks again for working on that! |
@ManfredKarrer Is this a requirement for this PR to get merged? If not please NACK or if it is up to be merged ACK from your side to make it clear for the maintainers 😄 . Thanks! |
@ripcurlx Thanks for reminding on that. I recommend to not change the behaviour of those 2 classes (not using |
Provide UserThreadMappedPersistableList subclass for persistable lists which need to implement UserThreadMappedPersistableEnvelope, instead of putting the interface on the base class. Make the (non-storage) classes MeritList and VoteWithProposalTxIdList keep the original PersistableList superclass, deriving the remaining subclasses of PersistableList from the new class instead. In this way, further persistence-related changes are less likely to inadvertently alter the behaviour of those two consensus-critical classes. Removing the superfluous PersistableEnvelope interface from the two classes (via the base class) will be done in a separate PR.
I've just made another commit to remove the This does, however, include three other |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK - based on #4025 (comment)
Tested it myself on Regtest (Arbitration, Voting Cycle, Trade, Account Creation) and I didn't see any issues so far.
Add
toProtoMessageSynchronized()
default method toPersistableEnvelope
, which performs (blocking) protobuf serialisation in the user thread, regardless of the calling thread. This should prevent data races like theConcurrentModificationException
observed in #3752, under the reasonable assumption that shared persistable objects are only mutated in the user thread.Also add a
ThreadedPersistableEnvelope
sub-interface overriding the default method above, to let objects which are expensive to serialise (likeDaoStateStore
) be selectively serialised in thesave-file-task-X
thread as before, but directly synchronised with each mutating operation. As most objects are cheap to serialise, this avoids a noticeable performance drop without having to track down every mutating method for each store.In all cases but one, classes implementing
ThreadedPersistableEnvelope
are stores likeTradeStatistic2Store
, with a singleConcurrentHashMap
field. These require no further synchronisation, since the map entries are immutable, so the only mutating operations aremap.put(..)
calls which are already synchronised with map reads. (Even ifmap.values().stream()
sees updates at different keys happen out-of-order, it should be benign, e.g. trade statistics might be persisted out-of-order if two trades come along at almost exactly the same time.)The remaining case is
DaoStateStore
, which is only ever reset or modified via a singlepersist(..)
call with a clonedDaoState
instance and hash chain fromDaoStateSnapshotService
, so there is no aliasing risk from the various DAO state mutations done inDaoStateService
and elsewhere.This should fix #3752.
--
Adding some temporary benchmark logging to
PersistableEnvelope
shows the time spent in eachsaveToFile
protobuf conversion forced to run on the user thread with the PR changes, from my Bisq node (running on mainnet) from startup to shutdown:Serialisation_Benchmarks.txt
As can be seen in the filtered log, the serialisation of
PreferencesPayload
,UserPayload
andTradableList
took 100-140ms (which might cause noticeable jitter on the user thread) the first time they were persisted upon application startup. However, subsequent serialisations of the same objects seem to occur much faster (presumably after the protobuf code has had a chance to warm up), with no (single-threaded)PersistableEnvelope
object taking longer than around 20ms to serialise once the application has started.The assumed expensive-to-serialise objects like
DaoStateStore
and those managed byMapStoreService
all implementThreadedPersistableEnvelope
, so don't appear in the above benchmark and are converted directly on eachsave-file-task-X
thread as before.