-
Notifications
You must be signed in to change notification settings - Fork 210
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve database model conversion performance by caching extra data and keeping row cache #3534
Conversation
SDK Size
|
SDK Performance
|
93d2f62
to
e424ea0
Compare
SDK Size
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a couple of comments for now 👍
// Refresh the state by merging persistent state and local state for avoiding optimistic locking failure | ||
for object in self.writableContext.updatedObjects { | ||
self.writableContext.refresh(object, mergeChanges: true) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why was this needed before, and now it is not? 🤔 What are the chances of this producing some regressions? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't there a chance that Data will be missing or the FRC won't trigger updates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we need more details on why was this removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked into the history of this and it goes like this:
- In 2021 it started as an optimisation change for reducing context changes
// If you touch ManagedObject and update one of it properties to same value
// Object will be marked as `updated` even it hasn't changed.
// By reseting such objects we remove updates that are not updates.
for object in self.writableContext.updatedObjects {
if object.changedValues().isEmpty {
self.writableContext.refresh(object, mergeChanges: false)
}
}
refresh(_:mergeChanges:false) = "Updates the persistent properties of a managed object to use the latest values from the persistent store. If flag is false, the context discards pending changes and the managed object becomes a fault. Upon next access, the context reloads the object’s values from the persistent store or last cached state."
- In early 2023 it gets changed to
self.writableContext.refresh(object, mergeChanges: true)
"If flag is true, the context reloads the object’s property values from the store or the cache. Then the context applies local changes over the newly loaded values. Merging the local values into object always succeeds, and never results in a merge conflict."
I think this was done because change 1 was making FRC to lose track of changes.
Reasoning:
We use DTOs' willSave to force FRC updates by accessing the "parent" DTO and assigning a property and setting back the same value to the DTO. For example, UserDTO is getting saved and if it happens to be CurrentUserDTO.user, then UserDTO.willSave forces the CurrentUserDTO to change as well. Then FRC observing the CurrentUserDTO picks up the change. Discarding changes in these cases breaks FRC observing which I think the change number 2 tackled. The change in number 2 made sure these updates go through.
From UserDTO:
if let currentUser = currentUser, !currentUser.hasChanges {
let fakeNewUnread = currentUser.unreadChannelsCount
currentUser.unreadChannelsCount = fakeNewUnread
}
- I removed the if check when testing if it fixes these some crashes in CoreData. It didn't.
Since the change number 2 some things have happened like we do not use viewContext anymore (except reading the state in DataStore, DB observers do not use it anymore). All the DB mutations go through the writableContext
. Therefore, there should never be a case where writableContext
itself would require to refresh the DTO from the persistent store (because it is the only one changing it). That is the main reason I believe we can remove it.
Interesting side of it is that refreshing an object seems to discard cached data in Core Data (cached data in other contexts) which leads to performance degradation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't we also introduce a state layer context? Will this have an impact on it, have you checked that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For state layer, with similar test, the change is 1.60 s to 0.77 s (as expected)
self.init( | ||
dateFormatter: iso8601formatter, | ||
dateCache: dateCache, | ||
rawJSONCache: RawJSONCache(countLimit: 500) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the performance benchmark table, we are not checking the memory usage. Is there any significant increase in memory by caching all the extra data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't expect it, but maybe we clean it up on memory warning notifications, just in case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Internally it is NSCache which clears itself on low memory warning. I used a wrapping class because of unit-tests. Can't mock NSCache from the Swift side (it is objc generic class and brings restrictions).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} catch { | ||
memberExtraData = [:] | ||
} | ||
let memberExtraData: [String: RawJSON] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another interesting thing that we could try to improve is that ChatMember
and ChatUser
are both classes. And I think we should be able to convert them to structs without API changes. Could be interesting to see if there are any performance improvements by doing so 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you think structs would improve performance? Probably only the SDK size would increase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ChatUser.init itself is visible in the time profiler, but it is like 1 ms which is nothing compared to other things happening. At some point it would be nice to only use structs for models (consistent).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually structs are lightweight to create and destroy, so it should improve things, but maybe not significantly. @laevandus We not only need to check the init, but also how much time is spend destroying existing objects 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on time profiler I don't see this popping up really, a couple of ms for ChatUser.__deallocating_deinit
and ChatChannelMember.__deallocating_deinit
. We should do that for consistency, but in a separate ticket. No real performance improvement here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay nice 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! I'm only concerned about the refresh removal, let's test this a bit.
} catch { | ||
memberExtraData = [:] | ||
} | ||
let memberExtraData: [String: RawJSON] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you think structs would improve performance? Probably only the SDK size would increase.
// Refresh the state by merging persistent state and local state for avoiding optimistic locking failure | ||
for object in self.writableContext.updatedObjects { | ||
self.writableContext.refresh(object, mergeChanges: true) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we need more details on why was this removed.
self.init( | ||
dateFormatter: iso8601formatter, | ||
dateCache: dateCache, | ||
rawJSONCache: RawJSONCache(countLimit: 500) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't expect it, but maybe we clean it up on memory warning notifications, just in case?
41c1b22
to
7c25aea
Compare
…nd not force refreshing updated objects which affects Core Data row caches
7c25aea
to
9f33d73
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! ✅ ⚡
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM ✅
Quality Gate passedIssues Measures |
🔗 Issue Links
Resolves: IOS-610
🎯 Goal
📝 Summary
🛠 Implementation
Performance measurement setup
Demo user
Chewbacca
has 19 muted channels and has a channel named "Performance Testing" with 12 members where each of the member has extra data filled with 162 fields (nested data).I added a simple snippet to the demo app (locally) which sent 20 messages with 1 second delay. While the message sending was taking place, I used time profiler to measure how busy threads are.
Time spent for extra data decoding went down a lot as expected, but time profiler was also showing that Core Data is busy fetching data for DTOs when
asModel()
functions were called (for ChannelDTO.asModel it was around 40% of time). What triggered this behaviour is the snippet in theDatabaseContainer.write
where updated objects are fully refetched from the persistent store. This seems to affect the Core Data's row cache and makes it to discard it which in turn means thatbackgroundReadOnlyContext
needs to refetch everything (because each context has its own DTO objects). The current state is that all the writes go through thewritableContext
, therefore there is not any need to actually sync update objects in the write call because there is no-one else changing the persistent store.🎨 Showcase
Baseline
Cache - only extra data caching
Cache & merge off - extra data caching and not force updating updated objects
Cache & merge off & data prefetching - extra data caching, not force updating updated objects, and relationship prefetching enabled (
StreamRuntimeCheck._isDatabasePrefetchingEnabled
)ChannelDTO.asModel()
(from channel controller)ChannelDTO.asModel()
(total)CurrentUserDTO.asModel()
(total)Caching extra data also reduces the memory footprint for RawJSON
Takeaways
refresh:mergeChanges:
affects a lot (~45% improvement in channel controller)-
CurrentChatUser
should not reference fullChatChannel
models (cids would be better)🧪 Manual Testing Notes
☑️ Contributor Checklist