-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Data below a threshold should not be stored in DataStore. #456
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔥 🙏 🎸
Thanks @LiranCohen! Pretty much there, with a few comments.
I have not looked at the tests thoroughly because my comments impacts the tests written!
f9a37c4
to
5398907
Compare
5398907
to
8984696
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed the initial discussion around this PR, but I'm curious about an alternative approach. It makes sense that we need to reduce the number of queries, and moving smaller data
into the messageStore
accomplishes that. Have we considered adding more indexes to the message store? Currently we store the data blob with messageCid
as the only index. If we added a list of indexes
to dataStore.put
, we could dataStore.query
on the same filters that we use for messageStore.query
. This would take a more significant refactor of data-store-level
to support, so I'm comfortable with merging this PR for now if we need to solve this n+1 query issue today.
Part of why I'm suggesting an alternate approach is that storing data in the message store just feels weird to me. While reviewing I found it confusing that the messageStore.query
or messageStore.get
would returns RecordsWriteWithOptionalEncodedData
instead of the actual message.
@diehuxx I agree that it might be a bit confusing that RecordsWriteMessage can optionally have encodedData as an internal mechanism. We should definitely have a more in-depth discussion at some point about this. I added some comments in addition to your suggestions to hopefully make things a little clearer in certain places when reading the code. |
201c7b2
to
be49e0a
Compare
…shold In this commit I've added utilized the existing RecordsWriteMessageWithOptionalEncodedData class to leverage writing data from RecordsWrite messages directly into the MessageStore when the data is below the 50k byte threshold. For now I'm still storing in DataStore, but that will be removed in subsequent commits. I had to take care and remove `encodedData` from messages when computing the CID and made sure to use the Message.getCid() helper for computing CIDs for message storage. Additionally, I've made sure to prune the original RecordsWrite of any encodedData when cleanup happens.
In this commit I've built on the previous one by not storing data larger than the treshold within the DataStore and keepign it only embedded within the MessageStore. I've imicked the sareguards and errors from putData into processEncodedData taking into account encodedData. I've also implemented tests for various data sizes in certain places where I noticed it may matter. I'm missing some additional tests that I will write in subsequent commits.
…etween putData and processEncodedData
be49e0a
to
3d700c7
Compare
Updating the DWN SDK to the latest version. This Includes: - `encodedData` stored within the `MessageStore` below a certain threshold - `MessageStore` interface now accepts multiple Filters to create OR conditions between filters. - 'MessageStore' interface accepts sorting and pagination inputs - If sorting by `datePublished`, add the property `published` set to true within the filter. Relevent `dwn-sdk-js` PRs: TBD54566975/dwn-sdk-js#478 TBD54566975/dwn-sdk-js#456 * upgrade kysley for tuple support * upgrade sdk and queries, first pass -- tests fail * sorting/pagination * store encodedData when available, sort and paginate according to messageCid cusror * added more comments * added comment about removing encodedData from message before encoding for storage * bump version * new release version * npm i after version bump
Data below a certain threshold should not be stored in
DataStore
but instead directly inMessageStore
as anencodedData
property.This was brought up as a potential solution for: #453
The way you deal with RecordsWrite/RecordsRead/RecordsQuery stays the same, this only affects the storage of messages.
Prior to this change any data below 10K would be embedded within the RecordsQuery response as
encodedData
, this will now be bumped to the 50k limit.I opened this as a Draft PR, I still have some tests to do and would like to review it myself some more.
Any input/feedback would be appreciated.