Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data below a threshold should not be stored in DataStore. #456

Merged
merged 15 commits into from
Aug 15, 2023

Conversation

LiranCohen
Copy link
Member

Data below a certain threshold should not be stored in DataStore but instead directly in MessageStore as an encodedData property.

This was brought up as a potential solution for: #453

The way you deal with RecordsWrite/RecordsRead/RecordsQuery stays the same, this only affects the storage of messages.

Prior to this change any data below 10K would be embedded within the RecordsQuery response as encodedData, this will now be bumped to the 50k limit.

I opened this as a Draft PR, I still have some tests to do and would like to review it myself some more.

Any input/feedback would be appreciated.

@LiranCohen LiranCohen marked this pull request as ready for review July 28, 2023 17:43
Copy link
Member

@thehenrytsai thehenrytsai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 🙏 🎸

Thanks @LiranCohen! Pretty much there, with a few comments.

I have not looked at the tests thoroughly because my comments impacts the tests written!

src/handlers/records-write.ts Show resolved Hide resolved
src/handlers/messages-get.ts Outdated Show resolved Hide resolved
src/store/message-store-level.ts Outdated Show resolved Hide resolved
src/store/storage-controller.ts Outdated Show resolved Hide resolved
src/handlers/records-write.ts Outdated Show resolved Hide resolved
src/handlers/records-write.ts Outdated Show resolved Hide resolved
@LiranCohen LiranCohen force-pushed the lirancohen/453-reduce-db-queries branch from f9a37c4 to 5398907 Compare August 1, 2023 22:07
@LiranCohen LiranCohen force-pushed the lirancohen/453-reduce-db-queries branch from 5398907 to 8984696 Compare August 3, 2023 20:25
Copy link

@diehuxx diehuxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed the initial discussion around this PR, but I'm curious about an alternative approach. It makes sense that we need to reduce the number of queries, and moving smaller data into the messageStore accomplishes that. Have we considered adding more indexes to the message store? Currently we store the data blob with messageCid as the only index. If we added a list of indexes to dataStore.put, we could dataStore.query on the same filters that we use for messageStore.query. This would take a more significant refactor of data-store-level to support, so I'm comfortable with merging this PR for now if we need to solve this n+1 query issue today.

Part of why I'm suggesting an alternate approach is that storing data in the message store just feels weird to me. While reviewing I found it confusing that the messageStore.query or messageStore.get would returns RecordsWriteWithOptionalEncodedData instead of the actual message.

@LiranCohen
Copy link
Member Author

@diehuxx I agree that it might be a bit confusing that RecordsWriteMessage can optionally have encodedData as an internal mechanism.

We should definitely have a more in-depth discussion at some point about this.

I added some comments in addition to your suggestions to hopefully make things a little clearer in certain places when reading the code.

diehuxx
diehuxx previously approved these changes Aug 4, 2023
diehuxx
diehuxx previously approved these changes Aug 4, 2023
…shold

In this commit I've added utilized the existing RecordsWriteMessageWithOptionalEncodedData class
to leverage writing data from RecordsWrite messages directly into the MessageStore when the data
is below the 50k byte threshold.

For now I'm still storing in DataStore, but that will be removed in subsequent commits.

I had to take care and remove `encodedData` from messages when computing the CID and made sure to use the
Message.getCid() helper for computing CIDs for message storage.

Additionally, I've made sure to prune the original RecordsWrite of any encodedData when cleanup happens.
In this commit I've built on the previous one by not storing data larger than the treshold
within the DataStore and keepign it only embedded within the MessageStore.

I've imicked the sareguards and errors from putData into processEncodedData taking into account encodedData.
I've also implemented tests for various data sizes in certain places where I noticed it may matter.

I'm missing some additional tests that I will write in subsequent commits.
@LiranCohen LiranCohen merged commit b2927e6 into main Aug 15, 2023
3 checks passed
@LiranCohen LiranCohen deleted the lirancohen/453-reduce-db-queries branch August 15, 2023 22:08
diehuxx pushed a commit that referenced this pull request Aug 18, 2023
* main:
  Update tenant-gate.ts (#475)
  Data below a threshold should not be stored in DataStore. (#456)
LiranCohen added a commit to TBD54566975/dwn-sql-store that referenced this pull request Sep 26, 2023
Updating the DWN SDK to the latest version.

This Includes:
 - `encodedData` stored within the `MessageStore` below a certain threshold
 - `MessageStore` interface now accepts multiple Filters to create OR conditions between filters.
 - 'MessageStore' interface accepts sorting and pagination inputs
 - If sorting by `datePublished`, add the property `published` set to true within the filter.
 
 Relevent `dwn-sdk-js` PRs:
 TBD54566975/dwn-sdk-js#478
 TBD54566975/dwn-sdk-js#456

* upgrade kysley for tuple support

* upgrade sdk and queries, first pass -- tests fail

* sorting/pagination

* store encodedData when available, sort and paginate according to messageCid cusror

* added more comments

* added comment about removing encodedData from message before encoding for storage

* bump version

* new release version

* npm i after version bump
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants