Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Remove duplication of block heads on delete #3096

Merged
merged 10 commits into from
Oct 3, 2024

Conversation

AndrewSisley
Copy link
Contributor

@AndrewSisley AndrewSisley commented Oct 2, 2024

Relevant issue(s)

Resolves #3085 #3089

Documents #3056 #3086 #3087 (I'm going to close these on merge, no need to have them littering the backlog)

Description

Removes the duplication of head links from delete blocks.

PR also includes the following to save the hassle of multiple test-cid updates:

  • Removes fieldName from composite block deltas
  • Removes the magic _head link name, and extracts head links to a new, optional prop
  • Documents the reasons for duplicating various bits of data in the blockstore blocks as discussed in standup

With the actions defined in TestQueryCommitsWithFieldIDFieldWithUpdate, create block size has been reduced by 4%, and update block size by 7% - this will vary a lot depending on what fields are being updated though, the test chosen to calc was just the first test I found that created one small doc, and updated a single field.

I recommend reviewing commit by commit. The test-cid changes have been pulled out to their own commit.

@AndrewSisley AndrewSisley added bug Something isn't working area/db-system Related to the core system related components of the DB refactor This issue specific to or requires *notable* refactoring of existing codebases and components labels Oct 2, 2024
@AndrewSisley AndrewSisley added this to the DefraDB v0.14 milestone Oct 2, 2024
@AndrewSisley AndrewSisley requested a review from a team October 2, 2024 19:13
@AndrewSisley AndrewSisley self-assigned this Oct 2, 2024
Note how the most recent commit has two heads!  I'm too lazy to bother including the cids, but both links are identical and point to the create composite.
They are added later in MerkleClock.AddDelta - the removed code was just doubling them up.
Now we no longer need to store const string values in the blocks
Copy link

codecov bot commented Oct 2, 2024

Codecov Report

Attention: Patch coverage is 83.33333% with 10 lines in your changes missing coverage. Please review.

Project coverage is 79.79%. Comparing base (e59f6d9) to head (ab66518).
Report is 1 commits behind head on develop.

Files with missing lines Patch % Lines
internal/planner/commit.go 72.73% 6 Missing ⚠️
internal/db/merge.go 63.64% 2 Missing and 2 partials ⚠️
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #3096      +/-   ##
===========================================
+ Coverage    79.71%   79.79%   +0.08%     
===========================================
  Files          351      351              
  Lines        27358    27335      -23     
===========================================
+ Hits         21806    21810       +4     
+ Misses        4001     3984      -17     
+ Partials      1551     1541      -10     
Flag Coverage Δ
all-tests 79.79% <83.33%> (+0.08%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
internal/core/block/block.go 87.74% <100.00%> (-2.92%) ⬇️
internal/core/crdt/base.go 79.17% <ø> (ø)
internal/core/crdt/composite.go 75.00% <100.00%> (-0.76%) ⬇️
internal/core/crdt/ipld_union.go 85.15% <ø> (+1.49%) ⬆️
internal/db/collection.go 73.62% <ø> (-0.05%) ⬇️
internal/db/collection_delete.go 36.84% <100.00%> (-5.02%) ⬇️
internal/db/fetcher/versioned.go 72.73% <100.00%> (+7.19%) ⬆️
internal/merkle/clock/clock.go 67.91% <100.00%> (+0.28%) ⬆️
internal/merkle/crdt/composite.go 83.33% <ø> (-0.67%) ⬇️
internal/merkle/crdt/merklecrdt.go 83.87% <ø> (-0.50%) ⬇️
... and 3 more

... and 7 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e59f6d9...ab66518. Read the comment docs.

Copy link
Contributor

@islamaliev islamaliev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, Andy. A huge thanks for good comments you left for block fields. Very helpful.

testUtils "github.com/sourcenetwork/defradb/tests/integration"
)

func TestQueryCommits_WithDelete(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: the test name is not quite descriptive. Had to read the test's body in order to understand what it tests.
Wouldn't something like TestQueryCommits_AfterDocDeletion_ShouldStillFetch fit better?

Copy link
Contributor Author

@AndrewSisley AndrewSisley Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean, thanks Islam :)

  • Rename test

Copy link
Member

@shahzadlone shahzadlone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the documentation and the removal of C haha LGTM

Comment on lines +86 to +89
// This field currently serves no purpose and is duplicating data already held on the target
// block. However we want to have this long term to enable some fancy P2P magic to allow users
// to configure the collection to only sync particular fields using
// [GraphSync](https://github.com/ipfs/go-graphsync) which will need to make use of this property.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: You could just say that this field is needed for graphSync and sync optimisation instead of (not sure this was you intention but this is what I get from this) showing through the comment that you are annoyed that the field is still there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No intended annoyance in the wording, I do feel that it is important to note that the current lack of purpose is known and somewhat deliberate though otherwise people (including future me) will assume the lack of purpose is accidental and try and remove it,

@AndrewSisley AndrewSisley merged commit 4e5470c into sourcenetwork:develop Oct 3, 2024
42 of 43 checks passed
@AndrewSisley AndrewSisley deleted the 3085-block-dedup branch October 3, 2024 13:26
@AndrewSisley AndrewSisley linked an issue Oct 3, 2024 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/db-system Related to the core system related components of the DB bug Something isn't working refactor This issue specific to or requires *notable* refactoring of existing codebases and components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FieldName is held on composite and field level blocks Block store duplicates lots of data
4 participants