refactor: localstore transaction #4626

Merged
merged 57 commits into from
May 9, 2024

Conversation

Member

@istae istae commented Mar 27, 2024

Checklist

  • I have read the coding guide.
  • My change requires a documentation update, and I have done it.
  • I have added tests to cover my changes.
  • I have filled out the description and linked the related issues.

Description

An extensive cleanup of the db transaction.
Every localstore write operation now utilizes a transaction. Calls to the chunkstore are also part of the transaction.
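As a rough illustration of what "every write goes through a transaction, including chunkstore calls" can mean, here is a minimal in-memory sketch. The names `Store`, `Tx`, `Run`, `PutIndex`, and `PutChunk` are hypothetical and are not taken from the bee codebase; the sketch only shows the all-or-nothing behaviour, not the actual localstore implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a hypothetical stand-in for localstore: one map plays the role of
// the index store, the other the role of the chunkstore.
type Store struct {
	index  map[string]string
	chunks map[string][]byte
}

func NewStore() *Store {
	return &Store{index: map[string]string{}, chunks: map[string][]byte{}}
}

// Tx stages index and chunk writes until the transaction commits.
type Tx struct {
	index  map[string]string
	chunks map[string][]byte
}

func (t *Tx) PutIndex(key, value string)        { t.index[key] = value }
func (t *Tx) PutChunk(addr string, data []byte) { t.chunks[addr] = data }

// Run executes fn inside a transaction. Staged writes are applied to the
// store only if fn returns nil; on error nothing is written.
func (s *Store) Run(fn func(tx *Tx) error) error {
	tx := &Tx{index: map[string]string{}, chunks: map[string][]byte{}}
	if err := fn(tx); err != nil {
		return err // discard everything that was staged
	}
	for k, v := range tx.index {
		s.index[k] = v
	}
	for k, v := range tx.chunks {
		s.chunks[k] = v
	}
	return nil
}

var errSimulated = errors.New("simulated failure")

func main() {
	s := NewStore()

	// Index entry and chunk are written together.
	_ = s.Run(func(tx *Tx) error {
		tx.PutChunk("addr-a", []byte("payload"))
		tx.PutIndex("addr-a", "batch-1")
		return nil
	})

	// A failing transaction leaves no partial chunk behind.
	err := s.Run(func(tx *Tx) error {
		tx.PutChunk("addr-b", []byte("payload"))
		return errSimulated
	})

	fmt.Println(len(s.chunks), len(s.index), err)
}
```

The point of the design is that an index entry and its chunk data either both land or both do not, so a failed operation cannot leave an orphaned chunk.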

Also tackled: evict just enough chunks of a batch to fall below the reserve capacity.
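The "just enough" eviction can be pictured with a small calculation. This is a sketch under assumptions: `evictCount` and its parameters are hypothetical names, and it treats "no longer exceeding capacity" as the target, which may differ from the exact condition used in the PR.

```go
package main

import "fmt"

// evictCount returns how many chunks of a batch to evict so that the reserve
// no longer exceeds its capacity. It never returns more than the batch holds.
func evictCount(reserveSize, capacity, batchChunks int) int {
	over := reserveSize - capacity
	if over <= 0 {
		return 0 // reserve is already within capacity
	}
	if over > batchChunks {
		return batchChunks // the whole batch is not enough; evict all of it
	}
	return over // evict only the excess, keep the rest of the batch
}

func main() {
	fmt.Println(evictCount(105, 100, 20))
	fmt.Println(evictCount(90, 100, 20))
	fmt.Println(evictCount(150, 100, 20))
}
```

Compared with evicting a whole batch at once, this keeps as many chunks of the batch in the reserve as the capacity allows.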

TODO:
revert reserve size change
revert reserve wake up duration

Open API Spec Version Changes (if applicable)

Motivation and Context (Optional)

Related Issue (Optional)

closes #4341 #4538 #4636 #4633

Screenshots (if appropriate):

@istae istae force-pushed the localstore-transaction branch 2 times, most recently from db4ead4 to c85ea95 April 15, 2024 11:45
@istae istae marked this pull request as ready for review April 15, 2024 15:43
Member

@janos janos left a comment

The changeset in this PR is quite large: +3481 −5890 lines across 117 files. I am not able to validate all the changes for correctness, as I would need more context on them. Some of the changes do not look related to localstore transactions to me. I think that an interactive session on this PR would be more efficient than submitting comments. Esad, would you consider doing an interactive session?

cmd/bee/cmd/db.go

    if err != nil {
        logger.Error(err, "getting sleep value failed")
    }
    defer func() { time.Sleep(d) }()
Contributor

What's the advantage of sleeping after the operation successfully completes?

Member Author

This is an infra requirement, to prevent the pod from restarting when it quits.

Contributor

We should add a comment then.
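One way the requested comment might read, shown on a self-contained sketch. The helper `runWithExitDelay` is hypothetical and only wraps the `defer func() { time.Sleep(d) }()` pattern from the snippet above; the wording of the comment is a suggestion, not what was merged.

```go
package main

import (
	"fmt"
	"time"
)

// runWithExitDelay runs op and then waits for d before returning.
func runWithExitDelay(d time.Duration, op func() error) error {
	// Infra requirement: keep the process alive for d after the operation
	// finishes, so that the pod is not restarted as soon as the command quits.
	defer func() { time.Sleep(d) }()
	return op()
}

func main() {
	start := time.Now()
	err := runWithExitDelay(50*time.Millisecond, func() error { return nil })
	fmt.Println(err, time.Since(start) >= 50*time.Millisecond)
}
```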

pkg/storer/storer.go (resolved)
pkg/api/stewardship.go (resolved)
pkg/file/joiner/joiner_test.go (resolved)
pkg/node/statestore.go (outdated, resolved)
pkg/storer/internal/transaction/transaction.go (outdated, resolved)
pkg/node/node.go (outdated, resolved)
pkg/puller/puller.go (outdated, resolved)
pkg/storer/internal/cache/cache.go (resolved)
Member

@janos janos left a comment

Nice approach on the transaction handling. It is an improvement, for sure.

The PR is rather large and needs a lot of attention, and with so many changes it is somewhat harder to see whether a change is related to the transaction or is a general code improvement. It would be good to separate the code improvements from the new transaction into another PR, but that would require a lot of additional work. It is crucial to do extensive correctness and load testing on integration.

pkg/node/statestore.go (outdated, resolved)
pkg/puller/puller_test.go (resolved)
@@ -371,13 +371,10 @@ func (s *Service) retrieveChunk(ctx context.Context, quit chan struct{}, chunkAd

func (s *Service) prepareCredit(ctx context.Context, peer, chunk swarm.Address, origin bool) (accounting.Action, error) {

creditCtx, cancel := context.WithTimeout(ctx, time.Second)
Member

Why is this timeout removed?

Member Author

It should be safe to wait as long as the context does not time out; the one-second limit is very arbitrary.
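The trade-off discussed here can be sketched as follows. `slowCredit`, `withFixedTimeout`, and `compareTimeouts` are hypothetical helpers, not bee functions: the first stands in for a credit-preparation call, the second mirrors the `context.WithTimeout(ctx, ...)` pattern in the diff, and the durations are made up for the demonstration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowCredit stands in for a call that needs `work` to finish, unless the
// context is cancelled or times out first.
func slowCredit(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// withFixedTimeout wraps the call in its own, shorter deadline.
func withFixedTimeout(ctx context.Context, limit, work time.Duration) error {
	cctx, cancel := context.WithTimeout(ctx, limit)
	defer cancel()
	return slowCredit(cctx, work)
}

// compareTimeouts runs the same 50ms call twice under a 2s parent context:
// once with an extra 10ms limit, once bounded by the parent alone.
func compareTimeouts() (fixedErr, parentErr error) {
	parent, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	fixedErr = withFixedTimeout(parent, 10*time.Millisecond, 50*time.Millisecond)
	parentErr = slowCredit(parent, 50*time.Millisecond)
	return fixedErr, parentErr
}

func main() {
	fixedErr, parentErr := compareTimeouts()
	fmt.Println(errors.Is(fixedErr, context.DeadlineExceeded), parentErr)
}
```

With the extra limit, a call that would have completed within the caller's deadline fails early; without it, the call is bounded only by the caller's context.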

pkg/pushsync/pushsync.go (resolved)
pkg/storer/internal/cache/cache.go (resolved)
pkg/storer/internal/cache/cache_test.go (resolved)
pkg/storer/internal/chunkstore/chunkstore.go (outdated, resolved)
pkg/storer/internal/reserve/items.go (outdated, resolved)
pkg/storer/internal/upload/uploadstore.go (outdated, resolved)
pkg/storer/internal/transaction/mem.go (outdated, resolved)
Member

@janos janos left a comment

Thanks for addressing the comments.

@istae istae merged commit 52c2475 into master May 9, 2024
14 checks passed
@istae istae deleted the localstore-transaction branch May 9, 2024 16:46
Successfully merging this pull request may close these issues.

replace storer transactions with leveldb batches
3 participants