-
Notifications
You must be signed in to change notification settings - Fork 344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat!: sharded key-value store for fix-length blobs #2685
Conversation
Kudos, SonarCloud Quality Gate passed!
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 3 files at r2, all commit messages.
Reviewable status: 2 of 3 files reviewed, 5 unresolved discussions (waiting on @acud and @mrekucci)
pkg/sharky/shard.go, line 13 at r2 (raw file):
) // location models the location <shard, offset, length> of a chunk
Comment should start with Location
.
pkg/sharky/shard.go, line 36 at r2 (raw file):
index uint8 // index of the shard limit int64 // max number of items in the shard fh *os.File // the file handle the shard is writing data to
I'd suggest using some more meaningful names instead of the fh
(maybe file
) and ffh
(maybe fileFree
).
pkg/sharky/shards.go, line 31 at r2 (raw file):
ErrTooLong = errors.New("data too long") ErrCapacityReached = errors.New("capacity reached")
This variable is unused.
pkg/sharky/shards.go, line 34 at r2 (raw file):
) // models the sharded chunkdb
The doc comment should start with Shards
.
pkg/sharky/shards.go, line 66 at r2 (raw file):
func (s *Shards) Close() error { close(s.quit) errs := []string{}
I'd suggest using multierror
package we're already using.
pkg/sharky/shards.go, line 109 at r2 (raw file):
wg := &sync.WaitGroup{} if ffi.Size() > 0 { frees, err := ioutil.ReadAll(ffh)
The ioutil
is deprecated, use io
instead: https://go.dev/doc/go1.16
pkg/sharky/shards.go, line 188 at r2 (raw file):
} func (s *Shards) Release(ctx context.Context, loc Location) {
If the context.Context
it's not used it shouldn't be passed as param.
pkg/sharky/sharky_test.go, line 121 at r2 (raw file):
} // check location and data consisency
consisency
-> consistency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r2, all commit messages.
Reviewable status: all files reviewed, 16 unresolved discussions (waiting on @acud and @mrekucci)
pkg/sharky/shard.go, line 36 at r2 (raw file):
Previously, mrekucci wrote…
I'd suggest using some more meaningful names instead of the
fh
(maybefile
) andffh
(maybefileFree
).
also possible to abstract this as a ReaderAt/WriterAt
pkg/sharky/shard.go, line 43 at r2 (raw file):
// forever loop processing func (sh *shard) offer(size int64) {
the free slots could be passed here as args and managed as a FIFO queue on disc
pkg/sharky/shard.go, line 72 at r2 (raw file):
case <-sh.quit: // remember free offset in limbo sh.freed <- offset
double allocation issue here
pkg/sharky/shard.go, line 92 at r2 (raw file):
// this condition checks if an offset is in limbo (popped but not used for write op) if writeOps != nil { sh.freed <- offset
double allocation issue here
pkg/sharky/shard.go, line 118 at r2 (raw file):
// again put back offset in limbo if writeOps != nil { sh.freed <- offset
double allocation issue here
pkg/sharky/shard.go, line 134 at r2 (raw file):
free := []int64{} for offset := range sh.freed { free = append(free, offset)
at the places marked with "double allocation issue here", a last-position location reference is remembered as free, it saved here and offered the next time we start the node
solution: check if offset==size if yes
- then not put in freed channel (caveat: need to apply at 5 places)
- not saved (caveat: here size is not available)
- filter at bootup when free slots file is read (caveat: the slot is sstored in the free slots list, even though we know it cannot win)
pkg/sharky/shard.go, line 147 at r2 (raw file):
} if _, err := sh.ffh.Write(frees); err != nil { return err
we should read the whole file in at once.
pkg/sharky/shard.go, line 170 at r2 (raw file):
if err != nil { go func() { sh.freed <- offset
double allocation issue here
pkg/sharky/shards.go, line 95 at r2 (raw file):
return nil, err } size := fi.Size() / DataSize
@acud here size is initialised to a value based on how many blobs actually fit in the file.
pkg/sharky/shards.go, line 120 at r2 (raw file):
for _, offset := range free { offset := offset if offset/DataSize >= size {
this is actually the solution 3 to the double allocation issue here. So it was false alarm. yet this should be commented here. otherwise not clear why this is here.
pkg/sharky/shards.go, line 124 at r2 (raw file):
} wg.Add(1) go func() {
we should probably just pass the variable free
to offer
A function
pkg/sharky/shards.go, line 143 at r2 (raw file):
} sh.wg.Add(2) // initialisation requires so that s.wg.Wait() does not hold prematurely go sh.offer(size)
@acud so here it is. after unclean shutdown, a shard's slots will count as taken and offer
will make sure to generate slots from size upto limit whwn needed,
as long as there is no deletion during the migration only the one free slot in limbo can be lost.
Looking forward to building this into my testnet nodes! Any reason not to do this once it merges? Anything to watch out for specifically? |
060da23
to
26f14b7
Compare
17d88a2
to
5bce3c8
Compare
6f83887
to
8999251
Compare
b16fb40
to
d1be913
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 17 of 29 files at r13, 1 of 2 files at r14, 4 of 8 files at r15, 1 of 1 files at r16, 2 of 7 files at r17.
Reviewable status: 25 of 39 files reviewed, 18 unresolved discussions (waiting on @acud, @aloknerurkar, @mrekucci, and @zelig)
pkg/sharky/shard_slots_test.go, line 1 at r17 (raw file):
package sharky
code license missing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some work still needed on this one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 7 files at r11, 15 of 29 files at r13, 1 of 2 files at r14, 3 of 8 files at r15, 1 of 1 files at r16, 1 of 7 files at r17, 17 of 17 files at r18, all commit messages.
Reviewable status: all files reviewed, 49 unresolved discussions (waiting on @acud, @aloknerurkar, @mrekucci, and @zelig)
pkg/localstore/disaster_recovery.go, line 28 at r18 (raw file):
// first define the index instance headerSize := 16 + postage.StampSize
This can be a constant.
pkg/localstore/disaster_recovery_test.go, line 16 at r18 (raw file):
func TestRecovery(t *testing.T) { chunkCount := 150
This can be a constant.
pkg/localstore/export.go, line 60 at r18 (raw file):
} ctx := context.Background()
Since the only usage of the context is in db.sharky.Read
I'd suggest moving it directly there without creating a new variable. Also consider using context.TODO()
instead,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very nice. minor comments
pkg/sharky/store.go
Outdated
s.metrics.TotalReleaseCalls.Inc() | ||
if err == nil { | ||
shard := strconv.Itoa(int(sh.index)) | ||
s.metrics.CurrentShardSize.WithLabelValues(shard).Sub(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this should hold the size of shard in actual stored chunks... but it might be also useful to somehow keep track of the actual size of the shard on the disk (i'm not sure if this can be done in some way which is not too ugly since you might need to leak stuff from the slots component). this would allow us to know for example if there's any leakage of the shard (i.e. shard keeps growing while there are free slots), etc
6516040
to
710346e
Compare
1988685
to
f150756
Compare
/approve |
Checklist
Description
This change is![Reviewable](https://camo.githubusercontent.com/1541c4039185914e83657d3683ec25920c672c6c5c7ab4240ee7bff601adec0b/68747470733a2f2f72657669657761626c652e696f2f7265766965775f627574746f6e2e737667)