localstore: reduce critical section size on gc #1435
Conversation
Very nice, thanks!
@@ -109,6 +117,7 @@ func (db *DB) collectGarbage() (collectedCount uint64, done bool, err error) {
	done = true
	first := true
	start := time.Now()
	candidates := make([]shed.Item, 0)
the candidates slice should be package-level, preallocated with gcBatchSize length
I agree this will save on allocations, but then the cleanup would be ugly: since we'd want to get rid of entries from the previous iteration, actually saving the allocation would mean iterating over every item in the slice and setting it to nil, otherwise we'd have dangling shed.Items in memory.
cc @janos
:) just remember how many you fill in, no need to go through it
It could work nicely; just make sure that every time, before the slice is resliced to zero length, it is iterated up to its length and every element is set to the "zero" Item. The zero Item is the item with all slices nil (important for memory usage) and integers 0 (not important for memory usage). That way, no dangling Items will exist. Basically, this would be a type []shed.Item with a reset method and a contains method which would replace swarm.Address.MemberOf, as per @zelig's other comment, which I also agree with. Somehow this container naturally calls for a new type, even for future improvements, as the data structure may change, as suggested in the description.
so.... not sure if i understand but is this a blocker?
> :) just remember how many you fill in, no need to go through it

This is incorrect, since you will have dangling items on the last gc run, after which the gc target has been met, leaving dangling shed.Items in the slice. We therefore have to iterate over the whole slice every time we finish GCing (either in the last iteration, or on every iteration) so that those items are no longer referenced, making them available for garbage collection.
@acud For the first iteration it is also fine as it is, for me.
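For reference, a minimal sketch of the container type described above. The type name gcCandidates, the import paths, and the gcBatchSize mention are illustrative assumptions, not code from this PR:

package localstore

import (
	"github.com/ethersphere/bee/pkg/shed"
	"github.com/ethersphere/bee/pkg/swarm"
)

// gcCandidates is a hypothetical reusable container that could be
// preallocated once (e.g. make(gcCandidates, 0, gcBatchSize)) so the gc
// loop does not allocate per round.
type gcCandidates []shed.Item

// reset zeroes every filled element before reslicing to zero length, so no
// dangling shed.Item byte slices remain referenced between gc rounds.
func (c *gcCandidates) reset() {
	for i := range *c {
		(*c)[i] = shed.Item{}
	}
	*c = (*c)[:0]
}

// contains reports whether addr is among the gathered candidates and could
// replace the swarm.Address.MemberOf lookup.
func (c gcCandidates) contains(addr swarm.Address) bool {
	for _, item := range c {
		if addr.Equal(swarm.NewAddress(item.Address)) {
			return true
		}
	}
	return false
}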
// get rid of dirty entries
for _, item := range candidates {
	if swarm.NewAddress(item.Address).MemberOf(db.dirtyAddresses) {
the other way round, container is usually the receiver struct
but then we'd need to introduce a new type alias for []swarm.Address, which is something I'd like to avoid
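For illustration, a hedged sketch of the named type that direction would require; the addresses type, the contains method, and the import path are hypothetical, not part of the PR:

package localstore

import "github.com/ethersphere/bee/pkg/swarm"

// addresses is a hypothetical named slice type. With it, the dirty check
// would read db.dirtyAddresses.contains(swarm.NewAddress(item.Address))
// instead of swarm.NewAddress(item.Address).MemberOf(db.dirtyAddresses).
type addresses []swarm.Address

// contains reports whether addr is in the slice.
func (a addresses) contains(addr swarm.Address) bool {
	for _, x := range a {
		if addr.Equal(x) {
			return true
		}
	}
	return false
}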
addrs := make([]swarm.Address, 0)
mtx := new(sync.Mutex)
online := make(chan struct{})
go func() {
why do we need a different goroutine? you're waiting for it to terminate anyway
I'm waiting for it to be scheduled. There's no defer call on the close. Otherwise we might get test flakes.
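A condensed sketch of the pattern being referred to; the waitScheduled wrapper is a hypothetical name used here only for illustration, the real test inlines this:

// waitScheduled runs work in its own goroutine and returns once that
// goroutine has actually started. close(online) is the first statement
// rather than a deferred call, so receiving from online signals that the
// goroutine was scheduled, not that it has finished.
func waitScheduled(work func()) {
	online := make(chan struct{})
	go func() {
		close(online)
		work()
	}()
	<-online
}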
	t.Error(err)
}
i++
time.Sleep(100 * time.Millisecond)
why sleep here?
because Gets update the gc, and this happens in a different goroutine. So we sleep to guarantee that the update happens, resulting in dirtyAddresses being updated.
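For context, a hedged sketch of what the goroutine body being discussed might look like; the Get mode, the loop shape, and the variable names are assumptions, not the test's actual code:

// each Get schedules a gc-index update in a separate goroutine; sleeping
// gives that update time to land in db.dirtyAddresses before continuing
for _, a := range addrs {
	_, err := db.Get(context.Background(), storage.ModeGetRequest, a)
	if err != nil {
		t.Error(err)
	}
	time.Sleep(100 * time.Millisecond)
}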
for i := 0; i < chunkCount; i++ {
	ch := generateTestRandomChunk()

	_, err := db.Put(context.Background(), storage.ModePutUpload, ch)
	if err != nil {
		t.Fatal(err)
	}

	err = db.Set(context.Background(), storage.ModeSetSync, ch.Address())
	if err != nil {
		t.Fatal(err)
	}
	mtx.Lock()
	addrs = append(addrs, ch.Address())
	mtx.Unlock()
}
chunks := generateTestRandomChunks(chunkCount)
_, err := db.Put(context.Background(), storage.ModePutSync, chunks...)
if err != nil {
	t.Fatal(err)
}
but then we'd still need a for loop over the individual chunk addresses so that we could Set them and add them to the addrs slice
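A hedged sketch of what that would look like, combining the suggested batch Put with the loop that is still needed; the helper names are taken from the surrounding test, the rest is assumed:

chunks := generateTestRandomChunks(chunkCount)
_, err := db.Put(context.Background(), storage.ModePutUpload, chunks...)
if err != nil {
	t.Fatal(err)
}
// a loop is still required to mark each chunk as synced and to collect
// its address for later use
for _, ch := range chunks {
	if err := db.Set(context.Background(), storage.ModeSetSync, ch.Address()); err != nil {
		t.Fatal(err)
	}
	mtx.Lock()
	addrs = append(addrs, ch.Address())
	mtx.Unlock()
}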
This PR changes the way we do GC in localstore. The previous approach locked the database for quite a long time, causing other protocols to slow down significantly. This changeset introduces a flag that indicates whether gc is running; while it is, chunk addresses that change in the database are logged to a slice. Once the gc collects enough candidates, it checks the slice for dirty candidates, leaves the dirty ones out of the GC round, and then commits the result to leveldb.
Tested on our staging cluster, this shows an order of magnitude improvement in the time spent under lock while garbage collection is running.
The slice of dirty addresses can later be optimized into a cuckoo or bloom filter.
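To make the description concrete, here is a rough, self-contained sketch of the idea using simplified stand-in types; the names store, markDirty, and collectGarbage are illustrative and not the identifiers used in the PR:

package gcsketch

import (
	"bytes"
	"sync"
)

type addr []byte

// store holds the two pieces of state the description mentions: a flag that
// marks a running gc round and the addresses touched while it runs.
type store struct {
	mu             sync.Mutex
	gcRunning      bool
	dirtyAddresses []addr
}

// markDirty is called from write/read paths: while gc is running, remember
// which addresses changed so the current round can skip them.
func (s *store) markDirty(a addr) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.gcRunning {
		s.dirtyAddresses = append(s.dirtyAddresses, a)
	}
}

// collectGarbage gathers candidates outside the lock and holds the lock
// only for the final dirty-filtering and commit, which is the reduced
// critical section.
func (s *store) collectGarbage(gather func() []addr, commit func([]addr)) {
	s.mu.Lock()
	s.gcRunning = true
	s.mu.Unlock()

	// long, unlocked phase: iterate the gc index and gather candidates
	candidates := gather()

	// short critical section: drop candidates touched meanwhile, commit
	// the rest (e.g. as a leveldb batch), then reset the dirty state
	s.mu.Lock()
	defer s.mu.Unlock()
	clean := candidates[:0]
	for _, c := range candidates {
		if !isDirty(s.dirtyAddresses, c) {
			clean = append(clean, c)
		}
	}
	commit(clean)
	s.gcRunning = false
	s.dirtyAddresses = nil
}

func isDirty(list []addr, a addr) bool {
	for _, x := range list {
		if bytes.Equal(x, a) {
			return true
		}
	}
	return false
}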