store/bucket: wait until chunk loading ends in Close() #6582

Merged (1 commit) on Aug 16, 2023

Conversation

GiedriusS
Member

@GiedriusS GiedriusS commented Aug 4, 2023

The chunk reader needs to wait in Close() until chunk loading ends; otherwise appending to r.chunkBytes races with reading from it. Close() only cancels the context, and populateChunk() has no race-free way to check that context. So if the context is canceled between fetching the data and populating the chunks, there is a race.
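
As a rough illustration of the approach described above, here is a minimal, self-contained sketch. The `reader` type, `newReader`, the simulated loading loop, and the integer return value of `Close()` are hypothetical stand-ins and not the Thanos implementation; only the three synchronization fields follow the names used in this PR. The sketch also assumes a single `load` call per reader, since the channel can be closed only once.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// reader is a hypothetical stand-in for the chunk reader. Only the three
// synchronization fields mirror the names used in the PR.
type reader struct {
	loadingChunksMtx  sync.Mutex
	loadingChunks     bool
	finishLoadingChks chan struct{}

	chunkBytes [][]byte // appended to while loading, released in Close()
}

func newReader() *reader {
	return &reader{finishLoadingChks: make(chan struct{})}
}

// load marks loading as started, then appends to chunkBytes in the background.
// Assumption: it is called at most once per reader.
func (r *reader) load(n int) {
	r.loadingChunksMtx.Lock()
	r.loadingChunks = true
	r.loadingChunksMtx.Unlock()

	go func() {
		defer func() {
			r.loadingChunksMtx.Lock()
			r.loadingChunks = false
			r.loadingChunksMtx.Unlock()
			close(r.finishLoadingChks) // wakes up a waiting Close()
		}()
		for i := 0; i < n; i++ {
			time.Sleep(time.Millisecond) // simulated fetch
			r.chunkBytes = append(r.chunkBytes, make([]byte, 16))
		}
	}()
}

// Close waits for an in-flight load before it touches chunkBytes and
// returns how many buffers it released.
func (r *reader) Close() int {
	r.loadingChunksMtx.Lock()
	loading := r.loadingChunks
	r.loadingChunksMtx.Unlock()
	if loading {
		<-r.finishLoadingChks
	}
	released := len(r.chunkBytes)
	r.chunkBytes = nil
	return released
}

func main() {
	r := newReader()
	r.load(5)
	fmt.Println("released chunks:", r.Close())
}
```

Without the wait in `Close()`, the read of `chunkBytes` could overlap with the appends in the background goroutine, which is the kind of race the description refers to.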

return errors.Wrap(err, "populate chunk")
}
r.stats.chunksTouched++
r.stats.ChunksTouchedSizeSum += units.Base2Bytes(int(chunkDataLen))

r.block.chunkPool.Put(nb)
Contributor

Hm, where did this go?

Member Author

The idea was to put everything into the slice that is Put() back during Close(), but it seems this refactoring was faulty. I removed it.

@GiedriusS GiedriusS force-pushed the wait_until_chunkloading_ends branch from 9bf6078 to 08da27e Compare August 9, 2023 08:14
@pull-request-size pull-request-size bot added size/XS and removed size/S labels Aug 9, 2023
@GiedriusS GiedriusS force-pushed the wait_until_chunkloading_ends branch from 08da27e to cfc6e39 Compare August 9, 2023 09:09
@GiedriusS GiedriusS marked this pull request as draft August 9, 2023 09:53
@GiedriusS GiedriusS force-pushed the wait_until_chunkloading_ends branch from cfc6e39 to 5273244 Compare August 9, 2023 10:51
@pull-request-size pull-request-size bot added size/S and removed size/XS labels Aug 9, 2023
@GiedriusS GiedriusS force-pushed the wait_until_chunkloading_ends branch 4 times, most recently from 1d0523c to e55687c Compare August 9, 2023 13:22
Chunk reader needs to wait until the chunk loading ends in Close()
because otherwise there will be a race between appending to r.chunkBytes
and reading from it.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
@GiedriusS GiedriusS force-pushed the wait_until_chunkloading_ends branch from e55687c to 6eb3eb7 Compare August 9, 2023 14:00
@GiedriusS GiedriusS marked this pull request as ready for review August 9, 2023 14:26
@GiedriusS GiedriusS requested a review from fpetkovski August 9, 2023 14:26
@douglascamata
Contributor

This looks like a good place and time to apply https://pkg.go.dev/sync#Cond?

Contributor

@douglascamata douglascamata left a comment

Leaving some suggestions on how I think sync.Cond could be used here. Feel free to decide whether to adopt.

Comment on lines +3164 to +3166
loadingChunksMtx sync.Mutex
loadingChunks bool
finishLoadingChks chan struct{}
Contributor

Suggested change
- loadingChunksMtx sync.Mutex
- loadingChunks bool
- finishLoadingChks chan struct{}
+ loadingChunksCond *sync.Cond
+ loadingChunks bool

Contributor

@fpetkovski fpetkovski Aug 16, 2023

I think sync.Cond is hard to use and understand, and I would prefer to have a simpler solution with a mutex or an atomic variable.

Contributor

@fpetkovski that's also a great idea. If we bump our minimum Go version high enough, we could use the generic version of atomics to easily swap a value of any type.

Contributor

But I guess the critical part here is "waiting for a condition to be fulfilled": lock released when chunks are finally loaded.

Member Author

An atomic variable won't work because we must wait until the chunk loading finishes. sync.Cond doesn't work because at the end of load() it will signal only once, whereas we need a permanent state of either on or off. As far as I can tell, with your suggestion Close() will hang forever in normal operation because it will never receive a signal.

I think the only way to simplify this is to get rid of the bool and create the channel under the lock. 🤔
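
One way to read this simplification is sketched below, under stated assumptions: the `reader` type, the `done` field name, and the simulated loading loop are hypothetical and not taken from the merged code. The channel is created under the lock when loading starts, a nil channel means no load was ever started, and a closed channel acts as the permanent "finished" state. The sketch assumes loads do not overlap.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// reader is a hypothetical stand-in for the chunk reader.
type reader struct {
	mtx  sync.Mutex
	done chan struct{} // nil: no load started; closed: load finished

	chunkBytes [][]byte
}

// load creates the channel under the lock and closes it when loading ends.
func (r *reader) load(n int) {
	done := make(chan struct{})
	r.mtx.Lock()
	r.done = done
	r.mtx.Unlock()

	go func() {
		defer close(done)
		for i := 0; i < n; i++ {
			time.Sleep(time.Millisecond) // simulated fetch
			r.chunkBytes = append(r.chunkBytes, make([]byte, 16))
		}
	}()
}

// Close waits on the channel if one exists. Receiving from a closed channel
// returns immediately, so a load that already finished does not block.
func (r *reader) Close() int {
	r.mtx.Lock()
	done := r.done
	r.mtx.Unlock()
	if done != nil {
		<-done
	}
	released := len(r.chunkBytes)
	r.chunkBytes = nil
	return released
}

func main() {
	r := &reader{}
	r.load(3)
	fmt.Println("released chunks:", r.Close())
}
```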

Contributor

@GiedriusS even with sync.Cond you will see that my suggestions keep the bool variable. The instance of sync.Cond only manages the pause/resume of execution when the bool variable (the condition) fails. You can see it in the suggestion below, where we don't even wait on the sync.Cond if the bool says chunks are loaded:

r.loadingChunksCond.L.Lock()
if r.loadingChunks {
    r.loadingChunksCond.Wait()
}
r.loadingChunksCond.L.Unlock()

Contributor

In a way, the sync.Cond is only abstracting the channel. We still need the condition's bool variable to be checked.
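
To make the sync.Cond variant concrete, here is a hedged sketch, not the code that was merged. The `reader` type, `newReader`, and the simulated loading loop are hypothetical. It differs from the snippets in this thread in two ways that follow the `sync` package documentation: `Wait()` is called in a `for` loop that rechecks the condition, and the flag is cleared under the condition's lock before `Broadcast()` is called. Because `Close()` checks the bool before waiting, it does not block when loading has already finished.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// reader is a hypothetical stand-in for the chunk reader.
type reader struct {
	loadingChunksCond *sync.Cond
	loadingChunks     bool // guarded by loadingChunksCond.L

	chunkBytes [][]byte
}

func newReader() *reader {
	return &reader{loadingChunksCond: sync.NewCond(&sync.Mutex{})}
}

// load sets the flag, then appends in the background and wakes all waiters.
func (r *reader) load(n int) {
	r.loadingChunksCond.L.Lock()
	r.loadingChunks = true
	r.loadingChunksCond.L.Unlock()

	go func() {
		defer func() {
			// Clear the condition under the lock, then wake every waiter.
			r.loadingChunksCond.L.Lock()
			r.loadingChunks = false
			r.loadingChunksCond.L.Unlock()
			r.loadingChunksCond.Broadcast()
		}()
		for i := 0; i < n; i++ {
			time.Sleep(time.Millisecond) // simulated fetch
			r.chunkBytes = append(r.chunkBytes, make([]byte, 16))
		}
	}()
}

// Close blocks only while the flag is set; the loop rechecks it after each wake-up.
func (r *reader) Close() int {
	r.loadingChunksCond.L.Lock()
	for r.loadingChunks {
		r.loadingChunksCond.Wait()
	}
	r.loadingChunksCond.L.Unlock()

	released := len(r.chunkBytes)
	r.chunkBytes = nil
	return released
}

func main() {
	r := newReader()
	r.load(4)
	fmt.Println("released chunks:", r.Close())
}
```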

Member Author

Mhm, so maybe we can merge this now and clean up later to unblock the release?

Member

Let's go ahead with this then, and iterate on it. 🙂

Comment on lines +3181 to +3184
r.loadingChunksMtx.Lock()
r.loadingChunks = false
r.finishLoadingChks = make(chan struct{})
r.loadingChunksMtx.Unlock()
Contributor

Suggested change
- r.loadingChunksMtx.Lock()
- r.loadingChunks = false
- r.finishLoadingChks = make(chan struct{})
- r.loadingChunksMtx.Unlock()
+ r.loadingChunksCond = sync.NewCond(&sync.Mutex{})

No need to initialize r.loadingChunks as the default value for a bool is false.

Comment on lines +3188 to +3196
// NOTE(GiedriusS): we need to wait until loading chunks because loading
// chunks modifies r.block.chunkPool.
r.loadingChunksMtx.Lock()
loadingChks := r.loadingChunks
r.loadingChunksMtx.Unlock()

if loadingChks {
    <-r.finishLoadingChks
}
Contributor

@douglascamata douglascamata Aug 10, 2023

Suggested change
- // NOTE(GiedriusS): we need to wait until loading chunks because loading
- // chunks modifies r.block.chunkPool.
- r.loadingChunksMtx.Lock()
- loadingChks := r.loadingChunks
- r.loadingChunksMtx.Unlock()
- if loadingChks {
-     <-r.finishLoadingChks
- }
+ // Locks the condition and wait for a signal.
+ r.loadingChunksCond.L.Lock()
+ if r.loadingChunks {
+     r.loadingChunksCond.Wait()
+ }
+ r.loadingChunksCond.L.Unlock()

Comment on lines +3221 to +3232
r.loadingChunksMtx.Lock()
r.loadingChunks = true
r.loadingChunksMtx.Unlock()

defer func() {
    r.loadingChunksMtx.Lock()
    r.loadingChunks = false
    r.loadingChunksMtx.Unlock()

    close(r.finishLoadingChks)
}()

Contributor

@douglascamata douglascamata Aug 10, 2023

Suggested change
- r.loadingChunksMtx.Lock()
- r.loadingChunks = true
- r.loadingChunksMtx.Unlock()
- defer func() {
-     r.loadingChunksMtx.Lock()
-     r.loadingChunks = false
-     r.loadingChunksMtx.Unlock()
-     close(r.finishLoadingChks)
- }()
+ // when done loading, signal to anyone waiting on chunks to be loaded.
+ defer r.loadingChunksCond.Signal()

Could use r.loadingChunksCond.Broadcast() here if there could be multiple goroutines waiting for chunks to be loaded.

@saswatamcode saswatamcode merged commit 51da039 into main Aug 16, 2023
harsh-ps-2003 pushed a commit to harsh-ps-2003/thanos that referenced this pull request Aug 22, 2023
4 participants