perf: allow concurrent decompress away from network loop #162
base: master
Conversation
We want to ensure the transcode passes everything that is expected to run asynchronously over to the decode loop. In general, memcached calls us back with gotData, then receivedStatus, then complete. We use gotData to launch the work onto the transcode threadpool and return control to memcached, which then immediately calls receivedStatus.

Previously, receivedStatus and complete were set up to interact and set a value on the underlying future, but only by synchronously blocking on the transcode future. Since that callback fires almost immediately after the gotData callback, we were launching the transcode and nearly always performing a blocking get, which triggers a synchronous decompression on the network IO loop. This is of course very detrimental to EVCache driver performance, since the driver cannot even accept new requests to issue to memcached backends, and must wait until decompression completes.

In this fix, we rearrange things a little to ensure that if async decode is requested, the completion status updates are pushed to happen only after the async decode completes. This is a little ugly because of the current arrangement of the memcached decoder. A future change might overhaul this integration, pulling it out of the memcached transcode framework and using something a bit more friendly.
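A minimal sketch of the rearrangement described above, using plain `java.util.concurrent` types rather than the actual EVCache/memcached callback interfaces (names like `tcPool` and the inline callbacks are illustrative): gotData hands the decode to the transcode pool and returns immediately, and instead of blocking on the decode future from receivedStatus/complete, the completion is chained to fire only after the decode finishes.

```java
import java.util.concurrent.*;

public class DeferredDecodeSketch {
    public static void main(String[] args) throws Exception {
        // stand-in for the transcode threadpool (tcService in the driver)
        ExecutorService tcPool = Executors.newFixedThreadPool(2);

        // stand-in for the operation's underlying future
        CompletableFuture<String> result = new CompletableFuture<>();

        byte[] raw = "compressed-bytes".getBytes();

        // gotData: launch the decode on the transcode pool and return
        // control to the memcached IO loop immediately
        CompletableFuture<String> decode =
            CompletableFuture.supplyAsync(() -> new String(raw), tcPool);

        // receivedStatus/complete: instead of decode.get() (which would
        // block the network IO loop on decompression), push the completion
        // to happen only after the async decode finishes
        decode.whenComplete((value, err) -> {
            if (err != null) result.completeExceptionally(err);
            else result.complete(value);
        });

        // a caller waiting on the operation's future sees the decoded value
        System.out.println(result.get(5, TimeUnit.SECONDS));
        tcPool.shutdown();
    }
}
```

The key point is that the IO-loop thread never calls a blocking get on the decode future; it only registers a continuation and moves on to the next request.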
This change is tricky, actually: since it slightly changes timing, it makes the system more scalable without tuning, but a little less efficient. It needs some consideration. Maybe an alternative is to always decode in-thread and encourage users to tune poolSize much higher.
if (isWrongKeyReturned(key, k)) return;
boolean shouldLog = log.isDebugEnabled() && client.getPool().getEVCacheClientPoolManager().shouldLog(appName);
boolean alwaysSync = false;
should be exposed via a fp to go via async path?
fixed in 2b60f17
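The reviewer's suggestion above is to gate the sync-vs-async decode path behind a fast property. A hedged sketch of what such a toggle could look like (the property name, method names, and the boolean plumbing here are illustrative, not the actual EVCache fast-property API): when the flag is off, decode runs on the caller's thread; when on, it is handed off to the transcode pool.

```java
import java.util.concurrent.*;
import java.util.function.Supplier;

public class DecodePathToggle {
    // Hypothetical toggle; in the driver this would come from a fast
    // property rather than a plain boolean parameter.
    static <T> CompletableFuture<T> decode(boolean asyncDecode,
                                           Supplier<T> decodeFn,
                                           ExecutorService tcPool) {
        if (asyncDecode) {
            // scalable path: decompression happens off the network IO loop
            return CompletableFuture.supplyAsync(decodeFn, tcPool);
        }
        // efficient path: decode in-thread, no handoff or wakeup cost
        return CompletableFuture.completedFuture(decodeFn.get());
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        System.out.println(decode(true, () -> "async", pool).get());
        System.out.println(decode(false, () -> "sync", pool).get());
        pool.shutdown();
    }
}
```

This matches the tradeoff discussed earlier: the async path scales better under decompression-heavy load, while the in-thread path is cheaper per operation when the pool is tuned large enough.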
dataSizeDS.record(data.length);
Transcoder<T> transcoder = (tc == null) ? (Transcoder<T>) getTranscoder() : tc;
if (tcService == null) {
    log.error("tcService is null, will not be able to decode");
should we do latch.countDown() here?
fixed in 2b60f17, I restructured it too. Hopefully that reduces the chance of bugs like this.
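The bug class being discussed here is a dropped latch countDown on an error path: if `tcService` is null and the method returns early without counting down, anything awaiting the latch hangs. A minimal sketch of the defensive pattern, assuming a `CountDownLatch` guards completion (the error message mirrors the log line in the diff; the surrounding structure is illustrative):

```java
import java.util.concurrent.CountDownLatch;

public class LatchSafety {
    public static void main(String[] args) throws Exception {
        CountDownLatch latch = new CountDownLatch(1);
        try {
            // decode work that may fail, e.g. a missing transcode service
            throw new IllegalStateException("tcService is null");
        } catch (IllegalStateException e) {
            System.out.println("decode failed: " + e.getMessage());
        } finally {
            // always release waiters, even on early-exit or error paths,
            // so callers blocked on await() cannot hang forever
            latch.countDown();
        }
        latch.await(); // returns immediately: the latch was counted down
        System.out.println("latch released");
    }
}
```

Putting the countDown in a `finally` block (or centralizing completion-state transitions, as the restructure apparently does) makes it much harder to leak a waiter on a path like the null-`tcService` branch.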
…ecode Previously had a concurrency issue and may have dropped some latch countDowns. A refactor to track state differently and tidy up a little helps. Add a property to control sync decode threading behavior.