perf: allow concurrent decompress away from network loop #162

Open · wants to merge 2 commits into master
Conversation

jasonk000 (Member)

We want to ensure the transcode path hands everything that is expected to run asynchronously over to the decode loop. In general, memcached calls us back with gotData, then receivedStatus, then complete. We use gotData to launch the work onto the transcode threadpool and return control to memcached, which then immediately calls receivedStatus.

Previously, receivedStatus and complete were set up to interact and set a value on the underlying future, but only by synchronously blocking on the transcode future. Since this callback fires almost immediately after the gotData callback, we were launching the transcode and then nearly always performing a blocking get, which triggers a synchronous decompression on the network IO loop. This is of course very detrimental to EVCache driver performance, since the driver cannot even accept new requests to issue to the memcached backends, and must wait until decompression completes.

In this fix, we rearrange things a little to ensure that, if the async decode is requested, the completion status updates are pushed to happen only after the async decode completes. This is a little ugly because of the current arrangement of the memcached decoder. A future change might be to overhaul this integration, pull it out of the memcached transcode framework, and use something a bit more friendly.

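As a rough illustration of the pattern described above (a minimal sketch with hypothetical class and method names, not the actual EVCache code), the fix amounts to chaining the future's completion onto the decode task instead of blocking the IO thread for it:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Minimal sketch, assuming a CompletableFuture-based result; all names here
// are hypothetical stand-ins for the memcached/EVCache integration.
class AsyncDecodeSketch<T> {
    interface Decoder<T> { T decode(byte[] raw); }

    private final ExecutorService transcodePool = Executors.newFixedThreadPool(4);
    private final CompletableFuture<T> result = new CompletableFuture<>();

    // gotData: launch the decode on the transcode pool and return immediately,
    // so the network IO loop never blocks on decompression.
    void gotData(byte[] raw, Decoder<T> decoder) {
        CompletableFuture.supplyAsync(() -> decoder.decode(raw), transcodePool)
            .whenComplete((value, err) -> {
                // completion is signalled only after the async decode finishes
                if (err != null) result.completeExceptionally(err);
                else result.complete(value);
            });
    }

    // receivedStatus/complete: no blocking get() on the decode future here.
    void receivedStatus() { /* record status only */ }
    void complete()       { /* nothing left to wait for */ }
}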
jasonk000 (Member, Author)

This change is tricky, actually: because it slightly changes timing, it makes the system more scalable without tuning, but a little less efficient. It needs some consideration. Maybe an alternative is to always decode in-thread and encourage users to tune poolSize much higher.


if (isWrongKeyReturned(key, k)) return;
boolean shouldLog = log.isDebugEnabled() && client.getPool().getEVCacheClientPoolManager().shouldLog(appName);
boolean alwaysSync = false;
Contributor

Should this be exposed via a fp (fast property) to go via the async path?

jasonk000 (Member, Author)

Fixed in 2b60f17.
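A property gate along these lines is what the follow-up commit adds; this is only a sketch with a hypothetical property name and a plain System.getProperty stand-in for the real configuration mechanism in 2b60f17:

// Hypothetical property gate; the real property name and config API are in 2b60f17.
boolean alwaysSync = Boolean.parseBoolean(
        System.getProperty("evcache.decode.alwaysSync", "false"));

if (alwaysSync) {
    // decode inline on the calling thread (the old behaviour)
} else {
    // hand the decode off to the transcode threadpool (the async path)
}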

dataSizeDS.record(data.length);
Transcoder<T> transcoder = (tc == null) ? (Transcoder<T>) getTranscoder() : tc;
if (tcService == null) {
    log.error("tcService is null, will not be able to decode");
Contributor

Should we do latch.countDown() here?

jasonk000 (Member, Author)

Fixed in 2b60f17, and I restructured it too; hopefully that reduces the chance of bugs like this.
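The failure mode the reviewer is pointing at is a missed countDown on the error path, which leaves waiters blocked forever. A minimal sketch of the latch-safe shape (hypothetical names, not the actual restructure in 2b60f17):

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;

// Minimal sketch: every exit path, including the tcService == null error
// path, releases the latch exactly once.
class LatchSafeDecode {
    void decode(ExecutorService tcService, byte[] data, CountDownLatch latch) {
        if (tcService == null) {
            System.err.println("tcService is null, will not be able to decode");
            latch.countDown(); // the error path must still release the latch
            return;
        }
        tcService.submit(() -> {
            try {
                // ... decode data here ...
            } finally {
                latch.countDown(); // success and failure both release the latch
            }
        });
    }
}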

…ecode

Previously this had a concurrency issue and may have dropped some latch countDowns. A refactor to track state differently and tidy up a little helps.

Add a property to control sync decode threading behavior