
Refactor the producer, part 1 #549

Merged: 1 commit merged into master from producer-refactor-part-1 on Oct 5, 2015
Conversation

@eapache (Contributor) commented on Sep 30, 2015

First stand-alone chunk extracted from #544. This introduces and uses a
`produceSet` structure which takes care of collecting messages, rejecting ones
which don't encode, and turning the rest into a `ProduceRequest`.

This has several knock-on effects:

- we no longer nil out array entries, so nil checks in several places (e.g.
  `returnErrors`) can be removed
- `parseResponse` gets re-indented since it now loops over the partitions in the
  set via a callback rather than a pair of nested loops
- `groupAndFilter` is much simpler; a lot of its logic now lives in the `produceSet`
- the flusher has to use the set to return errors/successes/retries, rather than
  `batch`, which may now contain messages not in the eventual request
- `keyCache` and `valueCache` can be removed
@wvanbergen cc @kvs

```go
for i := range msgs {
	msgs[i].Offset = block.Offset + int64(i)
}
f.parent.returnSuccesses(msgs)
```
Contributor:
Maybe we can return a success immediately after setting the offset? That would prevent a second iteration over the message collection. It might be a micro-optimization that is not worth it.

eapache (Author):
Yeah, not right now anyway; we'd have to split a `returnSuccess` method out of `returnSuccesses` too.

@wvanbergen (Contributor):
This changeset is much easier to swallow. 👍

eapache added a commit that referenced this pull request Oct 5, 2015
@eapache eapache merged commit f0e1e8a into master Oct 5, 2015
@eapache eapache deleted the producer-refactor-part-1 branch October 5, 2015 16:33
eapache added a commit that referenced this pull request Oct 9, 2015
Second stand-alone chunk extracted from #544 (first chunk: #549). This uses the
`produceSet` struct in the aggregator as well, and moves the `wouldOverflow` and
`readyToFlush` methods to methods on the `produceSet`.

Knock-on effects:
 - now that we do per-partition size tracking in the aggregator we can do much
   more precise overflow checking (see the compressed-message-batch-size-limit
   case in `wouldOverflow` which has changed) which will be more efficient in
   high-volume scenarios
 - since the produceSet encodes immediately, messages which fail to encode are
   now rejected from the aggregator and don't count towards batch size
 - we still have to iterate the messages in the flusher in order to reject those
   which need retrying due to the state machine; for simplicity I add them to a
   second produceSet still, which means all messages get encoded twice; this is
   a definite major performance regression which will go away again in part 3
   of this refactor
eapache added a commit that referenced this pull request Oct 14, 2015