Remove extra channel hop from source output path #10912
Labels: domain: performance, domain: sources, domain: topology, type: enhancement
The way that sources currently plug into the topology is a bit of a special case. They are passed a `SourceSender` when they're built, which is essentially a glorified channel `Sender`. The corresponding `Receiver` for that channel lives in what we call a "pump" task, which has the simple job of shoveling events written to that channel into the source's `Fanout` instance. The `Fanout` instance itself contains all of the inputs configured to draw from that particular source. An "input" in this case refers to what's basically another glorified channel `Sender` whose `Receiver` feeds the downstream component task.

So at a high level, the path of an event out of a source looks something like:

source -> `SourceSender` channel -> pump task -> `Fanout` -> input channel -> downstream component task
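
To make the two hops concrete, here is a minimal sketch of the path described above. It is not Vector's actual code: `std::sync::mpsc` channels and a plain thread stand in for the async channels and the pump task, the `SourceSender` and `Fanout` structs are simplified stand-ins, and `spawn_pump` and `run_pipeline` are hypothetical helper names.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

type Event = String;

/// Simplified stand-in for `SourceSender`: a thin wrapper around a channel sender.
#[derive(Clone)]
struct SourceSender {
    inner: Sender<Event>,
}

impl SourceSender {
    fn send(&self, event: Event) {
        // Hop 1: source -> pump task.
        self.inner.send(event).expect("pump task hung up");
    }
}

/// Simplified stand-in for `Fanout`: one channel sender per configured input.
struct Fanout {
    inputs: Vec<Sender<Event>>,
}

impl Fanout {
    fn send(&self, event: Event) {
        // Hop 2: fanout -> every downstream component's input channel.
        for input in &self.inputs {
            input.send(event.clone()).expect("downstream component hung up");
        }
    }
}

/// The "pump" (hypothetical helper): drains the source's channel into its fanout.
fn spawn_pump(rx: Receiver<Event>, fanout: Fanout) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        for event in rx {
            fanout.send(event);
        }
    })
}

/// Wires one source to `n` downstream inputs and returns what each input received.
fn run_pipeline(events: Vec<Event>, n: usize) -> Vec<Vec<Event>> {
    let (source_tx, pump_rx) = channel();
    let (input_txs, input_rxs): (Vec<_>, Vec<_>) = (0..n).map(|_| channel()).unzip();
    let pump = spawn_pump(pump_rx, Fanout { inputs: input_txs });

    let source = SourceSender { inner: source_tx };
    for event in events {
        source.send(event);
    }
    drop(source); // closing the channel lets the pump loop exit
    pump.join().expect("pump panicked");

    input_rxs.into_iter().map(|rx| rx.try_iter().collect()).collect()
}
```

Every event crosses two channels and an extra task before a downstream component sees it, which is the overhead this issue is about.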
This has a number of downsides:

- `RunningTopology` struct, adding complexity and making it harder to reason about

One potential solution here is to embed the fanout directly within `SourceSender`. This would remove the extra hop and the extra task, hopefully improving efficiency and making sources less of a special case in how they integrate with the topology.

That change would require quite a bit of work to
`Fanout`, mostly revolving around the requirement that `SourceSender` be `Clone`. Internally, `Fanout` is mostly a collection of `futures::Sink`s and a `ControlMessage` channel receiver. The sinks can be addressed relatively easily by changing the bound to `CloneableSink` from the buffers crate (they're already channel senders in the common case, which are `Clone`, or `TapSink`, which can easily be made so).

The channel receiver is a bit more complex. Its purpose is to receive
`ControlMessage`s that add or remove sinks during config reloads. It's currently an unbounded channel to account for the fact that, in theory, an unbounded number of reloads could occur between reads of the channel, which currently happen only when an event is received. We could simply switch it to an unbounded broadcast channel if we found a suitable implementation (I haven't seen one), or a bounded broadcast channel with a bound high enough to be effectively unbounded (this might be ok, reloads are rare), but it could be worth taking the time here to rethink the pattern more deeply.

There are some warts that make the current fanout implementation confusing:

- `Sink` trait
- `i` struct member for tracking sink index across polls
- `Replace` control message

It's possible that by addressing those warts in some way we end up in a situation where we no longer need or want an unbounded broadcast channel in order to make `Fanout` cloneable, but the required adjustments to the mechanics of config reloading are probably outside the scope of this issue.
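
For illustration only, here is one way the proposed shape could look: a cloneable `SourceSender` that owns its set of sinks and learns about reloads through a broadcast-style control channel. This is a sketch under assumptions, not a design for Vector. `ControlHandle`, `ControlState`, `subscribe`, and `demo` are hypothetical names, `std::sync::mpsc` stands in for the real sinks, the `ControlMessage` enum is reduced to `Add` and `Remove`, and the "unbounded broadcast" is emulated by giving each clone its own unbounded queue.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};

type Event = String;

/// Reduced stand-in for the fanout's control messages.
#[derive(Clone)]
enum ControlMessage {
    Add(String, Sender<Event>),
    Remove(String),
}

/// Hypothetical shared state: the current sink set plus one queue per sender clone.
#[derive(Default)]
struct ControlState {
    sinks: HashMap<String, Sender<Event>>,
    subscribers: Vec<Sender<ControlMessage>>,
}

/// Hypothetical handle the topology would keep to announce reload changes.
#[derive(Clone, Default)]
struct ControlHandle(Arc<Mutex<ControlState>>);

impl ControlHandle {
    /// Records the change and queues it for every live `SourceSender` clone.
    fn broadcast(&self, msg: ControlMessage) {
        let mut state = self.0.lock().unwrap();
        match &msg {
            ControlMessage::Add(id, sink) => {
                state.sinks.insert(id.clone(), sink.clone());
            }
            ControlMessage::Remove(id) => {
                state.sinks.remove(id);
            }
        }
        // Forget subscribers whose `SourceSender` has been dropped.
        state.subscribers.retain(|tx| tx.send(msg.clone()).is_ok());
    }

    /// Snapshot and subscription happen under one lock, so no message is missed.
    fn subscribe(&self) -> SourceSender {
        let mut state = self.0.lock().unwrap();
        let (tx, control_rx) = channel();
        state.subscribers.push(tx);
        SourceSender { sinks: state.sinks.clone(), control_rx, control: self.clone() }
    }
}

/// Sketch of a `SourceSender` with the fanout embedded: no pump task, one hop.
struct SourceSender {
    sinks: HashMap<String, Sender<Event>>,
    control_rx: Receiver<ControlMessage>,
    control: ControlHandle,
}

impl Clone for SourceSender {
    fn clone(&self) -> Self {
        self.control.subscribe()
    }
}

impl SourceSender {
    fn send(&mut self, event: Event) {
        // As in the current fanout, control messages are only read when an event arrives.
        for msg in self.control_rx.try_iter() {
            match msg {
                ControlMessage::Add(id, sink) => {
                    self.sinks.insert(id, sink);
                }
                ControlMessage::Remove(id) => {
                    self.sinks.remove(&id);
                }
            }
        }
        for sink in self.sinks.values() {
            let _ = sink.send(event.clone()); // ignore sinks that have hung up
        }
    }
}

/// Walks through two clones observing the same sequence of reload changes.
fn demo() -> (Vec<Event>, Vec<Event>) {
    let control = ControlHandle::default();
    let mut a = control.subscribe();
    let mut b = a.clone();
    let (tx1, rx1) = channel();
    let (tx2, rx2) = channel();
    control.broadcast(ControlMessage::Add("sink1".into(), tx1));
    a.send("one".into());
    control.broadcast(ControlMessage::Add("sink2".into(), tx2));
    b.send("two".into());
    control.broadcast(ControlMessage::Remove("sink1".into()));
    a.send("three".into());
    (rx1.try_iter().collect(), rx2.try_iter().collect())
}
```

Per-clone queues sidestep the search for an unbounded broadcast channel, at the cost of a shared lock on clone and on reload. Both are rare compared with sending events, which takes no lock in this sketch.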