-
Notifications
You must be signed in to change notification settings - Fork 13.8k
[FLINK-12555] Introduce an encapsulated metric group layout for shuffle API #8485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The new metric group layout is still to discuss. |
|
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. DetailsThe Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commandsThe @flinkbot bot supports the following commands:
|
|
@flinkbot attention @zentol @zhijiangW |
zentol
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for the deprecation in the documentation, could you write down a list of all metrics that are being modified, with the original and modified scope?
flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java
Outdated
Show resolved
Hide resolved
|
|
||
| @Override | ||
| public void incNumBytesInLocalCounter(long inc) { | ||
| super.incNumBytesInLocalCounter(inc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's quite easy to encapsulate this logic in a Counter implementation, which would require significantly less overall changes and would no longer introduce a new special case on how metrics are being used.
public static class CounterWrapper implements Counter {
private final Counter counter1;
private final Counter counter2;
private CounterWrapper(Counter counter1, Counter counter2) {
this.counter1 = counter1;
this.counter2 = counter2;
}
@Override
public void inc() {
counter1.inc();
counter2.inc();
}
@Override
public void inc(long n) {
counter1.inc(n);
counter2.inc(n);
}
@Override
public void dec() {
counter1.dec();
counter2.dec();
}
@Override
public void dec(long n) {
counter1.dec(n);
counter2.dec(n);
}
@Override
public long getCount() {
// assume that the counters are not accessed directly elsewhere
return counter1.getCount();
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true, switched to the counter wrapper
|
Thanks @zentol ! addressed comments |
|
|
||
| @Override | ||
| public long getCount() { | ||
| // assume that the counters are not accessed directly elsewhere |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure whether getCounter would be actually used. If used we should keep the return counter as previous structure. That means calling `new InputChannelMetrics(parentGroup, networkGroup) instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for opening this PR @azagrebin !
TBH I also do not like the way of involving in multiple metric related parameters when introducing ShuffleService#createResultPartitionWriters/InputGates, and we also talked about this issue before. My previous thought and way was also providing only a parent metric in creation methods, and make create Network and buffers groups inside createResultPartitionWriters/InputGates . But the concerns were that the metric groups are created more than once which was not consistent with previous way and this is also relied on the internal implementations of metric framework, so we gave up that way then.
But in this PR we still create the Network metric group twice which stills the same as my previous thought. I am not very clear what is the new thoughts behind this?
Actually there are two issues in this PR:
-
One is avoiding multiple metric parameters in creation of partition/gate. And if we think there are no concerns for creating one metric group twice, we could only pass the
metrics.getIOMetricGroup()to replace all, and create thebuffersandNetworkgroups twice insidecreatResultPartitionWriters/InputGatesseparately. The previous metric structure is no need to change for this motivation. -
Another is we want to change the previous metric structure for shuffle service, that means make previous
buffersgroup under theNetworkgroup. I agree with this point which seems more clearly within the same root parent architecture.
I think it is better to make the first issue as a separate hotfix commit.
There are concerns; it prints a warning to the user as it is not how the API is supposed to be used. If any components need access to a shared group then this group should be created once and passed around as needed. |
|
Thanks for the replies @zentol . But in this PR |
|
True, thanks for noticing it @zhijiangW, it was not an intent. I will change it to create group only once. It can be called e.g. |
|
@zentol
|
|
I'd prefer if the infix were The PR also needs a rebase. |
docs/monitoring/metrics.md
Outdated
| <th rowspan="12"><strong>Task</strong></th> | ||
| <td>numBytesInLocal</td> | ||
| <td>The total number of bytes this task has read from a local source.</td> | ||
| <td>The total number of bytes this task has read from a local source. Deprecated: see NettyShuffle.</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's remove the description entirely and put a nice attention icon before the deprecation notice.
docs/monitoring/metrics.md
Outdated
| </tbody> | ||
| </table> | ||
|
|
||
| ### NettyShuffle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do users actually know what NettyShuffle is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True users do not know about NettyShuffle, we will still have only one at the moment.
It could be Default network shuffle then.
134fef6 to
97144b2
Compare
| final MetricGroup networkGroup = metrics.getIOMetricGroup().addGroup("Network"); | ||
| final MetricGroup outputGroup = networkGroup.addGroup("Output"); | ||
| final MetricGroup inputGroup = networkGroup.addGroup("Input"); | ||
| MetricGroup taskShuffleMetricGroup = networkEnvironment.createTaskMetricGroup(metrics.getIOMetricGroup()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it mean ShuffleEnvironment interface need provide the method createTaskMetricGroup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My idea was to expose TaskShuffleContext ShuffleEnvironment.createTaskShuffleContext(taskName, execId, parentMetricGroup) in ShuffleEnvironment . TaskShuffleContext would contain taskName, execId and created in/output metric group as well. createPartition/Gate methods would take the TaskShuffleContext. This way ShuffleEnvironment can create its own metric group once and other task common parameters are encapsulated.
| // similar to MetricUtils.instantiateNetworkMetrics() but inside this IOMetricGroup | ||
| final MetricGroup networkGroup = metricGroup.addGroup("Network"); | ||
| final MetricGroup outputGroup = networkGroup.addGroup("Output"); | ||
| final MetricGroup inputGroup = networkGroup.addGroup("Input"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use METRIC_GROUP_OUTPUT and METRIC_GROUP_INPUT instead of Output and Input, also might define a class var for Network.
|
|
||
| private static final String METRIC_GROUP_NETWORK = "Network"; | ||
| @SuppressWarnings("DeprecatedIsStillUsed") | ||
| @Deprecated private static final String METRIC_GROUP_NETWORK_DEPRECATED = "Network"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put @Deprecated in separate line?
| private static final String METRIC_INPUT_QUEUE_LENGTH = "inputQueueLength"; | ||
| private static final String METRIC_INPUT_POOL_USAGE = "inPoolUsage"; | ||
|
|
||
| private NettyShuffleMetricFactory() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This constructor seems not necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an indicator that it is a utility stateless class with only static methods which is not supposed to be instantiated.
|
Thanks for the updates @azagrebin ! |
083b0e4 to
46901b2
Compare
docs/monitoring/metrics.md
Outdated
| <td>Gauge</td> | ||
| </tr> | ||
| <tr> | ||
| <td rowspan="2">Shuffle.Netty.Output.buffers</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
buffers is lower-case here, but upper-case above for input
|
|
||
| // we will have to check type of shuffle service later whether it is NetworkEnvironment | ||
| //noinspection deprecation | ||
| networkEnvironment.registerLegacyNetworkMetrics(metrics.getIOMetricGroup(), resultPartitionWriters, gates); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we intend to remove this in the next release It should be fine to use a cast here, so that we don't have to add such a short-lived method to the interface.
|
Thanks for the reviews @zentol @zhijiangW, I've addressed comments |
| final MetricGroup outputGroup = networkGroup.addGroup("Output"); | ||
| final MetricGroup inputGroup = networkGroup.addGroup("Input"); | ||
| ShuffleIOOwnerContext taskShuffleContext = shuffleEnvironment | ||
| .createShuffleIOOwnerContext(taskNameWithSubtaskAndId, executionId, metrics.getIOMetricGroup()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The formatting : shuffleEnvironment.createShuffleIOOwnerContext in one line and make every parameter in separate line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case we do not break the arg list, the whole function is on the new line, same as chained calls.
| .createShuffleIOOwnerContext(taskNameWithSubtaskAndId, executionId, metrics.getIOMetricGroup()); | ||
|
|
||
| // produced intermediate result partitions | ||
| final ResultPartitionWriter[] resultPartitionWriters = shuffleEnvironment.createResultPartitionWriters( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove final for resultPartitionWriters for consistent with other temp vars?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this file actually looks like more using finals atm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the vars of taskShuffleContext and following gates are not defined as final. ATM only taskNameWithSubtaskAndId and resultPartitionWriters are final. Or we could make the new added taskShuffleContext as final to keep consistency.
|
|
||
| // create the reader and writer structures | ||
|
|
||
| final String taskNameWithSubtaskAndId = taskNameWithSubtask + " (" + executionId + ')'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might remove this local var to use taskNameWithSubtask + " (" + executionId + ')' directly below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep it to not mix concerns in one line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could agree your way to keep it. :)
My previous concern is that taskNameWithSubtaskAndId was used in many places before, so it worths defining a local var. Now it is only used in one place and we could reduce one line to make task constructor short to remove it.
Another small consideration is avoiding final issue as above mentioned.
| * <p>This method has to be called only once to avoid duplicated internal metric group registration. | ||
| * | ||
| * @param ownerName the owner name, used for logs | ||
| * @param executionAttemptID execution attempt id of the producer or consumer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
of the producer or consumer -> of the owner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment we have exec id only for tasks/executions, in this context producer or consumer. I thought about owner as a potentially any shuffle IO component within task/execution, possibly even source/sink
| checkNotNull(ownerName), | ||
| checkNotNull(executionAttemptID), | ||
| parentGroup, | ||
| nettyGroup.addGroup(METRIC_GROUP_INPUT), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to put these two addGroup covered by above createShuffleIOOwnerMetricGroup then return tuple2 to reference here because all the metrics related operations should be done inside NettyShuffleMetricFactory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about it, I think the operation is too primitive atm for an extra utility method. Besides, I think that in general tuple2 is very adhoc and not so readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I also do not like the way of tuple.
But we provide the method of createShuffleIOOwnerMetricGroup in factory, and it is only used for adding group simple. So it might bring confused that which addGroup should be done inside factory and which should be done outside. The scope/rule seems not clear.
| registerOutputMetrics( | ||
| config.isNetworkDetailedMetrics(), | ||
| outputMetricGroup, | ||
| outputMetricGroup.addGroup(METRIC_GROUP_BUFFERS), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be better to cover all the metric related details in NettyShuffleMetricFactory. I mean the logic of outputMetricGroup.addGroup(METRIC_GROUP_BUFFERS) here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add overloaded methods for this in NettyShuffleMetricFactory.
|
|
||
| MetricGroup networkInputGroup = ownerContext.getInputGroup(); | ||
| @SuppressWarnings("deprecation") | ||
| InputChannelMetrics inputChannelMetrics = new InputChannelMetrics(networkInputGroup, ownerContext.getParentGroup()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could delay the creation of InputChannelMetrics in the SingleInputGateFactory#createInputChannels, then the following would be
SingleInputGate inputGate = singleInputGateFactory.create(ownerContext, icdd, partitionProducerStateProvider) instead.
To do so we could make the current createInputGates simple and make all the things done when required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The gate creation loop will create InputChannelMetrics multiple times, I would avoid this because InputChannelMetrics are shared by gates at the moment.
| registerInputMetrics( | ||
| config.isNetworkDetailedMetrics(), | ||
| networkInputGroup, | ||
| networkInputGroup.addGroup(METRIC_GROUP_BUFFERS), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto: for the issue of networkInputGroup.addGroup(METRIC_GROUP_BUFFERS)
|
Thanks for the updates @azagrebin . |
b23888c to
8f13b9d
Compare
|
Thanks @zhijiangW , I have addressed comments |
| * @param parentGroup parent of shuffle specific metric group | ||
| * @return context of the shuffle input/output owner used to create partitions or gates belonging to the owner | ||
| */ | ||
| ShuffleIOOwnerContext createShuffleIOOwnerContext( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be useful the partition lifecycle management if this method would also accept the JobID..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shuffle service does not need it at the moment, would be nice actually to keep it decoupled from the job id, I would suggest we discuss it at the corresponding partition lifecycle PR and add it if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see this discussion then: #8687 (comment)
| * @param resultPartitionDeploymentDescriptors descriptors of the partition, produced by the owner | ||
| * @return collection of the {@link ResultPartitionWriter ResultPartitionWriters} | ||
| */ | ||
| Collection<P> createResultPartitionWriters( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have you considered moving this method and createInputGates into the ShuffleIOOwnerContext, and only pass this into the task instead of the entire ShuffleEnvironment? The task only requires access to these 2 methods, while any other users of the ShuffleEnvironment doesn't need these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting idea, we can consider it as a follow-up issue.
|
Rebased on #8680 |
…le API and deprecate old one
What is the purpose of the change
At the moment, partition/gate create methods in NetworkEnvironment have a lot of metrics arguments to maintain original layout for metric groups. This approach is not quite encapsulated and clean for shuffle API. We can have just one parent group for shuffle metrics. The old layout can be still maintained in parallel and deprecated. At the moment we can do it with a couple of casts (if shuffle implementation is NetworkEnvironment) and adding an additional legacy metric registration which can be removed later.
Brief change log
NetworkEnvironment.createResultPartitionWriters/createInputGatesto have only one parent metric group argument.NetworkEnvironment.registerLegacyNetworkMetricsTaskof legacy code but add call toNetworkEnvironment.registerLegacyNetworkMetricsafter creation of partitions and gates (later needs instanceOf check for actual ShuffleService)Verifying this change
simple refactoring, existing tests
Does this pull request potentially affect one of the following parts:
@Public(Evolving): (yes)Documentation