Skip to content

Conversation

@azagrebin
Copy link
Contributor

@azagrebin azagrebin commented Jun 24, 2019

What is the purpose of the change

ResultPartitionDeploymentDescriptor#releasedOnConsumption shows the intention how the partition is going to be used by the shuffle user and released. The ShuffleDescriptor should provide a way to query which release type is supported by shuffle service for this partition. If the requested release type is not supported by the shuffle service for a certain type of partition, the job should fail fast.

Brief change log

  • Introduce ShuffleDescriptor#ReleaseType.AUTO and MANUAL
  • Introduce ShuffleDescriptor#getSupportedReleaseTypes
  • Add assertion whether the auto-release on consumption or manual are supported for the requested ResultPartitionDeploymentDescriptor#releaseType
  • adjust tests add test for the assertion

Verifying this change

Simple refactoring covered by existing tests, plus ResultPartitionDeploymentDescriptorTest#testIncompatibleReleaseTypeManual

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not applicable)

@flinkbot
Copy link
Collaborator

flinkbot commented Jun 24, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❗ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@azagrebin
Copy link
Contributor Author

@flinkbot attention @zentol @zhijiangW

Copy link
Contributor

@zentol zentol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments.

boolean isReleasedOnConsumption,
boolean isBlockingPartition) {
Preconditions.checkArgument(
!isReleasedOnConsumption || isBlockingPartition,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long-term we probably want to change this a bit; this kind of setup forces users of the shuffle service (i.e. the ones creating a PartitionDescriptor) to be fully aware of it's capabilities.

Basically, both the partition and shuffle descriptor describe an is-state; this partition must be released on consumption, and we expect the shuffle service to fail if it can't fulfill that.

I'd like to see this changed at some point so that the partition descriptor (or some other form of input) describes a wish instead, to facilitate something like:

Scheduler: "hey shuffle service, if you could persist this partition, that would be great."
ShuffleService: "Sorry kiddo, this one must be released on consumption."
Scheduler: "oh alright then, I'll just adjust my scheduling."

The usefulness of this can be easily shown with a simple though experiment: assume there exists a shuffle service implementation that supports pipelined partitions that are consumable multiple times.
Currently, it would be impossible for the runtime to ever make use of this flag, so long as we don't resort to type checks.

@azagrebin azagrebin closed this Jun 25, 2019
@azagrebin azagrebin reopened this Jun 25, 2019
@azagrebin azagrebin force-pushed the FLINK-12960 branch 2 times, most recently from 3281582 to 99fe0ab Compare June 25, 2019 09:56
releasedOnConsumption ?
shuffleDescriptor.getSupportedReleaseTypes().contains(ReleaseType.AUTO) :
shuffleDescriptor.getSupportedReleaseTypes().contains(ReleaseType.MANUAL),
"Release on consumption <%s> is not supported by the shuffle service for this partition, " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's include the result partition ID as well

* Manually release the partition, the partition has to support consumption multiple times.
*
* <p>The partition requires manual actions to release it once all consumption is done:
* {@link ShuffleMaster#releasePartitionExternally(ShuffleDescriptor)} and {@link ShuffleEnvironment#releasePartitionsLocally(Collection)}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is slight ambiguous as it isn't whether the local resource condition applies to just the ShuffleEnvironment call or both of them.
I'd suggest something along the lines of ShuffleMaster[...] and, if [...], ShuffleEnvironment [...]

sendScheduleOrUpdateConsumersMessage,
// Later we might have to make the scheduling adjust automatically
// if certain release type is not supported by shuffle service implementation at hand
partitionDescriptor.getPartitionType() != ResultPartitionType.BLOCKING);
Copy link
Contributor

@zhijiangW zhijiangW Jun 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in future the releasedOnConsumption could be refactored into one ReleaseStrategy as proposed in #8804 , then it could be configured which strategy is used in practice (not only limited to ResultPartitionType) and check whether this config is consistent with shuffle implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, the PR just leaves it as it was before at the moment

int maxParallelism,
boolean sendScheduleOrUpdateConsumersMessage,
boolean releasedOnConsumption) {
checkReleaseOnConsumptionIsSupportedForPartition(shuffleDescriptor, releasedOnConsumption);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checkNotNull(partitionDescriptor) should be put here.

Optional<ResourceID> storesLocalResourcesOn();

/**
* Return release types supported by Shuffle Service for this partition.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove for this partition.
by this shuffle service.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why? if we leave by this shuffle service then it might sound like for all partitions but this method returns supported release types for certain partition.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might exist two options:

  • One ShuffleService implementation might provide different ShuffleDescriptor implementations for different partitions with different release types. Then the previous comment makes sense.

  • One ShuffleService implementation would only have one kind of ShuffleDescriptor and provide one enum of release types for all the partitions. Just like atm we only have NettyShuffleDescriptor implementation which provides the same release types for all partitions. So my above comment suggestion was based on this. The release type is ShuffleService global level suitable for all partitions.

Copy link
Contributor

@zentol zentol Jun 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The NettyShuffleDescriptor does not have the same release types for all partitions. For non-blocking partitions we do not support ReleaseType.MANUAL.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I have no other concerns here now.

* <p>This call triggers release of any resources which are occupied by the given partition in the external systems
* outside of the producer executor. This is mostly relevant for the batch jobs and blocking result partitions.
* This method is not called if {@link ResultPartitionDeploymentDescriptor#isReleasedOnConsumption()} is {@code true}.
* The partition has to support the {@link ReleaseType#MANUAL} in {@link ShuffleDescriptor#getSupportedReleaseTypes()}.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory different ShuffleManager implementations could choose which release strategy is supported, we should not limit all the implementations must support one type, otherwise we could make this limitation reflected in interface methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal of PR is to let shuffle services decide what to support in ShuffleDescriptor#getSupportedReleaseTypes. This doc comment line is related to when the manual release is possible (releasePartitionLocally/releasePartitionExternally).

Or do you mean, that we implicitly assume at the moment that ReleaseType#MANUAL has to be always supported for blocking partitions? For this, we have assertion ResultPartitionDeploymentDescriptor#checkReleaseOnConsumptionIsSupportedForPartition which will fail the job fast. It is true, we have to make scheduling/release strategies automatically adjust later if something is not supported as written in the comment to the hardcoded releasedOnConsumption = partitionDescriptor.getPartitionType() != ResultPartitionType.BLOCKING.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I have not noticed that this javadoc change was for the method of releasePartitionsExternally, if so I agree it makes sense that this method implementation is for MANUL release.
Before I thought the new info was added on the class javadoc, that means all the ShuffleMaster implementation must support MANUL release type, so I pointed out that concern.

Copy link
Contributor

@zhijiangW zhijiangW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for opening this PR @azagrebin !

I only reviewed the overall codes and left some minor comments, might further think it if possible.

@azagrebin
Copy link
Contributor Author

Thanks for the review @zentol @zhijiangW ! I updated the PR.

createCopyAndVerifyResultPartitionDeploymentDescriptor(shuffleDescriptor);

ShuffleDescriptor shuffleDescriptorCopy = copy.getShuffleDescriptor();
ShuffleDescriptor shuffleDescriptorCopy = CommonTestUtils.createCopySerializable(shuffleDescriptor);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change should be a separate hotfix

1,
0),
NettyShuffleDescriptorBuilder.newBuilder().buildLocal(),
NettyShuffleDescriptorBuilder.newBuilder().setBlocking(true).buildLocal(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be proper for using partitionType.isBlocking() instead of direct true

return this;
}

public NettyShuffleDescriptorBuilder setBlocking(boolean blocking) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setIsBlocking(boolean isBlocking) ?

…est serialization of UnknownShuffleDescriptor without ResultPartitionDeploymentDescriptor
* by {@link ShuffleMaster#releasePartitionExternally(ShuffleDescriptor)}
* and {@link ShuffleEnvironment#releasePartitionsLocally(Collection)}.
*
* <p>The partition has to support the corresponding {@link ReleaseType} in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be better to indicate which value true/false should support which releaseType, otherwise it seems not clear the mapping between this value and release type.

@zhijiangW
Copy link
Contributor

@azagrebin I have no other concerns except some nit comments.

@azagrebin azagrebin changed the title [FLINK-12960][coordination][shuffle] Move ResultPartitionDeploymentDescriptor#releasedOnConsumption to PartitionDescriptor#releasedOnConsumption [FLINK-12960] Introduce ShuffleDescriptor#ReleaseType and ShuffleDescriptor#getSupportedReleaseTypes Jun 26, 2019
…aseType and ShuffleDescriptor#getSupportedReleaseTypes

`ResultPartitionDeploymentDescriptor#releasedOnConsumption` shows the intention how the partition is going to be used by the shuffle user and released. The `ShuffleDescriptor` should provide a way to query which release type is supported by shuffle service for this partition. If the requested release type is not supported by the shuffle service for a certain type of partition, the job should fail fast.
@zentol zentol merged commit d1ce804 into apache:master Jun 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants