-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
YARN-11736. Enhance MultiNodeLookupPolicy to allow configuration of extended comparators for better usability. #7121
base: trunk
Are you sure you want to change the base?
Conversation
…xtended comparators for better usability.
💔 -1 overall
This message was automatically generated. |
@yangwwei @sunilgovind @shameersss1 Could you please help to review this PR? Thanks! |
…th the same policy class and improve UT.
💔 -1 overall
This message was automatically generated. |
Hi, @yangjiandan It seems that you were using and contributing for multi-node mechanism recently, could you please help to review this PR? Thanks. |
Sure, will review this week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add the modification about the configs and new class here : https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html (you can find the changes to be done for this page in the hadoop code base itself) so that the end users know about its existence.
...he/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
Show resolved
Hide resolved
...he/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
Show resolved
Hide resolved
private String policyClassName; | ||
private long sortingInterval; | ||
|
||
public MultiNodePolicySpec(String policyClassName, long timeout) { | ||
public MultiNodePolicySpec(String policyName, String policyClassName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why both policyName and policyClassName is required here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In MultiNodeSorter#initPolicy, Policy instance will be created based on policyClassName, and policyName will be used to get conf belong to this policy instance in MultiComparatorPolicy#setConf. This is the only way for every policy instance to know which configuration belong to it, another way is to update the policy interface that I prefer not to use. If there are better approaches, feel free to propose them.
.../java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/placement/MultiNodeSorter.java
Outdated
Show resolved
Hide resolved
} | ||
Configuration policyConf = new Configuration(this.getConfig()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we reuse config object instead of creating new one ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In MultiNodeSortingManager#createAllPolicies, we can see all the MultiNodeSorter instances owns a shared config, policyName will be set in policyConf, which is a instance-level configuration, so that policyInstance can get the configurations belong to itself.
// conf keys and default values | ||
public static final String COMPARATORS_CONF_KEY = "comparators"; | ||
protected static final List<Comparator> DEFAULT_COMPARATORS = Collections | ||
.unmodifiableList(Arrays.asList( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this means the default sorting policy is based on resource utilization, the node having less resource utilization will be given priority
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, default comparators will only be used when there's no comparators configured for this policy or configured incorrectly. Because I believe optimizing the workload distribution among nodes is the primary use case.
} | ||
this.conf = conf; | ||
String policyName = conf.get( | ||
CapacitySchedulerConfiguration.MULTI_NODE_SORTING_POLICY_CURRENT_NAME); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the difference between MULTI_NODE_SORTING_POLICY_CURRENT_NAME
and ``MULTI_NODE_SORTING_POLICY`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CurrentlyMULTI_NODE_SORTING_POLICY_NAME
is the prefix of all the configurations of multi-node policy: yarn.scheduler.capacity.multi-node-sorting.policy
, it's not a proper name but used in many places so I prefer not to update it. MULTI_NODE_SORTING_POLICY_CURRENT_NAME
is used to transfer the policyName to policy instance.
@shameersss1 There is no introduce about multi-node policy in the document yet, therefore I am unable to update or add any further information. |
@TaoYang526 @shameersss1 I am eager to help, even though I’m not very familiar with this part of YARN. I will do my best. If we can confirm together that the modified code is fine, we can proceed with merging it. I've reviewed @shameersss1 code, and the quality is quite good. |
@slfan1989 Thank you for joining us, it's great to have you help out. |
@shameersss1 Thanks for the review. I have added javadoc for key fields in the last commit. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
@slfan1989 Could you please help to review this PR? Thanks. |
Oh, I will we encoutered the similar problems about multi node placement about different resource spec. And I found some bugs about this features, please refer to: https://zuston.vercel.app/publish/hadoop-yarn/ |
BTW, I think the node sorting policy could be extended by ourself, there is no necessary to change the default policy. |
@zuston Thanks for the feedback.
You are right, I have proposed to fix those bugs in YARN-9598, but it was in dispute after some discussions, and part of it were merged into community in YARN-11573 which you mentioned in your article.
This PR doesn't change the default policy, just add a new policy can be configured to use. |
Thanks for pointing out this, let me take a deep look. Oh, I found the assoicated code. But it may still exist some bugs that my article described. |
@TaoYang526 - Isn't this - https://issues.apache.org/jira/browse/YARN-11728 Same problem as well ? |
Yes, this issue is created by me. |
After reading the In the below refered code, it just will pickup reserved container from another node, but obviously it don't unreserve this in the commit phase, that's the root cause of this bug. From my sight, this bug also exist in the case of multi node placement disable, which is just covered up by the normal node round-robin strategy. FYI, it looks the mechansim of reserved container picked up by another node is not implemented will. Lines 581 to 621 in 964e089
|
@zuston This FYI is just a kind reminder because I thought you are trying to find out the logic about how to allocate on another node for reserved container, according to your article. The bug is still there, and I also mentioned that YARN-9598 tried to fix it but only part of it is merged. It's great if you can help to solve that bug. |
Description of PR
Please refer to JIRA: YARN-11736
How was this patch tested?
UT
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?