Added error or warning facilitation for unscheduled algs. #1734
/deploy |
Adding nodes resource for error message info, WIP.
/deploy |
Added ability for pipeline-driver to detect warnings vs errors, upon unScheduling
/deploy |
/deploy |
/deploy |
Error or warning distinction, format fixes, node graph object updates concerning batch use-case. |
Reviewed all commit messages.
Reviewable status: 0 of 7 files reviewed, 4 unresolved discussions (waiting on @RonShvarz)
core/pipeline-driver/lib/datastore/graph-store.js
line 128 at r5 (raw file):
_handleBatch(n) {
    const isAnyBatchFailed = n.batch.any(b => b.status === 'FailedScheduling');
anyFailedScheduling
Code quote:
isAnyBatchFailed
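As background for the check quoted above: if `n.batch` is a plain JavaScript array, the native method for "does any element match" is `Array.prototype.some`; arrays have no built-in `any`. Below is a minimal sketch using the suggested name `anyFailedScheduling`. The node shape and the standalone function are assumptions for illustration, not the actual `graph-store.js` code.

```javascript
// Hypothetical node shape: { batch: [{ status: string }, ...] }.
// Returns true when at least one batch item failed scheduling.
function anyFailedScheduling(node) {
    const batch = node.batch || []; // tolerate nodes without a batch
    return batch.some(b => b.status === 'FailedScheduling');
}

// Illustrative data only.
const node = { batch: [{ status: 'active' }, { status: 'FailedScheduling' }] };
console.log(anyFailedScheduling(node));          // true
console.log(anyFailedScheduling({ batch: [] })); // false
```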
core/pipeline-driver/lib/state/state-manager.js
line 121 at r5 (raw file):
if (resources && resources[0] && resources[0].unScheduledAlgorithms) {
    const algorithms = { ...resources[0].unScheduledAlgorithms, ...resources[0].ignoredUnscheduledAlgorithms };
    const nodesFromEtcd = resources[0].nodes;
clusterNodes
Code quote:
nodesFromEtcd
core/pipeline-driver/lib/tasks/task-runner.js
line 123 at r5 (raw file):
n.status = event.reason;
n.warnings = n.warnings || [];
const { resourceMessage, isError } = this._nodeResourceMessageBuilder(event, nodesFromEtcd);
buildNodeRe....
Code quote:
_nodeResourceMessageBuilder
core/pipeline-driver/lib/tasks/task-runner.js
line 156 at r5 (raw file):
else {
    resourceMessage += `${unScheduledAlg.complexResourceDescriptor.numUnmatchedNodesBySelector} nodes don't match node selector: '${selectors}',\n`;
    ({ resourceMessage, isError } = this._specificNodesResourceMessageBuilder(unScheduledAlg, resourceMessage, isError, nodesFromEtcd));
buildSpecif....
Code quote:
specificNodesResourceMessageBuilde
Reviewable status: 0 of 7 files reviewed, 5 unresolved discussions (waiting on @RonShvarz)
core/pipeline-driver/lib/tasks/task-runner.js
line 191 at r5 (raw file):
resourceMessage += `${k} ,\n`;
const currentValue = resourceIdentifier.get(k);
resourceIdentifier.set(k, currentValue + 1);
BreachCountPerResource
Code quote:
resourceIdentifier
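To illustrate what a name like `BreachCountPerResource` conveys, here is a minimal sketch of a `Map` that counts how often each resource is reported as breached. The helper `countBreaches` and the sample resource names are hypothetical and not taken from `task-runner.js`. The sketch defaults a missing key to 0, which the quoted lines would also need if the map is not pre-seeded.

```javascript
// Hypothetical helper: counts how many times each resource appears
// in a list of breached resources.
function countBreaches(breachedResources) {
    const breachCountPerResource = new Map();
    for (const resource of breachedResources) {
        // Default to 0 so the first increment does not produce NaN.
        const current = breachCountPerResource.get(resource) || 0;
        breachCountPerResource.set(resource, current + 1);
    }
    return breachCountPerResource;
}

// Illustrative data only.
const counts = countBreaches(['cpu', 'mem', 'cpu']);
console.log(counts.get('cpu')); // 2
console.log(counts.get('mem')); // 1
```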
CR changes and fixes, Robust over-capacity warning, Unit tests overhaul
/deploy |
/deploy |
…hen it is not the first in the running order
/deploy |
Reviewable status: 0 of 7 files reviewed, 5 unresolved discussions (waiting on @golanha)
core/pipeline-driver/lib/datastore/graph-store.js
line 128 at r5 (raw file):
Previously, golanha (Golan Hallel) wrote…
anyFailedScheduling
Done.
core/pipeline-driver/lib/state/state-manager.js
line 121 at r5 (raw file):
Previously, golanha (Golan Hallel) wrote…
clusterNodes
Done.
core/pipeline-driver/lib/tasks/task-runner.js
line 123 at r5 (raw file):
Previously, golanha (Golan Hallel) wrote…
buildNodeRe....
Done.
core/pipeline-driver/lib/tasks/task-runner.js
line 156 at r5 (raw file):
Previously, golanha (Golan Hallel) wrote…
buildSpecif....
Done.
core/pipeline-driver/lib/tasks/task-runner.js
line 191 at r5 (raw file):
Previously, golanha (Golan Hallel) wrote…
BreachCountPerResource
Done.
Reviewed 2 of 6 files at r3, 3 of 4 files at r6, 1 of 1 files at r7, 1 of 1 files at r8, 1 of 1 files at r9, all commit messages.
Reviewable status: complete! All files reviewed, all discussions resolved (waiting on @RonShvarz)
* Added error or warning facilitation for unscheduled algs.
* Fixed unit tests, Adding nodes resource for error message info, WIP.
* Edited capital letters in task-executer, Added ability for pipeline-driver to detect warnings vs errors, upon unScheduling
* Added warnings and errors in case a node has batches in it.
* clearing the node object if there are no failures to schedule
* Fix wrong function usage @_handleBatch, CR changes and fixes, Robust over-capacity warning, Unit tests overhaul
* Round up values of missing resources to 2 decimal points.
* handle use-case of node stuck in prescheduling with failing batches when it is not the first in the running order
* fixed formatting for resources .... bump version [skip ci]
#1638