Better Error Messages - When Pod Creating #1474

aronchick · 2019-06-08T00:09:33Z

I ran several pipelines serially, and the problem was that the containers each needed to bind a PVC. The PVCs were "read only one", so subsequent containers sat around pending, waiting for the PVC to be released. It would be great if there were a better error message here than.

elikatsis · 2019-06-10T11:30:18Z

Hello David.

That's weird. First of all, a PVC's access mode can be one of the following (per the official documentation):

ReadWriteOnce (the volume can be mounted as read-write by a single node)
ReadOnlyMany (the volume can be mounted read-only by many nodes)
ReadWriteMany (the volume can be mounted as read-write by many nodes)

You probably mean that your PVCs were RWO, right?
If so, there shouldn't be any pods pending because of that. Instead, they should just be scheduled on the same node (adding load to that node).

On the other hand, if you mean that the access mode was ReadOnlyMany (ROX), that would allow pods to be scheduled on any node.
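
To make the access-mode reasoning above concrete, here is a minimal Python sketch of the rule as described in this comment. It is a simplified model for illustration only, not Kubernetes' actual scheduling or attach logic; the names `AccessMode` and `can_mount`, and the node names in the usage lines, are hypothetical.

```python
from enum import Enum


class AccessMode(Enum):
    RWO = "ReadWriteOnce"
    ROX = "ReadOnlyMany"
    RWX = "ReadWriteMany"


def can_mount(mode, nodes_in_use, candidate_node, read_only=False):
    """Simplified check: may a pod on candidate_node mount the volume?

    nodes_in_use is the set of node names that already have the volume mounted.
    This models only the per-node rule quoted above (an assumption, not real
    Kubernetes code).
    """
    if mode is AccessMode.RWO:
        # Only one node may mount the volume; pods on that same node can share it.
        return not nodes_in_use or nodes_in_use == {candidate_node}
    if mode is AccessMode.ROX:
        # Any node may mount, but only read-only.
        return read_only
    # ReadWriteMany: any node, read-write.
    return True


# A second pod on the same node can share an RWO volume; one on another node cannot.
same_node = can_mount(AccessMode.RWO, {"node-a"}, "node-a")
other_node = can_mount(AccessMode.RWO, {"node-a"}, "node-b")
```

Under this model, a second pod only conflicts with an RWO volume when it lands on a different node from the one already using it.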

aronchick · 2019-06-19T21:49:31Z

The PVC is RWO, and the second pod that tried to start (both were TFJobs) would fail and crash because the PVC was not mountable. It would keep doing so for many minutes, going into CrashLoopBackOff every time, until the first job finished and released the PVC, and the second pod (eventually) restarted and picked it up.
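
As a rough illustration of the kind of message being asked for, the sketch below scans a pod's events for volume-related reasons and condenses them into one line. It is only a sketch under stated assumptions: the function name `summarize_volume_problems`, the event dictionaries, and the sample pod name and messages are all hypothetical and are not taken from this issue or from KFP. `FailedAttachVolume` and `FailedMount` are used as the reasons to look for, on the assumption that these are what the cluster emits in this situation.

```python
# Event reasons assumed to indicate a volume problem (an assumption for this sketch).
VOLUME_REASONS = {"FailedAttachVolume", "FailedMount"}


def summarize_volume_problems(pod_name, events):
    """Return a one-line explanation if any event points at a volume problem.

    events is a list of dicts with "reason" and "message" keys (hypothetical
    shape). Returns None when no volume-related event is found.
    """
    hits = [e for e in events if e.get("reason") in VOLUME_REASONS]
    if not hits:
        return None
    details = "; ".join(f'{e["reason"]}: {e["message"]}' for e in hits)
    return f"Pod {pod_name} cannot start because its volume is not mountable ({details})"


# Hypothetical sample input, made up for illustration.
sample_events = [
    {"reason": "Scheduled", "message": "assigned to a node"},
    {"reason": "FailedMount", "message": "volume is in use by another pod"},
]
summary = summarize_volume_problems("example-pod", sample_events)
```

Surfacing a summary like this next to the failed step would point at the PVC directly instead of leaving only the CrashLoopBackOff status to go on.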

jessiezcc · 2019-07-09T17:34:34Z

@IronPan, should this error handling happen at component level or pipeline level?

stale · 2020-06-25T22:59:51Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Bobgy · 2020-06-30T00:57:13Z

It's now possible to check pod YAML and events directly from the KFP UI: #3304

jessiezcc assigned IronPan Jul 9, 2019

jessiezcc added the area/troubleshoot label Aug 1, 2019

stale bot added the lifecycle/stale label Jun 25, 2020

Bobgy closed this as completed Jun 30, 2020
