Skip to content

Conversation

@Yikun
Copy link
Member

@Yikun Yikun commented Mar 11, 2022

What changes were proposed in this pull request?

Check resource after yaml resource creation to make sure resource exsiting.

Why are the changes needed?

The yaml create is async, we'd better to make sure the resource can be got from server.

Does this PR introduce any user-facing change?

No

How was this patch tested?

[info] VolcanoSuite:
[info] - Run SparkPi with volcano scheduler (11 seconds, 468 milliseconds)
[info] - SPARK-38187: Run SparkPi Jobs with minCPU (30 seconds, 636 milliseconds)
[info] - SPARK-38187: Run SparkPi Jobs with minMemory (31 seconds, 544 milliseconds)
[info] - SPARK-38188: Run SparkPi jobs with 2 queues (only 1 enabled) (13 seconds, 333 milliseconds)
[info] - SPARK-38188: Run SparkPi jobs with 2 queues (all enabled) (25 seconds, 486 milliseconds)
[info] - SPARK-38423: Run driver job to validate priority order (16 seconds, 488 milliseconds)
[info] Run completed in 2 minutes, 14 seconds.
[info] Total number of tests run: 6
[info] Suites: completed 2, aborted 0
[info] Tests: succeeded 6, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 163 s (02:43), completed 2022-3-11 21:40:53

@Yikun
Copy link
Member Author

Yikun commented Mar 11, 2022

The test passed also based on #35819

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please hold on until we can decide what volcano version we can depend on.

For a record, the old cluster with old volcano system works fine.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although it looks more protective, what happens for the following?

  1. deleteYAMLResource is also async and we don't do double-check.
  2. This is using createOrReplace.
  3. Given (1) and (2), we don't know it's replaced or not at line 116.

In addition, what is the behavior of Volcano, for the pod group submission with non-existing queue name?

  • If Volcano hangs, we don't need this patch at all.
  • If Volcano assigns the PodGroup to default queue, it sounds like a bug to me.

@Yikun
Copy link
Member Author

Yikun commented Mar 12, 2022

deleteYAMLResource is also async and we don't do double-check.

deleteYAMLResource is sync operation, I think it is need, to make sure keep the independence of each test. Just like other test resource cleanup in K8S IT.

In addition, what is the behavior of Volcano, for the pod group submission with non-existing queue name?

Just like native sheduler, the driver will be pending status.

If Volcano hangs, we don't need this patch at all.

Yes, without this patch, all tests are also passed as before. This patch is not only for queue (such as we also create priorities resource), It's just for more protective in my view. so, I will give +1 for your choice, merge or not merge is both fine for me. : )

@dongjoon-hyun
Copy link
Member

Thank you. In that case, let me close this because pending is enough. Sorry for closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants