Skip to content

Conversation

@sryza
Copy link
Contributor

@sryza sryza commented Aug 16, 2014

...d queue doesn't exist

@SparkQA
Copy link

SparkQA commented Aug 16, 2014

QA tests have started for PR 1984 at commit e33a35e.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Aug 16, 2014

QA tests have finished for PR 1984 at commit e33a35e.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class Serializer
    • abstract class SerializerInstance
    • abstract class SerializationStream
    • abstract class DeserializationStream
    • class ShuffleBlockManager(blockManager: BlockManager,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the culprit is that args.amQueue is null. Should we do a null check on that instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is it safe to call appContext.setQueue(null) in L81? (and setApplicationName)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What makes you think that? amQueue gets set with:

var amQueue = sparkConf.get("QUEUE", "default")

So I don't think it can end up null.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. Many of the other arguments in ClientArguments were initialized to null so I thought this was the same. Then it's fine.

(On an unrelated note initializing the queue to this value makes little sense to me. We don't set the QUEUE attribute anywhere in the code, and creating a fresh SparkConf won't actually pick this config up because it is not prefixed with "spark". This seems to be always initialized to "default")

@andrewor14
Copy link
Contributor

Hi @sryza, thanks for fixing this. I am actually somewhat confused by this because this code has apparently been around for a while, and we have not seen this issue until recently. Do you know what has been added recently that might have caused this to surface?

Also it would be good to fix the equivalent in org/apache/spark/yarn/alpha/ as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this question is more "what do other apps do?" than anything else. But it sounds like the requested queue not existing would be reason for a more explicit failure?

In the very least I'd promote this to logWarning(), but I'm actually thinking just failing the app, since the requested queue not existing might lead to unwanted behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have also run into this before and the message isn't very user friendly as to why the job failed. The job isn't going to run so I think logError and throwing an exception and exiting would be better. It would actually be nice to move this up to validateArgs but I don't want to have to look it up twice either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not uncommon for YARN to place an app into a queue different than the one requested. The Fair Scheduler has policies for this and this Capacity Scheduler is about to add them. So I don't think failing the job is the right move. On further thought, it actually might make sense to avoid reporting this queue info at all until we can get it from the queue that the app actually ends up in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the fair scheduler placement policies. Placement policies can be configured to ignore the queue passed in (or to only ignore it if "default" is specified).

@sryza
Copy link
Contributor Author

sryza commented Aug 18, 2014

Posted a patch that removes the queue resources log message entirely

@SparkQA
Copy link

SparkQA commented Aug 18, 2014

QA tests have started for PR 1984 at commit fe08c37.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Aug 18, 2014

QA tests have finished for PR 1984 at commit fe08c37.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Copy link
Contributor

The changes lgtm.

asfgit pushed a commit that referenced this pull request Aug 28, 2014
…ste...

...d queue doesn't exist

Author: Sandy Ryza <sandy@cloudera.com>

Closes #1984 from sryza/sandy-spark-3082 and squashes the following commits:

fe08c37 [Sandy Ryza] Remove log message entirely
85253ad [Sandy Ryza] SPARK-3082. yarn.Client.logClusterResourceDetails throws NPE if requested queue doesn't exist

(cherry picked from commit 92af231)
Signed-off-by: Andrew Or <andrewor14@gmail.com>
@asfgit asfgit closed this in 92af231 Aug 28, 2014
@andrewor14
Copy link
Contributor

Thanks, I've merged this into master and 1.1.

xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
…ste...

...d queue doesn't exist

Author: Sandy Ryza <sandy@cloudera.com>

Closes apache#1984 from sryza/sandy-spark-3082 and squashes the following commits:

fe08c37 [Sandy Ryza] Remove log message entirely
85253ad [Sandy Ryza] SPARK-3082. yarn.Client.logClusterResourceDetails throws NPE if requested queue doesn't exist
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants