@andrewor14
Contributor

... simply because the code is missing!

@tgravescs
Contributor

Thanks, I hadn't had a chance to finish this. I'll try out your changes on my bad history file to make sure they work.

@andrewor14
Contributor Author

Great, thanks Tom.

@tgravescs
Contributor

Well, your change fixes that error, and it now displays the file on the screen. But it does throw an exception. I think it's due to the missing fields in the history file for the job ID.

org.json4s.package$MappingException: Did not find value which can be converted into int

@tgravescs
Contributor

+1 pending jenkins.

@andrewor14
Contributor Author

I see, because in your event log we only logged the name TaskCommitDenied but not the job ID and other details. That makes sense.

@SparkQA

SparkQA commented Sep 18, 2015

Test build #42690 has finished for PR 8828 at commit f144147.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 18, 2015

Test build #42703 has started for PR 8828 at commit 29001f9.

@tgravescs
Contributor

Yeah, and actually I want to do one more verification to make sure the rest of the history file is useful. Unfortunately my network connection is slow and it's a huge history file.

@tgravescs
Contributor

OK, it finally loaded. So the history UI for that task reports it's still RUNNING, since it got the error parsing it. I guess that is OK. Ideally it would still show the end and the task commit error even if it couldn't report the job ID, etc. Is that something we could do fairly easily?

@andrewor14
Contributor Author

Hm, unfortunately it appears that we missed two whole minor versions (1.3.0 and 1.4.0). I wonder if we should add some backward compatible handling for those versions. AFAIK they're not really consumed for any other purpose downstream so we can just put -1 as default values for all of them. What do you think about this @vanzin?

@tgravescs
Contributor

Also, since those values are missing, the duration and completed time don't show up either, which makes it difficult for users to debug their job. The particular job I was looking at ran for 7 hours, so I can't just rerun it to get the data again.

Andrew Or added 2 commits September 21, 2015 10:58
For logs that did not have the TaskCommitDenied fields, we should
fail gracefully especially since they're not even consumed
downstream by the UI. Otherwise we'll see exceptions in the
history server when parsing old logs (1.3.x, 1.4.x, 1.5.0).
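
The fallback described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the actual `JsonProtocol` change: it assumes json4s (with the Jackson backend) is on the classpath, the `Denied` case class and the `intOrDefault` helper are hypothetical, and the field names `Job ID`, `Partition ID`, and `Attempt Number` are assumed for the example. The idea is to read each field with `extractOpt[Int]` and fall back to -1, rather than calling `extract[Int]`, which throws a `MappingException` when the field is absent.

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods.parse

object TaskCommitDeniedSketch {
  implicit val formats: Formats = DefaultFormats

  // Hypothetical stand-in for the real TaskCommitDenied end reason.
  final case class Denied(jobId: Int, partitionId: Int, attemptId: Int) {
    def toErrorString: String =
      s"TaskCommitDenied (Driver denied task commit) for job: $jobId, " +
        s"partition: $partitionId, attempt: $attemptId"
  }

  // Read an optional Int field; a missing field yields -1 instead of
  // the MappingException that extract[Int] would throw.
  private def intOrDefault(json: JValue, field: String): Int =
    (json \ field).extractOpt[Int].getOrElse(-1)

  // Field names below are assumptions made for this sketch.
  def fromJson(json: JValue): Denied =
    Denied(
      intOrDefault(json, "Job ID"),
      intOrDefault(json, "Partition ID"),
      intOrDefault(json, "Attempt Number"))

  def main(args: Array[String]): Unit = {
    // An old-style log entry that only recorded the reason name.
    val oldEntry = parse("""{"Reason": "TaskCommitDenied"}""")
    println(fromJson(oldEntry).toErrorString)

    // A newer entry with all three fields present.
    val newEntry = parse(
      """{"Reason": "TaskCommitDenied", "Job ID": 3, "Partition ID": 7, "Attempt Number": 1}""")
    println(fromJson(newEntry).toErrorString)
  }
}
```

With an old-style entry, every field falls back to -1, so the task end event still parses and the UI can mark the task as finished instead of leaving it RUNNING.
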
@vanzin
Contributor

vanzin commented Sep 21, 2015

The approach looks ok to me if it works (seems this only affects jobs with speculative execution on?).

@andrewor14
Contributor Author

Yes I believe so. @tgravescs can you give it another try?

@tgravescs
Contributor

Yes, it only happens with speculation. I'll try it out.

@SparkQA

SparkQA commented Sep 21, 2015

Test build #42760 has finished for PR 8828 at commit b89eb37.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Contributor

That is much better. It reports completed times and duration for the entire application, and those tasks show up as failed with the comment: TaskCommitDenied (Driver denied task commit) for job: -1, partition: -1, attempt: -1

+1, Thanks Andrew!

@andrewor14
Contributor Author

Alright, merging into master and 1.5. Thanks everyone.

@asfgit asfgit closed this in 61d4c07 Sep 22, 2015
@andrewor14 andrewor14 deleted the task-end-reason-json branch September 22, 2015 23:50
asfgit pushed a commit that referenced this pull request Sep 22, 2015
... simply because the code is missing!

Author: Andrew Or <andrew@databricks.com>

Closes #8828 from andrewor14/task-end-reason-json.

Conflicts:
	core/src/main/scala/org/apache/spark/util/JsonProtocol.scala
	core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala
asfgit pushed a commit that referenced this pull request Sep 23, 2015
... simply because the code is missing!

Author: Andrew Or <andrew@databricks.com>

Closes #8828 from andrewor14/task-end-reason-json.

Conflicts:
	core/src/main/scala/org/apache/spark/util/JsonProtocol.scala
	core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala
ashangit pushed a commit to ashangit/spark that referenced this pull request Oct 19, 2016
... simply because the code is missing!

Author: Andrew Or <andrew@databricks.com>

Closes apache#8828 from andrewor14/task-end-reason-json.

Conflicts:
	core/src/main/scala/org/apache/spark/util/JsonProtocol.scala
	core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala

(cherry picked from commit 5ffd084)
ashangit pushed a commit to ashangit/spark that referenced this pull request Oct 19, 2016
... simply because the code is missing!

Author: Andrew Or <andrew@databricks.com>

Closes apache#8828 from andrewor14/task-end-reason-json.

Conflicts:
	core/src/main/scala/org/apache/spark/util/JsonProtocol.scala
	core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala

(cherry picked from commit 26187ab)