
Conversation

@sujith71955 (Contributor) commented Sep 27, 2018

What changes were proposed in this pull request?

As part of the insert command in FileFormatWriter, a job context is created to handle the write operation. While initializing that job context via the setupJob() API, HadoopMapReduceCommitProtocol sets the job ID in the JobContext's configuration rather than on the JobContext itself. FileFormatWriter, however, reads the job ID directly from the MapReduce JobContext, so the job ID comes out as null when the log line is written. As a solution, we should get the job ID from the configuration of the MapReduce JobContext.
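
For illustration, a minimal sketch of the pattern described above (the surrounding FileFormatWriter internals are simplified and the helper name is hypothetical):

```scala
import org.apache.hadoop.mapreduce.JobContext

// Hypothetical helper showing the issue: setupJob() stores the job ID
// only in the context's Configuration, so reading it straight off the
// JobContext yields null in this code path.
def logCommittedJob(jobContext: JobContext): Unit = {
  val directId = jobContext.getJobID // null here
  val configuredId = jobContext.getConfiguration.get("mapreduce.job.id") // set by setupJob()
  println(s"Job $configuredId committed (direct read gave: $directId).")
}
```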

How was this patch tested?

Tested manually; verified the logs after the changes.

(screenshot: verification logs for SPARK-25521)

@sujith71955 (Contributor, Author) commented Sep 27, 2018

cc @cloud-fan @srowen
Please review and let me know if you have any suggestions. Thanks!

@srowen (Member) commented Sep 27, 2018

Is the value logged here always null?
I am not sure it's meaningful to log mapreduce.job.id, especially given its name. If there's no meaningful job ID here, do we care about it at all? How about deleting the log?
SparkHadoopWriter does something similar.

@sujith71955 (Contributor, Author) commented Sep 27, 2018

> Is the value logged here always null?

Yes, it always prints null.

> I am not sure if it's meaningful to log mapreduce.job.id, especially given its name. If there's no meaningful job ID here do we care about it at all? How about deleting the log? SparkHadoopWriter does something similar.

Initially I thought the same; I am not sure whether mapreduce.job.id makes sense here, but I think we should not display null. Deleting the log would be the easiest option, but I am curious why the author tried to log a MapReduce job ID in the first place.

@sujith71955 (Contributor, Author) commented:

cc @gatorsmile

@cloud-fan (Contributor) commented on the diff:

We should log something here, but mapreduce.job.id is not useful. How about description.uuid?
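
For instance, a sketch of what the changed log line could look like (assuming `description` is the WriteJobDescription in scope in FileFormatWriter and `logInfo` comes from Spark's Logging trait):

```scala
// Log the write job's UUID instead of the (null) MapReduce job ID.
logInfo(s"Write Job ${description.uuid} committed.")
```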

@srowen (Member) commented on the diff:

SparkHadoopWriter needs a similar change, then, BTW

@sujith71955 (Contributor, Author) commented on the diff, Sep 28, 2018:

Thanks for the suggestions! I will update this PR.
@cloud-fan Yes, I think displaying description.uuid makes more sense, since users can then track the status of their particular write job.
I will also update the message to say "Write Job" instead of "Job"; I hope that's fine.

@sujith71955 (Contributor, Author) commented:

@srowen @cloud-fan
I was testing the SparkHadoopWriter flow with the steps below, and I could see the job ID printed properly in the log. Is it fine to update this flow as well with description.uuid? Attaching a snapshot of the logs from the SparkHadoopWriter flow:
```scala
// Read a CSV file with the new Hadoop API, then write it back out via
// saveAsNewAPIHadoopDataset, which exercises the SparkHadoopWriter path.
val rdd = spark.sparkContext.newAPIHadoopFile(
  "D:/data/x.csv",
  classOf[org.apache.hadoop.mapreduce.lib.input.NLineInputFormat],
  classOf[org.apache.hadoop.io.LongWritable],
  classOf[org.apache.hadoop.io.Text])

val hconf = spark.sparkContext.hadoopConfiguration
hconf.set("mapreduce.output.fileoutputformat.outputdir", "D:/data/test")

rdd.saveAsNewAPIHadoopDataset(hconf)
```

(screenshot: SparkHadoopWriter logs with the job ID printed)

@sujith71955 (Contributor, Author) commented Oct 3, 2018

When I dug into the code, I saw that in SparkHadoopWriter the job ID is initialized while the job context itself is created. Let me know if there are any suggestions.

(screenshot: SparkHadoopWriter code where the job ID is initialized)
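
For reference, a minimal sketch of that initialization pattern (simplified from the Spark internals; the jobTrackerId format is an assumption):

```scala
import java.text.SimpleDateFormat
import java.util.{Date, Locale}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.mapreduce.JobID
import org.apache.hadoop.mapreduce.task.JobContextImpl

// The job context is built with an explicit JobID up front, so
// getJobID is non-null in this flow, unlike the FileFormatWriter path.
val jobTrackerId = new SimpleDateFormat("yyyyMMddHHmmss", Locale.US).format(new Date())
val jobId = new JobID(jobTrackerId, 0)
val jobContext = new JobContextImpl(new Configuration(), jobId)
println(jobContext.getJobID) // e.g. job_20181003120000_0000
```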

@cloud-fan (Contributor) commented:

Can we update the PR to use description.uuid first?

…mmand Job is finished.

## What changes were proposed in this pull request?
As part of the insert command in FileFormatWriter, a job context is created to handle the write operation. While initializing the job context, the setupJob() API in HadoopMapReduceCommitProtocol sets the job ID in the JobContext's configuration. Since we read the job ID directly from the MapReduce JobContext, it comes out as null in the logs. As a solution, we get the job ID from the configuration of the MapReduce JobContext.

## How was this patch tested?
Tested manually; verified the logs after the changes.
@sujith71955 (Contributor, Author) commented:

> Can we update the PR to use description.uuid first?

Updated FileFormatWriter with description.uuid; attaching the verification snapshot.
(screenshot: verification logs showing the write job UUID)


Let me know whether we should also update the SparkHadoopWriter.scala flow. In that flow the job ID is currently displayed properly, and displaying the job description UUID there would need more exploration, since that flow does not hold a WriteJobDescription instance.

@cloud-fan (Contributor) commented:

ok to test

@SparkQA commented Oct 4, 2018

Test build #96925 has finished for PR 22572 at commit 56c5ff5.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor) commented:

retest this please

@SparkQA commented Oct 4, 2018

Test build #96938 has finished for PR 22572 at commit 56c5ff5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@asfgit closed this in 4597007 on Oct 5, 2018
@cloud-fan (Contributor) commented:

thanks, merging to master/2.4!

asfgit pushed a commit that referenced this pull request Oct 5, 2018
…ommand Job is finished.

## What changes were proposed in this pull request?
As part of the insert command in FileFormatWriter, a job context is created to handle the write operation. While initializing the job context using the setupJob() API in HadoopMapReduceCommitProtocol, we set the job ID in the JobContext's configuration. In FileFormatWriter we were reading the job ID directly from the MapReduce JobContext, so the job ID came out as null when logging. As a solution, we get the job ID from the configuration of the MapReduce JobContext.

## How was this patch tested?
Tested manually; verified the logs after the changes.

![spark-25521 1](https://user-images.githubusercontent.com/12999161/46164933-e95ab700-c2ac-11e8-88e9-49fa5100b872.PNG)

Closes #22572 from sujith71955/master_log_issue.

Authored-by: s71955 <sujithchacko.2010@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 4597007)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
…ommand Job is finished.
zzcclp added a commit to zzcclp/spark that referenced this pull request Sep 20, 2019
…to command Job is finished. apache#22572