
Conversation

@rohitagarwal003
Contributor

No description provided.

@andrewor14
Contributor

ok to test @vanzin

Contributor

you can avoid all mutability with

val tasks: Seq[Future[_]] = logInfos.grouped(20).map { batch =>
  replayExecutor.submit(new Runnable {
    override def run(): Unit = mergeApplicationListing(batch)
  })
}.toSeq

(I know the change to grouped is unrelated, but every time I look at this code it confuses me for a second why we have overlapping sliding windows -- might as well clean it up while you are messing around here)
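For readers unfamiliar with the distinction being cleaned up here: `sliding(n)` with its default step of 1 produces overlapping windows, while `grouped(n)` (equivalent to `sliding(n, n)`) partitions the sequence into disjoint batches. A minimal illustration, unrelated to the actual HistoryServer code:

```scala
object GroupedVsSliding {
  def demo(): (List[Seq[Int]], List[Seq[Int]]) = {
    val xs = Seq(1, 2, 3, 4, 5)
    // sliding(2): overlapping windows of size 2, stepping by 1
    val overlapping = xs.sliding(2).toList
    // grouped(2): disjoint batches of (at most) 2 elements
    val disjoint = xs.grouped(2).toList
    (overlapping, disjoint)
  }
}
```

With disjoint batches, each log file is replayed by exactly one task, which is what the batching in this PR wants.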

Contributor Author

Thanks! I am a Scala noob - I did it the Java way. :-)
Your suggestion looks much cleaner - I have updated the PR.

@squito
Contributor

squito commented Aug 13, 2015

minor style comment, otherwise makes sense to me

@SparkQA

SparkQA commented Aug 13, 2015

Test build #40716 has finished for PR 8153 at commit bdb5955.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class QRDecomposition[QType, RType](Q: QType, R: RType)

Contributor

Since we're talking about cleaning this up, you could do this also:

logInfos.grouped(20)
  .map { batch =>
    replayExecutor.submit(new Runnable {
      override def run(): Unit = mergeApplicationListing(batch)
    })
  }
  .foreach { task =>
    // Wait for all tasks to finish. This makes sure that checkForLogs is
    // not scheduled again while some tasks are already running in the
    // replayExecutor.
    try {
      task.get()
    } catch {
      case e: InterruptedException =>
        throw e
      case e: Exception =>
        logWarning("Error replaying logs.", e)
    }
  }

Note I added some missing exception handling; without it, an error would put you back in the old behavior of piling up executions whenever a replay failed.
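The submit-then-wait pattern being discussed can be sketched in isolation. This is a standalone sketch, not the actual HistoryServer code: `runBatches` and the integer "log infos" are stand-ins, and the `Runnable` body substitutes for `mergeApplicationListing(batch)`.

```scala
import java.util.concurrent.{Executors, TimeUnit}
import scala.collection.mutable

object BatchSketch {
  // Process `logInfos` in disjoint batches of `batchSize` on a thread pool,
  // blocking until every batch task has finished (so the next poll cannot
  // start while replays are still running). Returns the processed batch sizes.
  def runBatches(logInfos: Seq[Int], batchSize: Int): Seq[Int] = {
    val processed = mutable.ArrayBuffer[Int]()
    val pool = Executors.newFixedThreadPool(4)
    logInfos.grouped(batchSize)
      .map { batch =>
        pool.submit(new Runnable {
          // Stand-in for mergeApplicationListing(batch)
          override def run(): Unit =
            processed.synchronized { processed += batch.size }
        })
      }
      .foreach { task =>
        // Wait for each task; propagate interrupts, swallow and log the rest
        // so one failed batch doesn't abort the whole pass.
        try task.get()
        catch {
          case e: InterruptedException => throw e
          case e: Exception => println(s"Error replaying logs: $e")
        }
      }
    pool.shutdown()
    pool.awaitTermination(10, TimeUnit.SECONDS)
    processed.synchronized { processed.toSeq.sorted }
  }
}
```

Note that `map` over the iterator returned by `grouped` is lazy, so each batch is submitted just before the previous one is awaited; that still guarantees all tasks are done before the method returns, which is the property the PR needs.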

Contributor Author

Thanks! I have updated the PR to add exception handling. I have modified the log message because errors while replaying event logs are already caught in the mergeApplicationListing method.

@vanzin
Contributor

vanzin commented Aug 13, 2015

In a way I think the underlying issue is more a problem with the aggressive default polling interval (10 seconds?). But this is a way to make it better.

I think in the (not so distant) future we should investigate using the recently added inotify-like API in HDFS, to see whether it helps us avoid polling altogether.

@SparkQA

SparkQA commented Aug 13, 2015

Test build #40782 has finished for PR 8153 at commit 249f4ef.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito
Contributor

squito commented Aug 13, 2015

Jenkins, retest this please

@SparkQA

SparkQA commented Aug 14, 2015

Test build #40805 timed out for PR 8153 at commit 249f4ef after a configured wait of 175m.

@SparkQA

SparkQA commented Aug 14, 2015

Test build #40857 has finished for PR 8153 at commit 3e22b6c.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito
Contributor

squito commented Aug 14, 2015

Jenkins, retest this please

@vanzin
Contributor

vanzin commented Aug 14, 2015

LGTM.

@SparkQA

SparkQA commented Aug 14, 2015

Test build #40900 timed out for PR 8153 at commit cd1ef90 after a configured wait of 175m.

@vanzin
Contributor

vanzin commented Aug 14, 2015

retest this please

@SparkQA

SparkQA commented Aug 15, 2015

Test build #40933 has finished for PR 8153 at commit cd1ef90.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rohitagarwal003
Contributor Author

Can we retest this please?
The failures are unrelated.

@SparkQA

SparkQA commented Aug 15, 2015

Test build #1625 timed out for PR 8153 at commit cd1ef90 after a configured wait of 175m.

@vanzin
Contributor

vanzin commented Aug 15, 2015

retest this please

@SparkQA

SparkQA commented Aug 16, 2015

Test build #40975 timed out for PR 8153 at commit cd1ef90 after a configured wait of 175m.

@vanzin
Contributor

vanzin commented Aug 16, 2015

As usual, flaky tests in other unrelated modules. I'll just give up on Jenkins and merge this Monday morning.

@vanzin
Contributor

vanzin commented Aug 17, 2015

Merged to master, thanks!

@asfgit asfgit closed this in ed092a0 Aug 17, 2015
tgravescs pushed a commit to tgravescs/spark that referenced this pull request Sep 10, 2015
…are already running.

Author: Rohit Agarwal <rohita@qubole.com>

Closes apache#8153 from mindprince/SPARK-9924.