
Conversation

@juliuszsompolski juliuszsompolski commented Jun 26, 2023

What changes were proposed in this pull request?

Add APIs from #41440 to SparkR:

  • addJobTag(tag)
  • removeJobTag(tag)
  • getJobTags()
  • clearJobTags()
  • cancelJobsWithTag()
  • setInterruptOnCancel(tag)

Additionally:

  • fix a bug in removeJobTag when the last tag is removed (the property should be left with no tags, not with a single empty-string tag)
  • fix comments to cancelJobsWithTag
  • add a few defensive checks against an empty-string tag resulting from a missing property or from removing the last tag.
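
The last-tag bug can be illustrated with a small stand-alone model (Python here for brevity; the names `TAGS_SEP`, `parse_tags`, and `remove_tag` are illustrative, not Spark's actual internals):

```python
# Toy model of Spark's job-tag storage: tags live in a single
# comma-separated local-property string, e.g. "etl,adhoc".
TAGS_SEP = ","  # stands in for SparkContext.SPARK_JOB_TAGS_SEP

def parse_tags(prop):
    """Parse the property into a set, defensively dropping empty strings."""
    if prop is None:
        return set()
    return {t for t in prop.split(TAGS_SEP) if t}

def remove_tag(prop, tag):
    """Remove a tag; return None (property cleared) when no tags remain."""
    new_tags = parse_tags(prop) - {tag}
    if not new_tags:
        # The buggy version effectively stored "" here, which round-trips
        # as a single empty-string tag instead of "no tags".
        return None
    return TAGS_SEP.join(sorted(new_tags))

print(parse_tags(remove_tag("a,b", "a")))  # {'b'}
print(parse_tags(remove_tag("a", "a")))    # set()
```

Removing the last tag must clear the property entirely, matching the `if (newTags.isEmpty) clearJobTags()` branch added in this PR.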

Why are the changes needed?

SparkR parity.

Does this PR introduce any user-facing change?

Yes, this introduces the APIs added in Scala in #41440 to SparkR.

How was this patch tested?

Added test.

@github-actions github-actions bot added the R label Jun 26, 2023
R/pkg/R/sparkR.R Outdated
Comment on lines 609 to 610
invisible(callJMethod(sc, "getJobTags"))
# TODO: how to return the Scala Set to R??
Contributor Author
@falaki could you help me figure out how getJobTags, which returns a Set[String] in Scala, could be made to return an R collection? Or could this API be omitted from R? (The main use case is to addJobTag and then cancelJobsWithTag; the use cases for getting job tags are limited.)

Contributor

It may be hard to return a perfect data structure from this method. Who is calling this? The caller can convert the data type from an Array to a Set.

@HyukjinKwon (Member) left a comment

They have to be listed in pkg/NAMESPACE and pkg/pkgdown/_pkgdown_template.yml I believe. Otherwise, it should work from a cursory look. (see also 07479b3)

@github-actions github-actions bot added the CORE label Jun 27, 2023
* Cancel active jobs that have the specified tag. See `org.apache.spark.SparkContext.addJobTag`.
*
* @param tag The tag to be added. Cannot contain ',' (comma) character.
* @param tag The tag to be cancelled. Cannot contain ',' (comma) character.
Contributor Author

piggybacking a fix for this typo that I noticed

Member

Mind fixing JavaSparkContext.cancelJobsWithTag too since we're here? I just copied and pasted it 😂

@juliuszsompolski
Contributor Author

> addJobTag("foo")
> getJobTags()
Java ref type scala.collection.convert.Wrappers$SetWrapper id 7

will need to investigate how to unwrap it to an R type...

@dongjoon-hyun (Member) left a comment

+1, LGTM.

@HyukjinKwon HyukjinKwon changed the title [SPARK-44195] Add JobTag APIs to SparkR SparkContext [SPARK-44195][R] Add JobTag APIs to SparkR SparkContext Jun 27, 2023
@juliuszsompolski
Contributor Author

CI finished with 1 unrelated flake:
https://github.com/juliuszsompolski/apache-spark/actions/runs/5392816496/jobs/9792933924

[info] - SPARK-38187: Run SparkPi Jobs with minMemory *** FAILED *** (3 minutes, 2 seconds)
[info]   The code passed to eventually never returned normally. Attempted 190 times over 3.015093976666667 minutes. Last failure message: 0 did not equal 2. (VolcanoTestsSuite.scala:331)
...
[info] *** 1 TEST FAILED ***
[error] Failed tests:
[error] 	org.apache.spark.deploy.k8s.integrationtest.VolcanoSuite

@juliuszsompolski (Contributor Author) left a comment

forgot to submit yesterday

Comment on lines +880 to +882
if (newTags.isEmpty) {
clearJobTags()
} else {
Contributor Author

the new R test found a bug, so I fixed it

Option(getLocalProperty(SparkContext.SPARK_JOB_TAGS))
.map(_.split(SparkContext.SPARK_JOB_TAGS_SEP).toSet)
.getOrElse(Set())
.filter(!_.isEmpty) // empty string tag should not happen, but be defensive
Contributor Author

... defensively fix the bug twice :-)
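
The Scala chain above can be transliterated into a small self-contained Python sketch to show the second layer of defense at read time (constant and function names here are illustrative, not Spark's API):

```python
SPARK_JOB_TAGS_SEP = ","  # illustrative; mirrors SparkContext.SPARK_JOB_TAGS_SEP

def get_job_tags(local_property):
    """Mirror the Scala chain:
    Option(...).map(_.split(SEP).toSet).getOrElse(Set()).filter(!_.isEmpty)
    """
    if local_property is None:                             # Option(...) is None
        return set()                                       # .getOrElse(Set())
    tags = set(local_property.split(SPARK_JOB_TAGS_SEP))   # .map(_.split(...).toSet)
    return {t for t in tags if t}                          # .filter(!_.isEmpty)

# Even if a buggy writer left "" or a trailing separator in the property,
# the getter still returns a clean set:
print(get_job_tags(None))   # set()
print(get_job_tags(""))     # set()
print(get_job_tags("a,"))   # {'a'}
```

So the fix is applied both where tags are written (clearing the property when the set becomes empty) and where they are read (filtering out empty strings), which is the "fix the bug twice" above.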

@HyukjinKwon
Member

Merged to master.


5 participants