
Conversation

@dongjoon-hyun
Member

@dongjoon-hyun dongjoon-hyun commented Jan 10, 2025

What changes were proposed in this pull request?

This PR aims to fix the `lint-scala` script so that it no longer ignores `scalastyle` errors.

Why are the changes needed?

This bug was introduced via the following PR at Apache Spark 3.4.0.
- #38258

After the above PR, `lint-scala` ignores `scalastyle` errors and only considers the exit code of `scalafmt`, as in the following CI result.

Screenshot 2025-01-10 at 07 14 31

Does this PR introduce any user-facing change?

No, this is a dev-only tool change.

How was this patch tested?

Manual review.

Was this patch authored or co-authored using generative AI tooling?

No.

@dongjoon-hyun
Member Author

cc @grundprinzip, @HyukjinKwon

dev/lint-scala Outdated
Contributor


This is a bit of a weird patch. The mvn command should actually keep checking `$?` rather than whether the error output is empty, in particular because it parses the output for "Unformatted files".

Now, the second issue is that `lint-scala` should not care about the code in

resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/ExecutorKubernetesCredentialsFeatureStepSuite.scala

The mvn command explicitly just checks for

    -pl sql/api \
    -pl sql/connect/common \
    -pl sql/connect/server \
    -pl connector/connect/client/jvm \

Contributor

@grundprinzip grundprinzip left a comment


Back when I did the first version of the patch, I got push-back on enforcing scalastyle on pull requests, so this script only enforces the Spark Connect related modules.

@dongjoon-hyun
Member Author

I'm not sure why you say that about `-e`.

This is a bit of a weird patch. The mvn command should actually keep checking `$?` rather than whether the error output is empty, in particular because it parses the output for "Unformatted files".

Now, the second issue is that `lint-scala` should not care about the code in

resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/ExecutorKubernetesCredentialsFeatureStepSuite.scala

The mvn command explicitly just checks for

    -pl sql/api \
    -pl sql/connect/common \
    -pl sql/connect/server \
    -pl connector/connect/client/jvm \

It's a standard way to stop a script if some command exits abnormally. Apache Spark uses it in many scripts. Are you aware of it, @grundprinzip ?

$ git grep 'set -e' | wc -l
      33
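A minimal demonstration of the `set -e` behavior under discussion (a generic sketch, not Spark's actual script): with `set -e`, the shell aborts at the first failing command and the script's exit code reflects that failure; without it, the failure is silently masked.

```shell
# Without set -e inside the child shell: the failing `false` is ignored,
# the child continues, and its exit code is that of the last command (0).
bash -c 'false; echo reached'         # prints "reached"
echo "exit=$?"                        # prints "exit=0"

# With set -e: the child shell aborts at `false`, `echo` never runs,
# and the child's exit code is 1.
bash -c 'set -e; false; echo reached' || echo "exit=$?"   # prints "exit=1"
```

This is exactly why `dev/lint-scala` with `set -e` can no longer swallow a failing `scalastyle` step.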

@grundprinzip
Contributor

I'm not sure why you say that about `-e`.

This is a bit of a weird patch. The mvn command should actually keep checking `$?` rather than whether the error output is empty, in particular because it parses the output for "Unformatted files".
Now, the second issue is that `lint-scala` should not care about the code in

resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/ExecutorKubernetesCredentialsFeatureStepSuite.scala

The mvn command explicitly just checks for

    -pl sql/api \
    -pl sql/connect/common \
    -pl sql/connect/server \
    -pl connector/connect/client/jvm \

It's a standard way to stop a script if some command exits abnormally. Apache Spark uses it in many scripts. Are you aware of it, @grundprinzip ?

$ git grep 'set -e' | wc -l
      33

Yes, using `-e` exits immediately when any bash command returns a non-zero exit code (and with `-u`, even accessing an undefined variable aborts). My point was that I'm not sure why the original command was changed from checking `$?` to grepping the output.

The main reason for checking the error output is that the script should be actionable, with a good error message so the user knows what to do to fix the error. Just failing the command does not help the user understand how to fix it.
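The pattern being argued for can be sketched like this (a hypothetical stand-in: `run_lint` substitutes for the real `mvn ... scalafmt` invocation, which can't run here, and the file name is illustrative): check the exit code, but also keep the captured output so the failure message tells the user what to fix.

```shell
# Hypothetical stand-in for the real mvn/scalafmt call.
run_lint() {
  printf 'Unformatted files:\n  Foo.scala\n'
  return 1
}

STATUS=0
OUTPUT=$(run_lint 2>&1) || STATUS=$?   # capture output AND preserve exit status

if [ "$STATUS" -ne 0 ] || printf '%s' "$OUTPUT" | grep -q 'Unformatted files'; then
  echo "Lint failed; run the formatter on the files below:"
  printf '%s\n' "$OUTPUT"
  LINT_RESULT=1   # a real script would `exit 1` here
else
  LINT_RESULT=0
fi
```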

Lastly, my question still remains: why did this come up for the ExecutorKubernetesCredentialsFeatureStepSuite.scala file, since it's not in scope for the scalastyle check?

@dongjoon-hyun
Member Author

Just failing the command does not help the user to understand how to fix it.

This sounds like a wrong claim to me. As I mentioned in the PR description, it fails with rich information like the following.

$ dev/lint-scala
Using SPARK_LOCAL_IP=localhost
Scalastyle checks failed at following occurrences:
[error] /Users/dongjoon/APACHE/spark-merge/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/ExecutorKubernetesCredentialsFeatureStepSuite.scala:21:0: There should be no empty line separating imports in the same group.
[error] /Users/dongjoon/APACHE/spark-merge/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/ExecutorKubernetesCredentialsFeatureStepSuite.scala:21:0: io.fabric8.kubernetes.api.model.PodSpec is in wrong order relative to org.scalatest.BeforeAndAfter.
[error] Total time: 16 s, completed Jan 10, 2025, 7:32:59 AM

$ echo $?
1

This also looks wrong to me. ExecutorKubernetesCredentialsFeatureStepSuite.scala is Scala code that is within the scope the scalastyle script should cover.

Lastly, my question still remains: why did this come up for the ExecutorKubernetesCredentialsFeatureStepSuite.scala file, since it's not in scope for the scalastyle check?

@dongjoon-hyun
Member Author

dongjoon-hyun commented Jan 10, 2025

Please note that `scalastyle` is the following script, which uses SBT.

spark/dev/scalastyle

Lines 20 to 29 in 9547a47

    SPARK_PROFILES=${1:-"-Pkubernetes -Pyarn -Pspark-ganglia-lgpl -Pkinesis-asl -Phive-thriftserver -Phive -Pvolcano -Pjvm-profiler -Phadoop-cloud"}
    # NOTE: echo "q" is needed because SBT prompts the user for input on encountering a build file
    # with failure (either resolution or compilation); the "q" makes SBT quit.
    ERRORS=$(echo -e "q\n" \
      | build/sbt \
        ${SPARK_PROFILES} \
        -Pdocker-integration-tests \
        -Pkubernetes-integration-tests \
        scalastyle test:scalastyle \
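The excerpt above stops mid-command; the pattern it implements can be sketched generically as follows (a simplified stand-in, since `build/sbt` can't run here and `fake_sbt` with its output lines is illustrative): capture the tool's output, filter for error lines, and fail only when any were found.

```shell
# Hypothetical stand-in for the `build/sbt ... scalastyle` invocation.
fake_sbt() {
  printf '[info] Compiling 3 sources\n'
  printf '[error] Foo.scala:21:0: There should be no empty line separating imports.\n'
}

# Keep only the error lines (dev/scalastyle filters its output similarly).
ERRORS=$(fake_sbt | awk '/error/ { print }')

if [ -n "$ERRORS" ]; then
  echo "Scalastyle checks failed at following occurrences:"
  printf '%s\n' "$ERRORS"
  SCALASTYLE_RESULT=1   # the real script exits 1 here
else
  echo "Scalastyle checks passed."
  SCALASTYLE_RESULT=0
fi
```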

@dongjoon-hyun
Member Author

Could you review this, @panbingkun ? CI started to fail at compilation due to this.

@dongjoon-hyun
Member Author

Also, cc @huaxingao and @viirya . When you have some time, could you review this PR?

@grundprinzip
Contributor

Hi @dongjoon-hyun , I was checking the history of the script and it seems it has never failed scalastyle for the past 5 years. See here for example: https://github.com/apache/spark/blob/f0b6245ea4cef0dc61d4277f1d6874ccf315e47a/dev/lint-scala

I'm absolutely in favor of enforcing scalastyle :)

I tested your PR to see if the error message for Spark Connect is still present. If the goal is to enforce Scalastyle, then I'm +1 on the PR.

@dongjoon-hyun
Member Author

Thank you, @grundprinzip . Yes, you are right. We simply didn't hit this corner case before.

@dongjoon-hyun
Member Author

Let me check my PR once more~

@dongjoon-hyun dongjoon-hyun marked this pull request as draft January 10, 2025 17:01
@dongjoon-hyun
Member Author

To recover the master branch, I merged @panbingkun's PR first because mine is still incomplete.

@dongjoon-hyun
Member Author

I'll revisit this PR later today.

@dongjoon-hyun dongjoon-hyun marked this pull request as ready for review January 10, 2025 17:37
@dongjoon-hyun
Member Author

This PR is ready for review again.

Screenshot 2025-01-10 at 09 55 02

Contributor

@cnauroth cnauroth left a comment


+1 (non-binding) from my perspective. I wonder if we'll find any other pre-existing violations that need a cleanup first.

Thanks for the patch, @dongjoon-hyun .

@dongjoon-hyun
Member Author

Thank you, @grundprinzip , @cnauroth , @viirya .

Merged to master.

There are no previous violations because the Spark SBT build technically enforces this unless we explicitly skip it via the env variable NOLINT_ON_COMPILE.
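A minimal sketch of the gating described above (NOLINT_ON_COMPILE is the real env variable; the surrounding function and echo are illustrative stand-ins for the SBT build logic):

```shell
# Illustrative only: a build step that skips linting when the
# NOLINT_ON_COMPILE environment variable is set.
check_mode() {
  if [ -z "${NOLINT_ON_COMPILE:-}" ]; then
    echo "enforced"   # default: scalastyle runs during compilation
  else
    echo "skipped"    # explicitly opted out
  fi
}

DEFAULT_MODE=$(check_mode)
SKIPPED_MODE=$(NOLINT_ON_COMPILE=1 check_mode)
echo "default: $DEFAULT_MODE, with NOLINT_ON_COMPILE: $SKIPPED_MODE"
```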

dongjoon-hyun added a commit that referenced this pull request Jan 10, 2025
### What changes were proposed in this pull request?

This PR aims to fix `lint-scala` script not to ignore `scalastyle` errors.

### Why are the changes needed?

This bug was introduced via the following PR at Apache Spark 3.4.0.
- #38258

After the above PR, `lint-scala` ignores `scalastyle` error and only considers the exit code of `scalafmt` like the following CI result.
- #49428 (comment)

![Screenshot 2025-01-10 at 07 14 31](https://github.com/user-attachments/assets/bdaa3be3-5daf-401b-a46f-7c02b7610158)

### Does this PR introduce _any_ user-facing change?

No, this is a dev-only tool change.

### How was this patch tested?

Manual review.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #49443 from dongjoon-hyun/SPARK-50784.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 9d4b7a5)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun dongjoon-hyun deleted the SPARK-50784 branch January 10, 2025 20:14