[SPARK-32249][INFRA][2.4] Run Github Actions builds in branch-2.4 #29465
86e2ccf to ea8c487
java:
  - 1.8
hadoop:
  - hadoop2.6
Hadoop 2.6 is default.
  - 1.8
hadoop:
  - hadoop2.6
# TODO(SPARK-32246): We don't test 'streaming-kinesis-asl' for now.
Hive 2.3 does not exist
a27f7d6 to 69f2509
streaming, sql-kafka-0-10, streaming-kafka-0-10,
mllib-local, mllib,
yarn, mesos, kubernetes, hadoop-cloud, spark-ganglia-lgpl,
streaming-flume, streaming-flume-sink, streaming-kafka-0-8
Modules: streaming-flume, streaming-flume-sink and streaming-kafka-0-8 were not removed at that time.
.github/workflows/master.yml
Outdated
with:
  python-version: pypy3
  architecture: x64
- name: Install Python 2.7
Python 2.7 is tested in branch-2.4.
- name: Run documentation build
  run: |
    cd docs
    jekyll build
JDK 11 is not supported in branch-2.4.
69f2509 to c043e0e
c043e0e to 4400c16
4400c16 to b57b290
if: contains(matrix.modules, 'pyspark') || (contains(matrix.modules, 'sql') && !contains(matrix.modules, 'sql-'))
run: |
  python2.7 -m pip install numpy pyarrow pandas scipy xmlrunner
  python2.7 -m pip list
In branch-2.4, Python 2 is installed last so that the default `python` is Python 2.
This is because most of the scripts use the `python` shebang, and those scripts did not support Python 3 at that time.
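The `if:` condition in the snippet above can be read as a small predicate over the matrix entry's module string. The following is only a sketch under stated assumptions: the function name `needs_python_packages` is hypothetical, and `contains` is modeled as a case-insensitive substring check on a string.

```python
def needs_python_packages(modules: str) -> bool:
    """Mirror the workflow condition that guards the Python package install step."""
    m = modules.lower()  # assumption: contains() ignores case
    runs_pyspark = "pyspark" in m
    # Plain 'sql' qualifies, but entries such as 'sql-kafka-0-10' do not.
    runs_plain_sql = "sql" in m and "sql-" not in m
    return runs_pyspark or runs_plain_sql


print(needs_python_packages("pyspark-sql, pyspark-mllib"))
print(needs_python_packages("streaming, sql-kafka-0-10, streaming-kafka-0-10"))
```

The second call shows why the `!contains(matrix.modules, 'sql-')` part exists: without it, a job that only tests `sql-kafka-0-10` would also install the Python packages.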
96331e6 to 7e6efd9
- >-
  sparkr
- >-
  sql
org.apache.spark.tags.ExtendedSQLTest does not exist in branch-2.4 (SPARK-29191), so SQL tests are not split in branch-2.4. They are split in branch-3.0 and master.
This PR aims to run the Spark tests in GitHub Actions.
To briefly explain the main idea:
- Reuse `dev/run-tests.py` with the SBT build
- Reuse the modules in `dev/sparktestsupport/modules.py` to test each module
- Pass the modules to test into `dev/run-tests.py` directly via the `TEST_ONLY_MODULES` environment variable. For example, `pyspark-sql,core,sql,hive`.
- `dev/run-tests.py` _does not_ take the dependent modules into account but solely the specified modules to test.
Another thing to note is the `SlowHiveTest` annotation. Running the tests in the Hive modules takes too long, so the slow tests are extracted and run as a separate job. The slow tests were identified from the actual elapsed time in Jenkins.
So, Hive tests are separated into two jobs. One is the slow test cases, and the other is the remaining test cases.
_Note that_ the current GitHub Actions build virtually copies what the default PR builder on Jenkins does (without other profiles such as JDK 11, Hadoop 2, etc.). The only exception is Kinesis https://github.com/apache/spark/pull/29057/files#diff-04eb107ee163a50b61281ca08f4e4c7bR23
Last week and onwards, the Jenkins machines became very unstable for many reasons:
- Apparently, the machines became extremely slow. Almost no tests can pass.
- One machine (worker 4) started to have a corrupt `.m2`, which fails the build.
- The documentation build fails from time to time for an unknown reason, on the Jenkins machines specifically. This is disabled for now at apache#29017.
- Almost all PRs are basically blocked by this instability currently.
The advantages of using GitHub Actions:
- To avoid depending on the few people who can access the cluster.
- To reduce the elapsed time in the build: we could split the tests (e.g., SQL, ML, CORE) and run them in parallel so the total build time is significantly reduced.
- To control the environment more flexibly.
- Other contributors can test and propose fixes to the GitHub Actions configuration, so we can distribute this build management cost.
Note that:
- The current build in Jenkins takes _more than 7 hours_. With GitHub Actions it takes _less than 2 hours_.
- We can now easily control the environments, especially for Python.
- The test and build look more stable than on Jenkins.
No, dev-only change.
Tested at #4
Closes apache#29057 from HyukjinKwon/migrate-to-github-actions.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
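The module selection described in this commit message can be sketched roughly as follows. This is an illustrative approximation, not the actual code in `dev/run-tests.py`: the function name `select_modules` and the `known` set are hypothetical, and only the `TEST_ONLY_MODULES` variable and the example value `pyspark-sql,core,sql,hive` come from the description.

```python
def select_modules(environ, known):
    """Return the modules named in TEST_ONLY_MODULES, in order.

    Dependent modules are deliberately not added: only the names that
    were explicitly requested are returned.
    """
    raw = environ.get("TEST_ONLY_MODULES", "")
    names = [n.strip() for n in raw.split(",") if n.strip()]
    unknown = [n for n in names if n not in known]
    if unknown:
        raise ValueError("unknown modules: " + ", ".join(unknown))
    return names


# Hypothetical subset of module names, for illustration only.
known = {"core", "sql", "hive", "pyspark-sql", "mllib"}
selected = select_modules({"TEST_ONLY_MODULES": "pyspark-sql,core,sql,hive"}, known)
print(selected)
```

In a real run the first argument would be `os.environ`; a plain dict is used here so the sketch does not depend on the caller's environment.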
This PR reenables GitHub Actions on every commit as a next step.
We carefully enabled GitHub Actions on every PR, and it looks good so far. As we saw at apache#29072, GitHub Actions is already triggered at every commit on every PR. Enabling GitHub Actions on `master` branch commits doesn't make a big difference, and we need to start testing at every commit as a next step.
No.
Manual.
Closes apache#29076 from dongjoon-hyun/reenable_gha_commit.
Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
…ub Actions
This PR mainly proposes to run only the relevant tests, just like the Jenkins PR builder does. Currently, GitHub Actions always runs the full tests, which wastes resources.
In addition, this PR also fixes 3 more closely related issues while I am here.
1. The main idea here is: it reuses the existing logic embedded in `dev/run-tests.py`, which the Jenkins PR builder uses, in order to run only the related test cases.
2. While I am here, I fixed SPARK-32292 too to run the doc tests. They failed because other references were not available when the repository is cloned via `checkout@v2`. With `fetch-depth: 0`, the history is available.
3. In addition, it fixes the `dev/run-tests.py` to match with `python/run-tests.py` in terms of its options. Environment variables such as `TEST_ONLY_XXX` were moved as proper options. For example,
```bash
dev/run-tests.py --modules sql,core
```
which is consistent with `python/run-tests.py`, for example,
```bash
python/run-tests.py --modules pyspark-core,pyspark-ml
```
4. Lastly, also fixed the formatting issue in module specification in the matrix:
```diff
- network_common, network_shuffle, repl, launcher
+ network-common, network-shuffle, repl, launcher,
```
The previous formatting caused the build and tests for these modules to run incorrectly.
By running only related tests, we can save a lot of resources and avoid unrelated flaky tests.
Also, the doctests of `dev/run-tests.py` now run properly, the usage of `dev/run-tests.py` and `python/run-tests.py` is similar, and the `network-common`, `network-shuffle`, `launcher` and `examples` modules are run too.
No, dev-only.
Manually tested in my own forked Spark:
#7
#8
#9
#10
#11
#12
Closes apache#29086 from HyukjinKwon/SPARK-32292.
Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
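To illustrate the `--modules` option and why the module spelling in the matrix matters, here is a minimal sketch using `argparse`. It is not the option handling in `dev/run-tests.py` (which may use a different parser); `parse_modules` and `KNOWN_MODULES` are hypothetical, and rejecting unknown names is a choice made for this sketch rather than behavior stated in the commit message.

```python
import argparse

# Hypothetical subset of module names, for illustration only.
KNOWN_MODULES = {"core", "sql", "repl", "launcher", "network-common", "network-shuffle"}


def parse_modules(argv):
    """Parse a comma-separated --modules value and validate each name."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--modules", default="", help="comma-separated module names")
    args = parser.parse_args(argv)
    names = [n.strip() for n in args.modules.split(",") if n.strip()]
    bad = [n for n in names if n not in KNOWN_MODULES]
    if bad:
        # parser.error prints a usage message and exits with status 2.
        parser.error("unrecognized module(s): " + ", ".join(bad))
    return names


print(parse_modules(["--modules", "sql,core"]))
```

With this kind of validation, a misspelled name such as `network_common` (underscore instead of hyphen) fails immediately instead of leading to the wrong set of modules being tested.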
…tions
This PR aims to test PySpark with Python 3.8 in GitHub Actions. On the script side, it is already ready:
https://github.com/apache/spark/blob/4ad9bfd53b84a6d2497668c73af6899bae14c187/python/run-tests.py#L161
This PR includes small related fixes together:
1. Install Python 3.8
2. Only install one Python implementation instead of installing many for SQL and Yarn test cases, because they need one Python executable in their test cases that is higher than Python 2.
3. Do not install Python 2, which is not needed anymore after we dropped Python 2 at SPARK-32138
4. Remove a comment about installing PyPy3 on Jenkins - SPARK-32278. It is already installed.
Currently, only PyPy3 and Python 3.6 are being tested with PySpark in GitHub Actions. We should test the latest version of Python as well because some optimizations can only be enabled with Python 3.8+. See also apache#29114
No, dev-only.
Was not tested. The GitHub Actions build in this PR will test it out.
Closes apache#29116 from HyukjinKwon/test-python3.8-togehter.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
This PR proposes to enable `crossPaths` back for now to match the build as it was. Based on my observation, JUnit tests are still nondeterministically not run, and this PR basically reverts the partial fix from apache#29057.
See also apache#29205 for the full context.
### Why are the changes needed?
To prevent the side effects from crossPaths such as SPARK_PREPEND_CLASSES or tests that run conditionally if the test classes are present in PySpark.
### Does this PR introduce _any_ user-facing change?
No, dev-only.
### How was this patch tested?
Manually tested:
```bash
build/sbt -Phadoop-2.7 -Phive -Phive-2.3 -Phive-thriftserver -DskipTests clean test:package
./python/run-tests --python-executable=python3 --testname="pyspark.sql.tests.test_dataframe QueryExecutionListenerTests"
```
Closes apache#29218 from HyukjinKwon/SPARK-32408-1.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
…llation in PIP test
Currently the Jenkins PIP packaging test fails intermittently as below:
```
Installing dist into virtual env
Processing ./python/dist/pyspark-3.1.0.dev0.tar.gz
Collecting py4j==0.10.9 (from pyspark==3.1.0.dev0)
Downloading https://files.pythonhosted.org/packages/9e/b6/6a4fb90cd235dc8e265a6a2067f2a2c99f0d91787f06aca4bcf7c23f3f80/py4j-0.10.9-py2.py3-none-any.whl (198kB)
Installing collected packages: py4j, pyspark
Found existing installation: py4j 0.10.9
Uninstalling py4j-0.10.9:
Successfully uninstalled py4j-0.10.9
Found existing installation: pyspark 3.1.0.dev0
Exception:
Traceback (most recent call last):
  File "/home/anaconda/envs/py36/lib/python3.6/site-packages/pip/_internal/cli/base_command.py", line 179, in main
    status = self.run(options, args)
  File "/home/anaconda/envs/py36/lib/python3.6/site-packages/pip/_internal/commands/install.py", line 393, in run
    use_user_site=options.use_user_site,
  File "/home/anaconda/envs/py36/lib/python3.6/site-packages/pip/_internal/req/__init__.py", line 50, in install_given_reqs
    auto_confirm=True
  File "/home/anaconda/envs/py36/lib/python3.6/site-packages/pip/_internal/req/req_install.py", line 816, in uninstall
    uninstalled_pathset = UninstallPathSet.from_dist(dist)
  File "/home/anaconda/envs/py36/lib/python3.6/site-packages/pip/_internal/req/req_uninstall.py", line 505, in from_dist
    '(at %s)' % (link_pointer, dist.project_name, dist.location)
AssertionError: Egg-link /home/jenkins/workspace/SparkPullRequestBuilder3/python does not match installed
```
- apache#29099 (comment) (amp-jenkins-worker-04)
- apache#29090 (comment) (amp-jenkins-worker-03)
It seems that the previous editable-mode installation affects other PRs. This PR simply works around it by removing the symbolic link left by the previous editable installation. This is a common workaround to my knowledge.
To recover the Jenkins build.
No, dev-only.
Jenkins build will test it out.
Closes apache#29102 from HyukjinKwon/SPARK-32303.
Lead-authored-by: HyukjinKwon <gurwls223@apache.org>
Co-authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
…Jenkins
This PR proposes:
- Don't use `--user` in pip packaging test
- Pull `source` out of the subshell, and place it first.
- Exclude user sitepackages in Python path during pip installation test
to address the flakiness of the pip packaging test in Jenkins.
(I think) apache#29116 caused this flakiness, given my observation in the Jenkins log. I had to work around it by specifying `--user`, but it turned out that it does not work properly in the old Conda on Jenkins for some reason. Therefore, this PR reverts that change.
(I think) the installation at user site-packages affects other environments created by Conda in the old Conda version that Jenkins has. It seems to fail to isolate the environments for some reason. So, this PR excludes user sitepackages from the Python path during the test.
In addition, apache#29116 also added some fallback logic between `conda (de)activate` and `source (de)activate`, because Conda now prefers `conda (de)activate` per the official documentation and `source (de)activate` doesn't work for some reason in certain environments (see also conda/conda#7980). The problem was that `source` loads things into the shell it runs in, so when it runs inside a subshell it does not affect the current shell. Therefore, this PR pulls `source` out of the subshell.
Disclaimer: I made the analysis purely based on the Jenkins machine's log in this PR. There may be a different cause that I missed during my observation.
To make the build and tests pass in Jenkins.
No, dev-only.
Jenkins tests should test it out.
Closes apache#29117 from HyukjinKwon/debug-conda.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
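The subshell behavior this commit message relies on can be demonstrated with a small sketch. It assumes a POSIX `sh` is available on the machine; the variable `SPARK_DEMO_MARKER` is a made-up stand-in for whatever state `source activate` would modify, and the helper `run_sh` is hypothetical.

```python
import subprocess


def run_sh(script: str) -> str:
    """Run a snippet with the POSIX shell and return its trimmed stdout."""
    result = subprocess.run(
        ["sh", "-c", script], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# The assignment inside parentheses happens in a subshell and is lost afterwards.
in_subshell = run_sh('(SPARK_DEMO_MARKER=activated); echo "${SPARK_DEMO_MARKER:-unset}"')
# The same assignment made in the current shell is visible to the next command.
in_current = run_sh('SPARK_DEMO_MARKER=activated; echo "${SPARK_DEMO_MARKER:-unset}"')
print(in_subshell, in_current)
```

This is the same reason an environment activated inside `( ... )` is not active for the commands that follow the closing parenthesis.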
…ctivation in pip packaging test
### What changes were proposed in this pull request?
This PR proposes to avoid using a subshell when activating the Conda environment. It looks like the environment ends up being activated only within the subshell, even if you use the `conda` command.
### Why are the changes needed?
If you take a close look at the GitHub Actions log:
```
Installing dist into virtual env
Processing ./python/dist/pyspark-3.1.0.dev0.tar.gz
Collecting py4j==0.10.9
Downloading py4j-0.10.9-py2.py3-none-any.whl (198 kB)
Using legacy setup.py install for pyspark, since package 'wheel' is not installed.
Installing collected packages: py4j, pyspark
Running setup.py install for pyspark: started
Running setup.py install for pyspark: finished with status 'done'
Successfully installed py4j-0.10.9 pyspark-3.1.0.dev0
...
Installing dist into virtual env
Obtaining file:///home/runner/work/spark/spark/python
Collecting py4j==0.10.9
Downloading py4j-0.10.9-py2.py3-none-any.whl (198 kB)
Installing collected packages: py4j, pyspark
Attempting uninstall: py4j
Found existing installation: py4j 0.10.9
Uninstalling py4j-0.10.9:
Successfully uninstalled py4j-0.10.9
Attempting uninstall: pyspark
Found existing installation: pyspark 3.1.0.dev0
Uninstalling pyspark-3.1.0.dev0:
Successfully uninstalled pyspark-3.1.0.dev0
Running setup.py develop for pyspark
Successfully installed py4j-0.10.9 pyspark
```
It looks like Conda is not being used properly, as the previously installed package is removed when it is installed again. We should ideally test it with the Conda environment as intended.
### Does this PR introduce _any_ user-facing change?
No, dev-only.
### How was this patch tested?
GitHub Actions will test. I also tested it manually on my local machine.
Closes apache#29212 from HyukjinKwon/SPARK-32419.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
…ng script
This PR proposes to skip the SparkR installation, which exists to run the R linters (see SPARK-8505), in the test-only mode of the `dev/run-tests.py` script.
As of SPARK-32292, the test-only mode in `dev/run-tests.py` was introduced, for example:
```
dev/run-tests.py --modules sql,core
```
which only runs the relevant tests and does not run other tests such as linters. Therefore, we don't need to install SparkR when `--modules` is specified.
The GitHub Actions build is currently failing as below:
```
ERROR: this R is version 3.4.4, package 'SparkR' requires R >= 3.5
[error] running /home/runner/work/spark/spark/R/install-dev.sh ; received return code 1
```
For some reason, it looks like GitHub Actions started to have R 3.4.4 installed by default; however, R 3.4 was dropped as of SPARK-32073. When SparkR tests are not needed, GitHub Actions still builds SparkR with a low R version, which causes the test failure. This PR partially fixes it by avoiding the installation of SparkR.
No, dev-only.
GitHub Actions tests should run to confirm this fix is correct.
Closes apache#29300 from HyukjinKwon/install-r.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
…itHub Actions
This PR proposes to manually install R instead of using `setup-r`, which seems broken. Currently, GitHub Actions uses the R 3.4.4 that is installed by default, which we dropped as of SPARK-32073.
While I am here, I am also upgrading the R version to 4.0. Jenkins will test the old version and GitHub Actions tests the new version. AppVeyor uses R 4.0 but it does not check CRAN, which is important when we make a release.
To recover the GitHub Actions build.
No, dev-only
Manually tested at #15
Closes apache#29302 from HyukjinKwon/SPARK-32493.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
…ting
### What changes were proposed in this pull request?
apache#26556 excluded `.github/workflows/master.yml`, so tests are skipped if the GitHub Actions configuration file is changed.
As of SPARK-32245, we now run the regular tests via the testing script. We should include the file in testing to make sure the GitHub Actions build does not break due to changes such as Python versions.
### Why are the changes needed?
For better test coverage in GitHub Actions build.
### Does this PR introduce _any_ user-facing change?
No, dev-only.
### How was this patch tested?
GitHub Actions in this PR will test.
Closes apache#29305 from HyukjinKwon/SPARK-32496.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
…ctions
### What changes were proposed in this pull request?
The CRAN check fails due to the size of the generated PDF docs as below:
```
... WARNING
‘qpdf’ is needed for checks on size reduction of PDFs
...
Status: 1 WARNING, 1 NOTE
See ‘/home/runner/work/spark/spark/R/SparkR.Rcheck/00check.log’ for details.
```
This PR proposes to install `qpdf` in GitHub Actions.
Note that I cannot reproduce this locally with the same R version, so I am not documenting it for now.
Also, while I am here, I piggyback a change to install SparkR when the module includes `sparkr`. It is rather a followup of SPARK-32491.
### Why are the changes needed?
To fix SparkR CRAN check failure.
### Does this PR introduce _any_ user-facing change?
No, dev-only.
### How was this patch tested?
GitHub Actions will test it out.
Closes apache#29306 from HyukjinKwon/SPARK-32497.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
…Hub Actions
This PR proposes to report the failed and succeeded tests in GitHub Actions in order to improve the development velocity by leveraging [ScaCap/action-surefire-report](https://github.com/ScaCap/action-surefire-report).
Note that we cannot just use [ScaCap/action-surefire-report](https://github.com/ScaCap/action-surefire-report) in Apache Spark because PRs come from forked repositories, and GitHub secrets are unavailable there for security reasons. This plugin and all similar plugins require a GitHub token with write access in order to post test results, but it is unavailable in PRs.
To work around this limitation, I took this approach:
1. In workflow A, run the tests and upload the JUnit XML test results. GitHub provides a way to upload and download files.
2. GitHub introduced a new event type, [`workflow_run`](https://github.blog/2020-08-03-github-actions-improvements-for-fork-and-pull-request-workflows/), 10 days ago. By leveraging this, workflow A triggers another workflow B.
3. Workflow B is in the main repo instead of the fork repo, and has the write access the plugin needs. In workflow B, it downloads the artifact uploaded from workflow A (from the forked repository).
4. Workflow B generates the test reports to post from the JUnit XML files.
5. Workflow B looks up the PR and posts the test reports.
The `workflow_run` event is a very new feature, and it looks like not many GitHub Actions plugins support it. In order to make this work with [ScaCap/action-surefire-report](https://github.com/ScaCap/action-surefire-report), I had to fork two GitHub Actions plugins:
- [ScaCap/action-surefire-report](https://github.com/ScaCap/action-surefire-report) to have this custom fix: HyukjinKwon/action-surefire-report@c96094c
  It added a `commit` argument to specify the commit to post the test reports to. With `workflow_run`, workflow B can access the commit from workflow A.
- [dawidd6/action-download-artifact](https://github.com/dawidd6/action-download-artifact) to have this custom fix: HyukjinKwon/action-download-artifact@750b71a
  It added support for downloading all artifacts from workflow A in workflow B. By default, it only supports specifying the name of a single artifact.
Note that I was not able to use the official [actions/download-artifact](https://github.com/actions/download-artifact) because:
- It does not support downloading artifacts between different workflows, see also actions/download-artifact#3. Once this issue is resolved, we can switch back to [actions/download-artifact](https://github.com/actions/download-artifact).
I plan to make a pull request to both repositories so we don't have to rely on forks.
Currently, it's difficult to check the failed tests. You have to scroll through long GitHub Actions logs.
No, dev-only.
Manually tested at: #17, #18, #19, #20, and master branch of my forked repository.
Closes apache#29333 from HyukjinKwon/SPARK-32357-fix.
Lead-authored-by: Hyukjin Kwon <gurwls223@apache.org>
Co-authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
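As a rough illustration of the report-generation step in workflow B, the sketch below counts passed, failed, and skipped test cases in a JUnit XML document. It is not how `action-surefire-report` works internally; the function `summarize_junit` is hypothetical and the sample XML is made up for the example.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text: str):
    """Count passed, failed, and skipped test cases in one JUnit XML document."""
    root = ET.fromstring(xml_text)
    passed, failed, skipped = 0, [], 0
    # iter() also matches nested <testcase> elements under <testsuites>.
    for case in root.iter("testcase"):
        name = f'{case.get("classname")}.{case.get("name")}'
        if case.find("failure") is not None or case.find("error") is not None:
            failed.append(name)
        elif case.find("skipped") is not None:
            skipped += 1
        else:
            passed += 1
    return {"passed": passed, "failed": failed, "skipped": skipped}


# Made-up sample input, for illustration only.
SAMPLE = """<testsuite name="ExampleSuite" tests="3">
  <testcase classname="ExampleSuite" name="ok"/>
  <testcase classname="ExampleSuite" name="broken"><failure message="boom"/></testcase>
  <testcase classname="ExampleSuite" name="ignored"><skipped/></testcase>
</testsuite>"""

summary = summarize_junit(SAMPLE)
print(summary)
```

A summary like this is what makes it possible to see the failed tests directly instead of scrolling through the raw logs.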
…-report and action-download-artifact in test_report.yml
This PR proposes to remove the usage of my own forks and use the original plugins in the GitHub Actions test reporting.
SPARK-32357 introduced the GitHub Actions test reporting by leveraging two plugins:
- [ScaCap/action-surefire-report](https://github.com/ScaCap/action-surefire-report)
- [dawidd6/action-download-artifact](https://github.com/dawidd6/action-download-artifact)
In order to make it work, it had to fork the two repositories with custom fixes:
- HyukjinKwon/action-surefire-report@c96094c
- HyukjinKwon/action-download-artifact@f86c565
The two custom fixes were thankfully merged at ScaCap/action-surefire-report#14 and dawidd6/action-download-artifact#24, and new versions were released at [ScaCap/action-surefire-report/commits/v1](https://github.com/ScaCap/action-surefire-report/commits/v1) and [dawidd6/action-download-artifact/commits/v2](https://github.com/dawidd6/action-download-artifact/commits/v2) - thanks jmisur and dawidd6 again.
To avoid relying on forks and code duplications.
No, dev-only.
Logically there is no diff. I tested it at https://github.com/HyukjinKwon/spark/runs/992824229 to be doubly sure.
NOTE that this PR cannot be tested here within the workflow triggered by this PR without merging the changes in `test_report.yml` into the master.
Closes apache#29449 from HyukjinKwon/SPARK-32606-SPARK-32605.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
This PR proposes to upload `target/unit-tests.log` into the artifact so that it can be downloaded.
### Why are the changes needed?
Jenkins has this feature. It would be best to have the same dev functionality as Jenkins. Also, note that this was pointed out at apache#29225 (comment).
### Does this PR introduce _any_ user-facing change?
No, dev-only
### How was this patch tested?
https://github.com/apache/spark/actions/runs/213000777 should demonstrate it
Closes apache#29454 from HyukjinKwon/SPARK-32645.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
This PR renames `master.yml` to `build_and_test.yml` to indicate this is the workflow that builds and runs the tests.
### Why are the changes needed?
Just for readability. `master.yml` looks like the name of the branch (to me).
### Does this PR introduce _any_ user-facing change?
No, dev-only.
### How was this patch tested?
GitHub Actions build in this PR will test it out.
Closes apache#29459 from HyukjinKwon/minor-rename.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
1393174 to 84846a8
It should be ready to be reviewed or merged now.
- name: Install Python packages (Python 2.7)
  if: contains(matrix.modules, 'pyspark') || (contains(matrix.modules, 'sql') && !contains(matrix.modules, 'sql-'))
  run: |
    # Some tests do not pass in PySpark with PyArrow, for example, pyspark.sql.tests.ArrowTests.
Some tests do not pass in PySpark with PyArrow, for example, pyspark.sql.tests.ArrowTests with Python 2.
It's not GitHub Actions specific. Jenkins does not test Python 2 with PyArrow, and I can reproduce it locally as well.
Test build #127638 has finished for PR 29465 at commit
The test failure in PySpark ML will be addressed at #29481. @ScrapCodes and @dongjoon-hyun, can you review this and #29481? I think it's ready to go.
dongjoon-hyun
left a comment
Sure. +1, LGTM. Merged to branch-2.4
### What changes were proposed in this pull request?
This PR proposes to backport the following JIRAs:
- SPARK-32245
- SPARK-32292
- SPARK-32252
- SPARK-32408
- SPARK-32303
- SPARK-32363
- SPARK-32419
- SPARK-32491
- SPARK-32493
- SPARK-32496
- SPARK-32497
- SPARK-32357
- SPARK-32606
- SPARK-32605
- SPARK-32645
- Minor renaming d0dfe49#diff-02d9c370a663741451423342d5869b21
in order to enable GitHub Actions in branch-2.4.
### Why are the changes needed?
To be able to run the tests in branch-2.4. Jenkins jobs are unstable.
### Does this PR introduce _any_ user-facing change?
No, dev-only.
### How was this patch tested?
Build in this PR will test.
Closes #29465 from HyukjinKwon/SPARK-32249-2.4.
Lead-authored-by: HyukjinKwon <gurwls223@apache.org>
Co-authored-by: Hyukjin Kwon <gurwls223@apache.org>
Co-authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Thank you @dongjoon-hyun!
What changes were proposed in this pull request?
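This PR proposes to backport the following JIRAs:
- SPARK-32245
- SPARK-32292
- SPARK-32252
- SPARK-32408
- SPARK-32303
- SPARK-32363
- SPARK-32419
- SPARK-32491
- SPARK-32493
- SPARK-32496
- SPARK-32497
- SPARK-32357
- SPARK-32606
- SPARK-32605
- SPARK-32645
- Minor renaming d0dfe49#diff-02d9c370a663741451423342d5869b21
in order to enable GitHub Actions in branch-2.4.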
Why are the changes needed?
To be able to run the tests in branch-2.4. Jenkins jobs are unstable.
Does this PR introduce any user-facing change?
No, dev-only.
How was this patch tested?
Build in this PR will test.