fix 'can not find /tmp/xxx/yyy.tar.gz' error when use spark cluster mode #3111

Merged: 3 commits, Oct 1, 2021

Conversation

Chopinxb
Contributor

@Chopinxb Chopinxb commented Sep 27, 2021

When Spark uses cluster deploy-mode, the run_path is created on the submitting host instead of the host where the driver is located. This causes the error below:
Traceback (most recent call last):
  File "pyspark_runner.py", line 143, in <module>
    _get_runner_class()(*sys.argv[1:]).run()
  File "pyspark_runner.py", line 119, in run
    self.job.setup_remote(sc)
  File "/opt/tiger/ss_lib/python_package/lib/python2.7/site-packages/luigi/contrib/spark.py", line 307, in setup_remote
    self._setup_packages(sc)
  File "/opt/tiger/ss_lib/python_package/lib/python2.7/site-packages/luigi/contrib/spark.py", line 364, in _setup_packages
    tar = tarfile.open(tar_path, "w:gz")
  File "/usr/lib/python2.7/tarfile.py", line 1693, in open
    return func(name, filemode, fileobj, **kwargs)
  File "/usr/lib/python2.7/tarfile.py", line 1740, in gzopen
    fileobj = gzip.GzipFile(name, mode, compresslevel, fileobj)
  File "/usr/lib/python2.7/gzip.py", line 94, in __init__
    fileobj = self.myfileobj = __builtin__.open(filename, mode or 'rb')
IOError: [Errno 2] No such file or directory: '/tmp/xxxYcUXC/yyy.tar.gz'

Description

In this PR, we create the parent directory on the host where the driver is located before compressing and uploading packages.
The driver is the process that runs the _setup_packages function; it does not run the run function of class SparkSubmitTask, so a directory created there on the submitting host is not available to it.
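
For illustration, the idea can be sketched roughly as below. This is a minimal standalone sketch, not the actual diff from this PR: the helper names `ensure_parent_dir` and `open_package_tar` are hypothetical, and the only assumption taken from the traceback is that the failing call is `tarfile.open(tar_path, "w:gz")`. It avoids `exist_ok` so that it also works on Python 2.7, which the traceback shows in use.

```python
import errno
import os
import tarfile


def ensure_parent_dir(path):
    """Create the parent directory of `path` if it does not exist yet.

    Hypothetical helper: on the driver host in cluster deploy-mode the
    temporary run directory may be missing, so it is created here.
    """
    parent = os.path.dirname(path)
    if not parent:
        return  # path is relative to the current directory; nothing to create
    try:
        os.makedirs(parent)
    except OSError as exc:
        # Ignore "already exists" for a directory; re-raise anything else.
        if exc.errno != errno.EEXIST or not os.path.isdir(parent):
            raise


def open_package_tar(tar_path):
    """Open a gzip-compressed tar archive for writing.

    Hypothetical wrapper: the parent directory is created first so that
    tarfile.open does not fail with "No such file or directory".
    """
    ensure_parent_dir(tar_path)
    return tarfile.open(tar_path, "w:gz")
```

Calling `open_package_tar` with a path such as `/tmp/xxx/yyy.tar.gz` on a host where `/tmp/xxx` does not exist would then create the directory first instead of raising `IOError`.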

When Spark uses cluster deploy-mode, the run_path is created on the submitting host instead of the host where the driver is located
Fix 'can not find /tmp/xxx/yyy.tar.gz'
@Chopinxb Chopinxb requested review from dlstadther, Tarrasch and a team as code owners September 27, 2021 02:36
@dlstadther dlstadther merged commit 0c5c53b into spotify:master Oct 1, 2021