[jvm-packages] XGBoost4J-Spark 2.0.0-RC1 fails for Spark 3.4.0 on EMR #9512
Comments
Can you install XGBoost4J-Spark from Maven Central? Locally building JARs is more complex, as you might have issues with bundling the native library (libxgboost4j.so).
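For reference, pulling the packages from Maven Central in a Maven project might look like the following (artifact coordinates for the Scala 2.12 build; 2.0.0-RC1 is the version from this issue's title, so adjust to whatever release you need):

```xml
<!-- xgboost4j-spark transitively pulls in xgboost4j,
     including the bundled native libxgboost4j.so -->
<dependency>
  <groupId>ml.dmlc</groupId>
  <artifactId>xgboost4j-spark_2.12</artifactId>
  <version>2.0.0-RC1</version>
</dependency>
```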
I assume you mean building the packages from source? I tried that on the master node of the EMR cluster, but ran into errors. I ran the following steps. Installing Maven:
Cloning the repo, switching to 1.7.6, and then packaging it up (following the steps in the tutorial):
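The exact commands were lost above; a typical sequence for this kind of source build would look roughly like the following (a hedged sketch, environment-dependent: package names are for Amazon Linux, and the tag name assumes the standard v-prefixed release tags):

```shell
# install build prerequisites (package names may differ per distro)
sudo yum install -y git maven cmake

# clone, check out the 1.7.6 release tag, and build the JVM packages
git clone --recursive https://github.com/dmlc/xgboost
cd xgboost
git checkout v1.7.6
git submodule update --init --recursive
cd jvm-packages
mvn -DskipTests package
```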
This ended up in the following error:
which seems to be caused by this:
Similarly I tried to install it via running
I feel like I am well off the beaten path here and am probably missing something quite obvious...
Do you have a working Python 3 installation? I didn't realize you have to build from source when using EMR. Do you need an uber-JAR where all dependencies are included? I found it hard to build such a JAR.
Yes, I do have a Python 3 installation, but it seems this error is caused by a Python 2 invocation of the create_jni.py file. Invoking
I just want something that reliably works in production; building these uber-JARs hasn't failed me so far.
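To confirm which interpreter a bare python resolves to (on older EMR images it often points at Python 2, which is what create_jni.py chokes on), a quick check like this can help; note that python may not exist at all on some systems:

```shell
# Show which interpreters are on PATH and their versions.
# On older EMR AMIs, `python` frequently resolves to Python 2.
command -v python && python --version
command -v python3 && python3 --version
# Script-friendly check of the python3 major version:
python3 -c 'import sys; print(sys.version_info.major)'
```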
@nawidsayed, I guess you can hack the Python path from here: https://github.com/dmlc/xgboost/blob/master/jvm-packages/xgboost4j/pom.xml#L88
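The pom entry being referenced configures exec-maven-plugin to run create_jni.py during the build; the "hack" is to point its executable at a Python 3 interpreter. A paraphrased sketch (element details and line numbers may differ from the actual pom at the linked location, so check it before editing):

```xml
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals><goal>exec</goal></goals>
      <configuration>
        <!-- change "python" to "python3" (or an absolute path)
             so create_jni.py is not run with Python 2 -->
        <executable>python3</executable>
        <arguments>
          <argument>create_jni.py</argument>
        </arguments>
      </configuration>
    </execution>
  </executions>
</plugin>
```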
Thanks for your help so far, everybody! I noticed that I am running on EMR Graviton 2 processors (
I haven't found a remedy for this. Almost all references to this error involve the GPU implementation, which is not the case for me here. It's confusing because if I check
So it seems it's related to Spark & XGBoost versioning. Using Spark 3.4.0 on Scala 2.12 and XGBoost packages at version 1.7.6, I get the aforementioned error, which is probably related to the Rabit tracker. Stdout prints
However, I don't have any issues when running Spark 2.4.8 on Scala 2.11 with XGBoost4J and XGBoost4J-Spark version 1.1.2. In that case, just before the training routine, stdout read:
Is there any way to make it work properly with Spark 3.4?
Thanks for pointing this out. Unfortunately, adding the library according to the instructions here fails in the following way when running
Even when manually adding the 2.0.0-RC1 packages to the JAR, we run into the Rabit tracker error:
Even after this error, the executors still commence training, according to their logs:
I think we should prioritize the refactoring of the tracker; otherwise, JVM-related issues are quite difficult to resolve.
Is it possible the tracker is also running with Python 2?
I don't know, isn't it written in C? The default
If it helps, I could write out a minimal example that leads to the aforementioned success and failure respectively.
I bumped into the exact same generic error reported by the OP, using a very similar setup (EMR 6.5.0, Spark 3.1.2). Even though I am using Scala Spark, there is a Python dependency through RabitTracker, which requires Python >= 3.8. But EMR 6.5.0 provides Python 3.7. Setting up a virtual environment that allows the cluster to use a higher Python version solved the problem for me.
Coming back again, since the solution I suggested in my Nov 28 post didn't seem to work out on a second attempt. For me it was important to activate the virtual environment with the right Python version before starting my spark-shell session on the master node. So on the master node I would run source pyspark_venv_python_3.9.9/bin/activate and then I would launch my spark-shell session with:
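The sequence described above can be sketched as follows. Paths and versions are illustrative, and the spark-shell invocation is shown as a comment since it only makes sense on an EMR master node with the XGBoost JARs present:

```shell
# Create a virtual environment backed by a Python >= 3.8 interpreter
# (the tracker in XGBoost4J 1.x needs it), then activate it so child
# processes launched from this shell inherit the venv's Python.
python3 -m venv /tmp/pyspark_venv
source /tmp/pyspark_venv/bin/activate
python --version   # should now report the venv's Python 3
# From this same shell, launch spark-shell, e.g.:
# spark-shell --jars xgboost4j_2.12-1.7.6.jar,xgboost4j-spark_2.12-1.7.6.jar
```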
Only then is the tracker able to start with a sensible environment:
If I am not in the virtual environment before launching the shell, the tracker fails.
That's caused by the Python dependency. We have removed the use of Python in the master branch. |
Thanks @trivialfis. I am bound to use version 1.7.3, but it's great to hear the Python dependency has been removed in recent versions. It was really a pain to deal with.
Hello everybody,
I am trying to use XGBoost4J-Spark in a Scala project. Everything works fine locally (on an Intel MacBook); however, when deploying to EMR (running EMR 6.12.0 and Spark 3.4.0 with Scala 2.12.17), I receive the following error:
For my build.sbt, I added the following lines to libraryDependencies, as suggested by the tutorial (running with sbt 1.9.2):
I packaged everything up into a single JAR via the sbt-assembly plugin. I believe this packs into the JAR all the dependencies needed to run the Spark application on EMR, so I am really out of ideas about this error. Not sure if this is a mistake on my end or an actual bug. Help is appreciated!
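The exact libraryDependencies lines were lost above; the tutorial's dependency block typically looks something like this in build.sbt (2.0.0-RC1 is the version from the issue title, so substitute your target version):

```scala
// XGBoost4J-Spark pulls in xgboost4j transitively, but the tutorial
// lists both explicitly; %% appends the Scala binary version (2.12 here)
libraryDependencies ++= Seq(
  "ml.dmlc" %% "xgboost4j" % "2.0.0-RC1",
  "ml.dmlc" %% "xgboost4j-spark" % "2.0.0-RC1"
)
```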