Hi, I'm looking to implement GPU-based distributed AI processing in Spark using the RAPIDS plugin. I've been writing code that builds deep learning models such as CNNs, LSTMs, and RNNs for classification directly in TensorFlow or PyTorch. I now want to pass 3-channel color images through a CNN model for classification using RAPIDS. Can I do this with spark-rapids-ml? If not, would it be acceptable to import TensorFlow and Keras directly into my Spark code?
@leehaoun spark-rapids-ml is targeted at GPU acceleration of the traditional ML algorithms found in Spark MLlib, so it doesn't cover deep-learning models like CNNs.
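For context, here is a minimal sketch of the drop-in pattern spark-rapids-ml uses for those MLlib algorithms, using KMeans as an example; the input path and column name are hypothetical, and essentially only the import line differs from plain pyspark.ml:

```python
# Minimal sketch (assumptions: spark-rapids-ml installed, GPUs available on
# executors, and a Parquet table with an array/vector column named "features").
from pyspark.sql import SparkSession
from spark_rapids_ml.clustering import KMeans  # instead of pyspark.ml.clustering

spark = SparkSession.builder.appName("rapids-ml-demo").getOrCreate()
df = spark.read.parquet("/path/to/features")   # hypothetical feature table

kmeans = KMeans(k=8).setFeaturesCol("features")
model = kmeans.fit(df)          # the fit runs on GPUs via cuML under the hood
model.transform(df).show()
```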
For distributed inference using models trained in deep-learning frameworks like TensorFlow and PyTorch, you will pretty much need to import the model inference code directly into your Spark code. We have examples showing how to do this.
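As a rough illustration (not taken verbatim from those examples), the pattern often looks like the sketch below. It assumes Spark 3.4+ for pyspark.ml.functions.predict_batch_udf, a Keras CNN saved at a hypothetical path, and images stored as flattened 3-channel float arrays in an "image" column:

```python
# Minimal sketch of distributed CNN inference on Spark executors.
from pyspark.sql import SparkSession
from pyspark.sql.types import ArrayType, FloatType
from pyspark.ml.functions import predict_batch_udf

spark = SparkSession.builder.appName("cnn-inference").getOrCreate()
df = spark.read.parquet("/path/to/image_features")  # hypothetical input path

def make_predict_fn():
    # Called once per Python worker, so the model is loaded once, not per row.
    import tensorflow as tf
    model = tf.keras.models.load_model("/path/to/cnn_model")  # hypothetical path

    def predict(inputs):
        # inputs: numpy array of shape (batch, 32*32*3) per input_tensor_shapes
        return model.predict(inputs.reshape(-1, 32, 32, 3))

    return predict

classify = predict_batch_udf(
    make_predict_fn,
    return_type=ArrayType(FloatType()),   # per-class probabilities
    batch_size=64,
    input_tensor_shapes=[[32 * 32 * 3]],  # flattened 32x32 RGB images (assumed)
)

df.withColumn("probs", classify("image")).write.parquet("/path/to/predictions")
```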
If you are trying to do distributed training on Spark, you will need to use the distributed training APIs of the respective DL frameworks combined with Spark integrations like Horovod-on-Spark, the TensorFlow distributor, or the PyTorch distributor.
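For the PyTorch route, a minimal sketch looks like this, assuming Spark 3.4+ for pyspark.ml.torch.distributor.TorchDistributor; the train() function, its hyperparameters, and the process count are hypothetical placeholders:

```python
# Minimal sketch of launching distributed PyTorch training from Spark.
from pyspark.ml.torch.distributor import TorchDistributor

def train(learning_rate, epochs):
    import torch
    import torch.distributed as dist

    # TorchDistributor sets the env vars that init_process_group reads.
    dist.init_process_group(backend="nccl")
    # ... build the CNN, wrap it in DistributedDataParallel, run the
    # training loop, save a checkpoint, return a result to the driver ...
    dist.destroy_process_group()
    return "done"

distributor = TorchDistributor(num_processes=2, local_mode=False, use_gpu=True)
result = distributor.run(train, 1e-3, 10)
```

The "nccl" backend and use_gpu=True assume GPU-enabled executors; Horovod-on-Spark and the TensorFlow distributor follow the same launch-a-training-function pattern with their own APIs.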