In the talk "RayDP: Build Large-scale End-to-end Data Analytics and AI Pipelines Using Spark and Ray" https://youtu.be/ELSrR1Geqg4?t=819, @carsonwang mentioned that RayDP would have better performance.
We are curious which types of queries / workflows you ran and what your analysis of the performance differences shows.
Thanks a lot!
Hi @chenya-zhang, there is a plan to integrate RayDP with Gluten, which offloads SQL operations to a native engine such as Velox. On TPC-H- and TPC-DS-like benchmarks, we observed more than a 2x speedup. You can find more details in the Gluten project: https://github.com/oap-project/gluten.
We are also running RayDP + XGBoost on Ray workflows and have observed a performance advantage over running XGBoost on Spark. We will share more once the data is ready to publish.