Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[ML-282] Refactor CPU & GPU examples (#306)
* First move
* Move device discover for scala
* Delete old gpu discover
* Add run-all-gpu
* Add clean up
* Add tmp utils file
* Add exe
* Rename run script
* Scala gpu done
* Scala cpu done
* For ci
* pyspark ci
* Rename scala
* Rename scala file in scripts
* Pyspark unit done
* Update pyspark utils
* Update ci
* Remove tmp utils
* Rename utils
* Change absolute path, rm als gpu.sh
* Scala absolute path
* Change sanity check
* Rename ci
* Split random_forest
* Fix name change in ci
* Fix path typo
* Fix typo
Showing 64 changed files with 212 additions and 90 deletions.
File renamed without changes.
File renamed without changes.
@@ -0,0 +1,20 @@
#!/usr/bin/env bash

exampleDirs=(kmeans-scala pca-scala als-scala naive-bayes-scala \
             linear-regression-scala correlation-scala summarizer-scala)

cd scala

for dir in ${exampleDirs[*]}
do
    cd $dir
    echo
    echo ==========================
    echo Cleaning $dir ...
    echo ==========================
    echo
    rm -rf ./target/
    cd ..
done

cd ..
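Review note: the clean-up loop above changes into each directory before running `rm -rf ./target/`, so if a `cd` fails the removal runs in whatever directory the script is in at that point. A minimal sketch of an alternative that never changes directory is shown here. The function name `clean_examples` and its argument order are hypothetical and not part of this commit.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): remove the target/ directory
# of each named example under a root directory, without using cd.
clean_examples() {
    local root=$1 dir
    shift
    for dir in "$@"; do
        if [ -d "$root/$dir" ]; then
            echo "Cleaning $dir ..."
            # The path is built explicitly, so nothing outside $root/$dir is touched.
            rm -rf -- "$root/$dir/target"
        else
            echo "Skipping $dir (not found under $root)" >&2
        fi
    done
}

# Example call, using two of the directory names from the script above:
# clean_examples scala kmeans-scala pca-scala
```

Because the working directory never changes, a missing example directory is reported and skipped rather than shifting where later commands run.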
This file was deleted.
This file was deleted.
This file was deleted.
This file was deleted.
This file was deleted.
File renamed without changes.
3 changes: 2 additions & 1 deletion
examples/als-pyspark/run.sh → examples/python/als-pyspark/run-cpu.sh
File renamed without changes.
3 changes: 2 additions & 1 deletion
examples/kmeans-pyspark/run.sh → examples/python/kmeans-pyspark/run-cpu.sh
@@ -0,0 +1,40 @@
#!/usr/bin/env bash

CONF_PATH=$PWD/../../../conf
source $CONF_PATH/env.sh

# Data file is from Spark Examples (data/mllib/sample_kmeans_data.txt) and put in examples/data
# The data file should be copied to $HDFS_ROOT before running examples
DATA_FILE=$HDFS_ROOT/data/sample_kmeans_data.txt

DEVICE=GPU
RESOURCE_FILE=$CONF_PATH/IntelGpuResourceFile.json
WORKER_GPU_AMOUNT=4
EXECUTOR_GPU_AMOUNT=1
TASK_GPU_AMOUNT=1
APP_PY=kmeans-pyspark.py


# Should run in standalone mode
time $SPARK_HOME/bin/spark-submit --master $SPARK_MASTER \
    --num-executors $SPARK_NUM_EXECUTORS \
    --executor-cores $SPARK_EXECUTOR_CORES \
    --total-executor-cores $SPARK_TOTAL_CORES \
    --driver-memory $SPARK_DRIVER_MEMORY \
    --executor-memory $SPARK_EXECUTOR_MEMORY \
    --conf "spark.serializer=org.apache.spark.serializer.KryoSerializer" \
    --conf "spark.default.parallelism=$SPARK_DEFAULT_PARALLELISM" \
    --conf "spark.sql.shuffle.partitions=$SPARK_DEFAULT_PARALLELISM" \
    --conf "spark.driver.extraClassPath=$SPARK_DRIVER_CLASSPATH" \
    --conf "spark.executor.extraClassPath=$SPARK_EXECUTOR_CLASSPATH" \
    --conf "spark.oap.mllib.device=$DEVICE" \
    --conf "spark.worker.resourcesFile=$RESOURCE_FILE" \
    --conf "spark.worker.resource.gpu.amount=$WORKER_GPU_AMOUNT" \
    --conf "spark.executor.resource.gpu.amount=$EXECUTOR_GPU_AMOUNT" \
    --conf "spark.task.resource.gpu.amount=$TASK_GPU_AMOUNT" \
    --conf "spark.shuffle.reduceLocality.enabled=false" \
    --conf "spark.network.timeout=1200s" \
    --conf "spark.task.maxFailures=1" \
    --jars $OAP_MLLIB_JAR \
    $APP_PY $DATA_FILE \
    2>&1 | tee KMeans-$(date +%m%d_%H_%M_%S).log
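Review note: the three GPU amounts in the script above are related. An executor cannot be given more GPUs than its worker advertises, and a task cannot ask for more GPUs than its executor holds. A small pre-flight check along those lines might look like the sketch below. `check_gpu_amounts` is a hypothetical helper, not part of this commit, and it assumes whole-number amounts as used in the script.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of the commit).
# Assumes whole-number GPU amounts, as in the script above.
check_gpu_amounts() {
    local worker=$1 executor=$2 task=$3
    if [ "$executor" -gt "$worker" ]; then
        echo "executor GPU amount ($executor) exceeds worker GPU amount ($worker)" >&2
        return 1
    fi
    if [ "$task" -gt "$executor" ]; then
        echo "task GPU amount ($task) exceeds executor GPU amount ($executor)" >&2
        return 1
    fi
    return 0
}

# Example call with the variable names used in the script above:
# check_gpu_amounts "$WORKER_GPU_AMOUNT" "$EXECUTOR_GPU_AMOUNT" "$TASK_GPU_AMOUNT" || exit 1
```

With the values in this diff (4, 1, 1) the check passes. Running it before `spark-submit` would surface an inconsistent edit to the amounts before the job is launched.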
File renamed without changes.
3 changes: 2 additions & 1 deletion
examples/pca-pyspark/run.sh → examples/python/pca-pyspark/run-cpu.sh
@@ -0,0 +1,40 @@
#!/usr/bin/env bash

CONF_PATH=$PWD/../../../conf
source $CONF_PATH/env.sh

# CSV data is the same as in Spark example "ml/pca_example.py"
# The data file should be copied to $HDFS_ROOT before running examples
DATA_FILE=$HDFS_ROOT/data/pca_data.csv

DEVICE=GPU
RESOURCE_FILE=$CONF_PATH/IntelGpuResourceFile.json
WORKER_GPU_AMOUNT=4
EXECUTOR_GPU_AMOUNT=1
TASK_GPU_AMOUNT=1
APP_PY=pca-pyspark.py


# Should run in standalone mode
time $SPARK_HOME/bin/spark-submit --master $SPARK_MASTER \
    --num-executors $SPARK_NUM_EXECUTORS \
    --executor-cores $SPARK_EXECUTOR_CORES \
    --total-executor-cores $SPARK_TOTAL_CORES \
    --driver-memory $SPARK_DRIVER_MEMORY \
    --executor-memory $SPARK_EXECUTOR_MEMORY \
    --conf "spark.serializer=org.apache.spark.serializer.KryoSerializer" \
    --conf "spark.default.parallelism=$SPARK_DEFAULT_PARALLELISM" \
    --conf "spark.sql.shuffle.partitions=$SPARK_DEFAULT_PARALLELISM" \
    --conf "spark.driver.extraClassPath=$SPARK_DRIVER_CLASSPATH" \
    --conf "spark.executor.extraClassPath=$SPARK_EXECUTOR_CLASSPATH" \
    --conf "spark.oap.mllib.device=$DEVICE" \
    --conf "spark.worker.resourcesFile=$RESOURCE_FILE" \
    --conf "spark.worker.resource.gpu.amount=$WORKER_GPU_AMOUNT" \
    --conf "spark.executor.resource.gpu.amount=$EXECUTOR_GPU_AMOUNT" \
    --conf "spark.task.resource.gpu.amount=$TASK_GPU_AMOUNT" \
    --conf "spark.shuffle.reduceLocality.enabled=false" \
    --conf "spark.network.timeout=1200s" \
    --conf "spark.task.maxFailures=1" \
    --jars $OAP_MLLIB_JAR \
    $APP_PY $DATA_FILE \
    2>&1 | tee PCA-$(date +%m%d_%H_%M_%S).log
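Review note: `spark.worker.resourcesFile` above points at `IntelGpuResourceFile.json`, whose contents are not shown in this diff. As a rough sketch only, assuming the file follows the JSON layout that Spark's standalone documentation gives for worker resource files and that GPU addresses are simply numbered from 0 (both are assumptions), a file for N GPUs could be generated like this. `gpu_resource_json` is a hypothetical helper, not part of this commit.

```shell
#!/usr/bin/env bash
# Hypothetical generator (not part of the commit): print a worker resources
# file that advertises GPU addresses 0..N-1 under the resource name "gpu".
# The zero-based addresses are an assumption about the target machine.
gpu_resource_json() {
    local count=$1 i=0 addresses=""
    while [ "$i" -lt "$count" ]; do
        # Append a quoted address, adding a comma only between entries.
        addresses="$addresses${addresses:+,}\"$i\""
        i=$((i + 1))
    done
    printf '[{"id":{"componentName":"spark.worker","resourceName":"gpu"},"addresses":[%s]}]\n' "$addresses"
}

# Example call matching WORKER_GPU_AMOUNT=4 from the script above:
# gpu_resource_json 4 > "$CONF_PATH/IntelGpuResourceFile.json"
```

The number of addresses in the file would presumably need to cover the `spark.worker.resource.gpu.amount` value passed on the command line.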
File renamed without changes.
File renamed without changes.
This file was deleted.
@@ -0,0 +1,20 @@
#!/usr/bin/env bash

exampleDirs=(kmeans-pyspark pca-pyspark als-pyspark \
             random-forest-regressor-pyspark random-forest-classifier-pyspark)

cd python

for dir in ${exampleDirs[*]}
do
    cd $dir
    echo
    echo ==========================
    echo Running $dir ...
    echo ==========================
    echo
    ./run-gpu.sh
    cd ..
done

cd ..
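Review note: the loop above runs every `run-gpu.sh` in turn and does not look at exit statuses, so a failing example does not change the overall result. One possible variation, sketched here under the assumption that each run script returns a non-zero status on failure, runs each example in a subshell and counts failures. `run_examples` is a hypothetical name, not part of this commit.

```shell
#!/usr/bin/env bash
# Hypothetical runner (not part of the commit): execute the named script in
# each example directory and report how many runs failed.
run_examples() {
    local root=$1 script=$2 failed=0 dir
    shift 2
    for dir in "$@"; do
        echo "Running $dir ..."
        # The subshell keeps the cd local to this one run.
        if ! (cd "$root/$dir" && "./$script"); then
            echo "FAILED: $dir" >&2
            failed=$((failed + 1))
        fi
    done
    # Succeed only if every example succeeded.
    [ "$failed" -eq 0 ]
}

# Example call, using two of the directory names from the script above:
# run_examples python run-gpu.sh kmeans-pyspark pca-pyspark
```

Note that the run scripts in this diff pipe `spark-submit` through `tee`, so their exit status is that of `tee` unless `pipefail` is set. The sketch only helps once the individual scripts propagate failures.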
@@ -0,0 +1,20 @@
#!/usr/bin/env bash

exampleDirs=(kmeans-scala pca-scala als-scala naive-bayes-scala \
             linear-regression-scala correlation-scala summarizer-scala)

cd scala

for dir in ${exampleDirs[*]}
do
    cd $dir
    echo
    echo ==========================
    echo Running $dir ...
    echo ==========================
    echo
    ./run-cpu.sh
    cd ..
done

cd ..
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.