Run CubingByLayer on Cluster #20

TatianaJin opened this issue on Nov 29, 2018

Run the application in the same way as a Husky application:

  • Run the master: ./HuskyMaster -C <your conf file path>
  • Run the application: ./CubingByLayer -C <your conf file path>
    Alternatively, you may run the application in a distributed manner with ./exec.sh ./CubingByLayer -C <your conf file path> (see the launch sketch after the config example below).
    The exec.sh file is the one in Husky, and it looks like this:
# This points to a file, which should contain hostnames (one per line).
# E.g.,
#
# worker1
# worker2
# worker3
#
MACHINE_CFG=/data/opt/tmp/tati/husky/build/slaves

# This points to the directory where Husky binaries live.
# If Husky is running in a cluster, this directory should be available
# to all machines.
BIN_DIR=/data/opt/tmp/tati/husky/build
time pssh -t 0 -P -h ${MACHINE_CFG} \
    -x "-t -t" "cd $BIN_DIR && ls $BIN_DIR > /dev/null && ./$@"
  • The config file looks like this:
 master_host=w10
 master_port=56789
 comm_port=45678
 
 hdfs_namenode=proj99
 hdfs_namenode_port=9000
 
 serve=0
 
 meta_url=kylin_metadata@hdfs,path=hdfs://localhost:9000/kylin/kylin_metadata/metadata/69a4e318-c3ff-45d4-bfc3-2dcaeaa164d7
 hive_table=hdfs:///kylin/kylin_metadata/kylin-86dffb72-3bf9-4150-b9bd-52332d9a7af5/kylin_intermediate_simple_sales_model_69a4e318_c3ff_45d4_bfc3_2dcaeaa164d7
 table_format=ORC
 output_path=hdfs://proj99:9000/kylin/kylin_metadata/kylin-86dffb72-3bf9-4150-b9bd-52332d9a7af5/simple_sales_model/cuboid/
 
 [worker]
 info=w10:4
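
With the pieces above in place, an end-to-end launch from the master machine could look like the following. This is a minimal sketch, assuming the binaries, exec.sh, the slaves file, and the conf file all sit in the shared build directory, and that cube.conf is a placeholder name for your own conf file (the individual config parameters are explained below):

 # On the master machine (w10 in the sample config above)
 cd /data/opt/tmp/tati/husky/build

 # Start the Husky master in the background
 ./HuskyMaster -C ./cube.conf &

 # Launch one CubingByLayer process on every machine listed in the slaves file
 ./exec.sh ./CubingByLayer -C ./cube.conf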

The meta_url parameter is the same as the Kylin input to Spark/MR; hive_table is the HDFS path to the flat join table; table_format is the format of the flat join table; and output_path is the HDFS location where the cuboids will be written. The sample parameters above build the example cube shipped with Kylin, which is very small. If you want to try large-scale data, you may deploy your own Kylin instance on the cluster, import the TPC-H benchmark to get cube descriptions, and run the Kylin pipeline to create the flat join table.
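
Before launching, it may help to sanity-check the HDFS paths from any machine with a configured HDFS client. Below is a small sketch using the sample paths from the config above; hive --orcfiledump is standard Hive tooling for inspecting ORC files, and the part-file name 000000_0 is only an example of a typical Hive output file name:

 # Confirm the flat join table exists
 hdfs dfs -ls hdfs:///kylin/kylin_metadata/kylin-86dffb72-3bf9-4150-b9bd-52332d9a7af5/kylin_intermediate_simple_sales_model_69a4e318_c3ff_45d4_bfc3_2dcaeaa164d7

 # Inspect the schema of one ORC part file
 hive --orcfiledump hdfs:///kylin/kylin_metadata/kylin-86dffb72-3bf9-4150-b9bd-52332d9a7af5/kylin_intermediate_simple_sales_model_69a4e318_c3ff_45d4_bfc3_2dcaeaa164d7/000000_0

 # Check the output location (the job will write cuboids here)
 hdfs dfs -ls hdfs://proj99:9000/kylin/kylin_metadata/kylin-86dffb72-3bf9-4150-b9bd-52332d9a7af5/simple_sales_model/cuboid/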

TatianaJin added the FYI label on Nov 29, 2018