Troubleshooting
On this page you can check your setup.
The picture above shows an example of a computer cluster consisting of several nodes: one master node and several slaves (slave00, slave01, ...). The master node is connected to the slave nodes over SSH. The instructions use the names from the picture; your nodes may have different names. You can look up the names in `/etc/hosts`.
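As a quick check, you can print the name-to-address mappings your node knows (the cluster names in the comment are the examples from the picture; yours may differ):

```shell
# Show the hostname-to-IP mapping this node uses.
# Each cluster node (master, slave00, slave01, ...) should have an entry here.
cat /etc/hosts
```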
---
Correct permissions
All of the following folders, including their contents, should be owned by the user starql in the group cluster on every node:

```
user@master:/opt$ ls -lha
drwxr-xr-x 10 starql cluster 4,0K Jan 19 21:00 hadoop
drwxr-xr-x  3 starql cluster 4,0K Jan 19 20:52 hadoop_tmp
drwxr-xr-x 16 starql cluster 4,0K Mar  1 09:20 spark
```
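If you are unsure whether the ownership is correct everywhere, `find` can list offending files. A sketch that demonstrates the technique on a throwaway directory; the comments show how you would run it against the real folders on the cluster:

```shell
# On the cluster you would run (as root):
#   find /opt/hadoop /opt/hadoop_tmp /opt/spark \( ! -user starql -o ! -group cluster \) -print
# and fix any hits with:
#   sudo chown -R starql:cluster /opt/hadoop /opt/hadoop_tmp /opt/spark

# Demo on a temporary directory: list files NOT owned by the expected user.
demo=$(mktemp -d)
touch "$demo/example.txt"
find "$demo" ! -user "$(id -un)" -print   # prints nothing: we own everything here
rm -rf "$demo"
```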
---
SSH
On the master node you should be able to connect to the master node itself:

```
user@master:~$ sudo su starql
starql@master:~$ ssh master
```

You should now be connected to the master node.

On the master node you should also be able to connect to every slave node:

```
user@master:~$ sudo su starql
starql@master:~$ ssh slave00
starql@master:~$ ssh slave01
...
```
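The per-node checks can also be scripted. A sketch, assuming the node names from the picture; replace them with the names from your `/etc/hosts`:

```shell
# Try a non-interactive SSH login to every node. BatchMode makes a missing
# or wrong key fail immediately instead of prompting for a password.
for host in master slave00 slave01; do
  if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
    echo "$host: OK"
  else
    echo "$host: FAILED (check ~/.ssh/authorized_keys on $host)"
  fi
done
```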
---
PostgreSQL
On every node you should be able to connect to the PostgreSQL server running on the master node. Try this on every node:

```
sudo -u postgres psql -h master -p 5432 -U postgres
```
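If the connection from a slave node is refused, a common cause is that PostgreSQL only listens on localhost or rejects remote clients. A sketch of the two settings involved; the subnet is an example, and the config file locations depend on your distribution and PostgreSQL version:

```
# postgresql.conf: listen on all interfaces, not only localhost
listen_addresses = '*'

# pg_hba.conf: allow password logins from the cluster's subnet (example CIDR)
host    all    all    192.168.1.0/24    md5
```

PostgreSQL has to be restarted after changing these files.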
---
Apache Spark
You can set the logging level back to INFO in the log4j.properties file. For this, change the line

```
log4j.rootCategory=ERROR, console
```

to

```
log4j.rootCategory=INFO, console
```

This can help a lot when something is not running as it should.
You can open the Spark master web UI in your browser at http://master:8080 (port 7077 is the port workers and applications connect to, not the web UI). The UI shows helpful information as well; there should be at least one worker in the list.
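You can also check from a shell whether the web UI is reachable. A sketch, assuming the host name master and the default standalone web UI port 8080:

```shell
# Print only the HTTP status code of the Spark master web UI.
# 200 means the UI is up; 000 means the host or port could not be reached.
curl -s -o /dev/null -w '%{http_code}\n' http://master:8080/
```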
If you get an out-of-memory exception while running Spark, your Spark setup is probably not correct. You can try to increase the number of partitions used for shuffles and partitioning in the spark-defaults.conf file:

```
spark.default.parallelism        [number]
spark.sql.shuffle.partitions     [number]
```

I recommend using the values described under Spark Cluster Setup. If you still get an out-of-memory exception, try increasing the values.
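To double-check which values a node actually has configured, you can grep its spark-defaults.conf. A sketch, assuming Spark is installed under /opt/spark as in the listing above:

```shell
# Show the partition-related settings on this node. No output means the
# properties are not set and Spark falls back to its built-in defaults.
grep -E '^spark\.(default\.parallelism|sql\.shuffle\.partitions)' \
  /opt/spark/conf/spark-defaults.conf
```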