
Troubleshooting


On this page you can check your setup.

(Figure: cluster example visualization)

The picture above shows an example of a computer cluster consisting of several nodes: one master node and several slaves (slave00, slave01, ...). The master node is connected to the slave nodes over SSH. The instructions use the node names shown in the picture; your nodes may have different names. You can look up the names in `/etc/hosts`.
  1. Correct permissions
    All of these folders and their contents should be owned by the user starql in the group cluster on every node (if the ownership is wrong, see the fix sketched after this list):

    user@master:/opt$ ls -lha
    drwxr-xr-x 10 starql cluster 4,0K Jan 19 21:00 hadoop
    drwxr-xr-x  3 starql cluster 4,0K Jan 19 20:52 hadoop_tmp
    drwxr-xr-x 16 starql cluster 4,0K Mär  1 09:20 spark
  2. SSH
    From the master node you should be able to connect to the master node itself:

    user@master:~$ sudo su starql
    starql@master:~$ ssh master

    Now you should be connected to the master node.

    From the master node you should also be able to connect to every slave node (if any of these logins asks for a password or fails, see the SSH key setup sketched after this list):

    user@master:~$ sudo su starql
    starql@master:~$ ssh slave00
    starql@master:~$ ssh slave01
    ...
  3. PostgreSQL
    From every node you should be able to connect to PostgreSQL, which is running on the master node (if connections from the slaves are refused, see the configuration sketch after this list). Try this on every node:

    sudo -u postgres psql -h master -p 5432 -U postgres
  4. Apache Spark
    You can set the logging level back to INFO in log4j.properties (a sketch for applying this change is given after this list). To do this, change the line

    log4j.rootCategory=ERROR, console

    to

    log4j.rootCategory=INFO, console

    This can help a lot when something is not running as it should.

    You can visit the Spark master web UI in your browser at http://master:8080 (port 7077 is the master's RPC port used in the spark://master:7077 URL). It also shows helpful information. You should see at least one worker in the list there.

    If you get an out-of-memory exception while running Spark, Spark is probably not set up correctly. You can try to increase the default parallelism and the number of partitions used for shuffles in the spark-defaults.conf file:

    spark.default.parallelism      [number]
    spark.sql.shuffle.partitions   [number]

    I recommend using the values described under Spark Cluster Setup. If you still get an out-of-memory exception, try increasing the values.
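
If step 1 shows the wrong ownership, a possible fix is to change the owner recursively. This is only a sketch; it assumes the three directories live under /opt as listed above, and it has to be run on every node:

    # change owner and group recursively for the three directories from step 1
    sudo chown -R starql:cluster /opt/hadoop /opt/hadoop_tmp /opt/spark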
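
If the SSH logins in step 2 ask for a password or do not work at all, key-based authentication for the starql user is probably missing. The following is only a sketch, assuming the default key file ~/.ssh/id_rsa and the node names from the picture; run it as starql on the master node:

    # create a key pair without a passphrase (skip this if ~/.ssh/id_rsa already exists)
    ssh-keygen -t rsa -b 4096 -N "" -f ~/.ssh/id_rsa
    # install the public key on the master itself and on every slave
    ssh-copy-id starql@master
    ssh-copy-id starql@slave00
    ssh-copy-id starql@slave01
    # ... repeat for every further slave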
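
If the connection in step 3 works on the master but is refused from the slaves, PostgreSQL is probably not accepting remote connections. The following is only a sketch of the two settings to check on the master; the file locations depend on your PostgreSQL version and distribution, and the subnet is just an example:

    # postgresql.conf: make the server listen on all network interfaces
    listen_addresses = '*'

    # pg_hba.conf: allow password-authenticated connections from the cluster subnet
    host    all    all    192.168.0.0/24    md5

    # reload or restart PostgreSQL afterwards, for example
    sudo systemctl restart postgresql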
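
If there is no log4j.properties yet in step 4, Spark ships a template next to it. The following sketch assumes Spark is installed under /opt/spark as in step 1:

    # if log4j.properties does not exist yet, create it from the template that ships with Spark
    cp -n /opt/spark/conf/log4j.properties.template /opt/spark/conf/log4j.properties
    # switch the root logging level back from ERROR to INFO
    sed -i 's/^log4j.rootCategory=ERROR, console/log4j.rootCategory=INFO, console/' /opt/spark/conf/log4j.properties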
