Set up Hadoop 2.7.7 in a multi-node cluster
First, edit /etc/hosts
on every node and define your nodes like this:
192.168.1.1 master
192.168.1.2 slave-1
192.168.1.3 slave-2
Then make SSH between the servers passwordless, as described in this documentation.
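A typical way to do this is sketched below; it assumes OpenSSH and that you log in as root, matching the user settings used later in this guide.

```shell
# On the master: generate a key pair once (no passphrase)
ssh-keygen -t rsa -b 4096 -N "" -f ~/.ssh/id_rsa

# Copy the public key to every node defined in /etc/hosts
ssh-copy-id root@slave-1
ssh-copy-id root@slave-2

# Verify: this should log in without asking for a password
ssh root@slave-1 hostname
```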
Install JDK 1.8 on all nodes and add JAVA_HOME
to the environment variables.
Then download Hadoop from this link,
untar the downloaded file, and copy it to /var/local/hadoop.
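For example (a sketch; it assumes the Apache archive still hosts the 2.7.7 tarball at this path, and must be run on every node):

```shell
# Download, unpack, and move Hadoop into place
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.7/hadoop-2.7.7.tar.gz
tar -xzf hadoop-2.7.7.tar.gz
mv hadoop-2.7.7 /var/local/hadoop
```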
Edit the ~/.bashrc
file and add these lines:
export HADOOP_HOME=/var/local/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HDFS_NAMENODE_USER="root"
export HDFS_DATANODE_USER="root"
export HDFS_SECONDARYNAMENODE_USER="root"
export YARN_RESOURCEMANAGER_USER="root"
export YARN_NODEMANAGER_USER="root"
Reload it with the command source ~/.bashrc
Go to the etc/hadoop
directory inside HADOOP_HOME
and edit the following files, adding the lines below inside the <configuration>
block of each.
Edit core-site.xml
and add these lines to its configuration:
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
Edit hdfs-site.xml
and add these lines:
<property>
<name>dfs.namenode.name.dir</name>
<value>/var/local/hadoop/data/nameNode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/var/local/hadoop/data/dataNode</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value> <!-- keep two copies of each block in total -->
</property>
Rename mapred-site.xml.template
to mapred-site.xml
and add these lines to it:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
Edit yarn-site.xml
and add these lines:
<property>
<name>yarn.acl.enable</name>
<value>0</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
Edit the slaves
file as below:
slave-1
slave-2
Edit hadoop-env.sh
and set the Java path explicitly:
export JAVA_HOME=/var/local/jdk
If your servers use a custom SSH port, also add:
export HADOOP_SSH_OPTS="-p <your custom port>"
Propagate all configuration files to every server with the scp
command.
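For example (a sketch; it assumes the same install path on all nodes and root SSH access as set up earlier):

```shell
# Push the whole Hadoop directory from the master to each worker
scp -r /var/local/hadoop root@slave-1:/var/local/
scp -r /var/local/hadoop root@slave-2:/var/local/

# Also copy the shell profile so HADOOP_HOME is set everywhere
scp ~/.bashrc root@slave-1:~/
scp ~/.bashrc root@slave-2:~/
```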
The first time only, format HDFS with the command below:
hdfs namenode -format
After that, start Hadoop on the master node with the command start-all.sh
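To check that everything came up, the following quick sanity check can be run (jps ships with the JDK):

```shell
# On the master you should see NameNode, SecondaryNameNode and ResourceManager
jps

# On each slave you should see DataNode and NodeManager
ssh root@slave-1 jps

# HDFS cluster report: both datanodes should be listed as live
hdfs dfsadmin -report
```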