Single-Node Cluster Installation

Link:

[[1]]

Environment

  • OS: Debian 7.4 (64-bit)
  • Hadoop: 2.4.1 *[[2]]
  • Java: java-7-openjdk-amd64


Master

Prerequisites
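# As root: create a dedicated group and user for Hadoop
addgroup hadoop
adduser --ingroup hadoop hduser
# As hduser: create a passphrase-less RSA key and authorize it for local logins
ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
# The first connection also records the host key in known_hosts
ssh localhost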
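
If `ssh localhost` still asks for a password, overly permissive modes on the key files are a common cause, because OpenSSH's default `StrictModes` setting rejects them. A minimal sketch that tightens the modes; the function name `fix_ssh_perms` is illustrative and not part of the original procedure:

```shell
# Hypothetical helper: restrict an SSH directory and its authorized_keys
# file to the owning user. Defaults to the current user's ~/.ssh.
fix_ssh_perms() {
    ssh_dir=${1:-$HOME/.ssh}
    chmod 700 "$ssh_dir" || return 1
    if [ -f "$ssh_dir/authorized_keys" ]; then
        chmod 600 "$ssh_dir/authorized_keys" || return 1
    fi
}
```

Run `fix_ssh_perms` as hduser, then `ssh -o BatchMode=yes localhost true` to confirm the login works without any prompt.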

vim $HOME/.bashrc

# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop

# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

# Some convenient aliases and functions for running Hadoop-related commands
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"

# If you have LZO compression enabled in your Hadoop cluster and
# compress job outputs with LZOP (not covered in this tutorial):
# Conveniently inspect an LZOP compressed file from the command
# line; run via:
#
# $ lzohead /hdfs/path/to/lzop/compressed/file.lzo
#
# Requires installed 'lzop' command.
#
lzohead () {
    hadoop fs -cat "$1" | lzop -dc | head -1000 | less
}

# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
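
After reloading the shell (`source $HOME/.bashrc`), it can help to confirm that the exported variables point at real directories before going further. A small sketch, assuming a POSIX shell; `check_dir_var` is a hypothetical helper name:

```shell
# Hypothetical helper: succeed only if the named variable is set and
# points to an existing directory. Takes the variable NAME, not its value.
check_dir_var() {
    eval "dir_value=\${$1:-}"
    if [ -z "$dir_value" ]; then
        echo "$1 is not set" >&2
        return 1
    fi
    if [ ! -d "$dir_value" ]; then
        echo "$1=$dir_value is not a directory" >&2
        return 1
    fi
}
```

Typical use: `check_dir_var JAVA_HOME && check_dir_var HADOOP_HOME`.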


Hadoop does not support IPv6, so we disable it:

echo '# disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1' >> /etc/sysctl.conf
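
Because the command above appends with `>>`, running it a second time leaves duplicate entries in /etc/sysctl.conf. A sketch of an idempotent alternative; `append_once` is a hypothetical helper, not something the original page uses:

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present (-x: whole-line match, -F: literal string).
append_once() {
    line=$1
    file=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

For example, as root: `append_once 'net.ipv6.conf.all.disable_ipv6 = 1' /etc/sysctl.conf`, repeated for the `default` and `lo` keys.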

root@NodeMaster1:# sysctl -p
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
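
The kernel exposes the same settings under /proc, so the result can also be checked from a script. A minimal sketch; the function name and the optional base-directory argument are illustrative additions:

```shell
# Hypothetical helper: succeed if the "all", "default" and "lo" scopes
# all report disable_ipv6 = 1. The base directory is a parameter so the
# function can be pointed at a test tree instead of /proc.
ipv6_disabled() {
    base=${1:-/proc/sys/net/ipv6/conf}
    for scope in all default lo; do
        [ "$(cat "$base/$scope/disable_ipv6" 2>/dev/null)" = "1" ] || return 1
    done
}
```

Usage: `ipv6_disabled && echo "IPv6 is disabled"`.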

Hadoop installation:

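# --strip-components=1 drops the archive's top-level hadoop-2.4.1/ directory,
# so that etc/, bin/ and sbin/ land directly under /usr/local/hadoop
mkdir -p /usr/local/hadoop && cd /usr/local/hadoop && tar xvzf /tmp/hadoop-2.4.1.tar.gz --strip-components=1
mkdir -p /app/hadoop/tmp
chown -R hduser:hadoop /usr/local/hadoop && chown hduser:hadoop /app/hadoop/tmp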
chmod 750 /app/hadoop/tmp/
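
The configuration and start-up steps rely on files sitting directly under /usr/local/hadoop. A quick sketch to confirm the archive was unpacked where those steps expect it; `check_hadoop_layout` is a hypothetical helper, and the three paths are the ones this page itself uses:

```shell
# Hypothetical helper: verify that the paths used later on this page
# exist under the given Hadoop home (default: /usr/local/hadoop).
check_hadoop_layout() {
    home=${1:-/usr/local/hadoop}
    for path in bin/hadoop sbin/start-all.sh etc/hadoop; do
        if [ ! -e "$home/$path" ]; then
            echo "missing: $home/$path" >&2
            return 1
        fi
    done
}
```

If the check reports a missing path, the files are most likely one level deeper, in a versioned subdirectory created by the archive.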
Configuration
su - hduser
vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh
...

cd /usr/local/hadoop/etc
mv hadoop/mapred-site.xml.template hadoop/mapred-site.xml

vim hadoop/mapred-site.xml
...
vim hadoop/hdfs-site.xml
...

Format HDFS:

/usr/local/hadoop/bin/hdfs namenode -format

Start the cluster:

/usr/local/hadoop/sbin/start-all.sh
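
In Hadoop 2.x, `start-all.sh` is marked as deprecated and prints a notice pointing to `start-dfs.sh` and `start-yarn.sh`. A sketch of the equivalent two-step start-up; the wrapper name `start_cluster` is hypothetical:

```shell
# Hypothetical wrapper: start HDFS first, then YARN, and stop at the
# first failure. The Hadoop home defaults to /usr/local/hadoop.
start_cluster() {
    home=${1:-/usr/local/hadoop}
    "$home/sbin/start-dfs.sh" && "$home/sbin/start-yarn.sh"
}
```

Run it as hduser: `start_cluster`.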

Check:

hduser@NodeMaster1:~$ jps
29216 DataNode
29541 ResourceManager
29099 NameNode
30088 Jps
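
To automate this check, the `jps` output can be compared against the daemons a default Hadoop 2.x single-node start normally brings up. The sketch below assumes that list is NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager; adjust it if your setup differs. `missing_daemons` is a hypothetical helper:

```shell
# Hypothetical helper: read `jps` output on stdin and print every expected
# daemon that does not appear. The class name is matched exactly against
# the second column, so "NameNode" does not match "SecondaryNameNode".
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

Usage: `jps | missing_daemons`. Empty output means all five were found. Fed the listing above, it prints SecondaryNameNode and NodeManager, so either the listing was trimmed or those two daemons were not running; their log files under /usr/local/hadoop/logs are the place to look.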