Hadoop 2.7.2 Installation Guide

Hadoop 2.7.2 Installation Guide, latest version, by Somappa Srinivasan

Step 1: Install VMware Workstation or Oracle VirtualBox on your machine (computer)

Download link for VMware Workstation


Download link for Oracle VirtualBox


Step 2: Install Ubuntu OS

Download link for Ubuntu OS


Step 3: Update Ubuntu packages

Command : sudo apt-get update

Step 4: Install Java 1.7 or 1.8

Command : sudo apt-get install openjdk-7-jdk

Step 5: Check whether Java is installed

Command : java -version

Step 6: Check the path where Java is installed on Ubuntu

Command : cd /usr/lib/jvm/java-1.7.0-openjdk-amd64

Step 7: Open the .bashrc file

Command : sudo gedit ~/.bashrc

Step 8: Set the Java path in the .bashrc file

Command :

export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-amd64

export PATH=$PATH:$JAVA_HOME/bin

Step 9: Reload the updated .bashrc file

Command : source ~/.bashrc

Step 10: Install SSH
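The original post gives no command here; on Ubuntu, a typical sequence (assuming the openssh-server package and passwordless SSH to localhost, which Hadoop's start scripts rely on) is:

Command :

sudo apt-get install openssh-server
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh localhost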

Step 11: Download the Hadoop 2.7.2 tar file from the apache.org website

url : https://archive.apache.org/dist/hadoop/core/hadoop-2.7.2/
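The archive can be fetched directly with wget; the file name hadoop-2.7.2.tar.gz matches the one extracted in a later step:

Command : wget https://archive.apache.org/dist/hadoop/core/hadoop-2.7.2/hadoop-2.7.2.tar.gz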

Step 12: List the downloaded tar file
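No command is shown in the original post; assuming the file was downloaded to the current directory, a simple listing is:

Command : ls -l hadoop-2.7.2.tar.gz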

Step 13: Give permissions to the Hadoop tar file
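The original post does not show a command; one reasonable choice (an assumption, not from the original) is to make the archive owned and readable by the current user:

Command :

sudo chown $USER:$USER hadoop-2.7.2.tar.gz
chmod 644 hadoop-2.7.2.tar.gz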

Step 14: Extract the tar file in the Ubuntu terminal

Command : sudo tar -xvf hadoop-2.7.2.tar.gz
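Extraction creates a directory named hadoop-2.7.2, while the later steps use the shorter path hadoop. It is assumed here (the original post does not show this step) that the directory is renamed:

Command : sudo mv hadoop-2.7.2 hadoop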

Step 15: List the Hadoop configuration files

Command : cd hadoop/etc/hadoop
ls

Step 16: List the files in the Hadoop sbin directory

Command : cd hadoop/sbin
ls

Step 17: Edit the Hadoop configuration files (hadoop-env.sh, mapred-site.xml, yarn-site.xml, and hdfs-site.xml), as covered in the next steps

Step 18: Create a log directory in Hadoop
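No command is given in the original post; assuming the log directory lives inside the Hadoop installation directory:

Command : mkdir hadoop/logs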

Step 19: Edit the hadoop-env.sh file

Code :

# The java implementation to use.

export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-amd64

Step 20: Create the mapred-site.xml file and edit it
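mapred-site.xml does not exist in a fresh Hadoop 2.7.2 install; it is normally created by copying the bundled template from inside hadoop/etc/hadoop (the original post does not show this command):

Command : cp mapred-site.xml.template mapred-site.xml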

Code :

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <!-- Value reconstructed; the original post's XML did not render.
         localhost:54311 is a common choice for a single-node setup. -->
    <value>localhost:54311</value>
    <description>The host and port that the MapReduce job tracker runs
      at. If "local", then jobs are run in-process as a single map
      and reduce task.
    </description>
  </property>
</configuration>

Step 21: Edit yarn-site.xml

Code :

<configuration>
  <!-- Reconstructed; the original post's XML did not render. This is the
       standard single-node YARN property used with Hadoop 2.x. -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

Step 22: Create NameNode and DataNode directories

Before editing hdfs-site.xml, create two empty directories, one for the NameNode and one for the DataNode.
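The original post does not give the paths; assuming two directories under the home folder (hypothetical paths, they only have to match the values used in hdfs-site.xml in the next step):

Command :

mkdir -p ~/hadoop_store/hdfs/namenode
mkdir -p ~/hadoop_store/hdfs/datanode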

Step 23: Edit hdfs-site.xml

Code :

<configuration>
  <property>
    <name>dfs.replication</name>
    <!-- Value reconstructed; 1 is the usual choice for a single-node setup. -->
    <value>1</value>
    <description>Default block replication.
      The actual number of replications can be specified when the file is created.
      The default is used if replication is not specified in create time.
    </description>
  </property>
  <!-- The two directory properties below are reconstructed; the original post's
       XML did not render. Point them at the directories created in Step 22
       (replace USERNAME with your Ubuntu user name). -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/USERNAME/hadoop_store/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/USERNAME/hadoop_store/hdfs/datanode</value>
  </property>
</configuration>

Step 24: Format the Hadoop NameNode
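The original post does not list the command; run from inside the hadoop directory (path assumed from the earlier steps), the usual command is:

Command : ./bin/hdfs namenode -format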

Step 25: Run start-all.sh (from the hadoop/sbin directory) to start all the daemons

Command : ./start-all.sh
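To check that the daemons (NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager) actually started, the JDK's jps tool is commonly used (not shown in the original post):

Command : jps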

Step 26: Run stop-all.sh to stop all the daemons

Command : ./stop-all.sh

Step 27: Browser UI

NameNode web UI : http://localhost:50070

