I was working on setting up Hadoop on Ubuntu as a single-node cluster.
I came across a very nice blog about it here (a must-read for setting up your single-node cluster).
While I was at it, I was installing Hadoop multiple times on different systems, so I thought I would create a script of my own, based on the blog above.
Here is the link to my script on GitHub; anyone interested can check it out and enhance it.
https://github.com/zubayr/hadoopscript/blob/master/initScriptHadoop.sh
README : https://github.com/zubayr/hadoopscript/blob/master/README.txt
Requirements:
1. Hadoop 1.0.3
2. Ubuntu 10.04 or above (Tested on 11.04, 11.10 and 12.04 32bit platform)
Here are the details on how to install Hadoop using the script. From the README:
- Hadoop script to set up a Single Node Cluster - for Hadoop 1.0.3 only.
- Tested on Ubuntu 11.10 and 12.04 - fresh install.
- The script assumes nothing Hadoop-related is installed and installs the required components for Hadoop to run.
- This script was created using the installation guide by Michael Noll:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
Steps for executing the script. Currently the script only takes a single option at a time :(
-----------------------------------------------------------------------
Execute Help
]$ sudo ./initScriptHadoop.sh --help
usage: ./initScriptHadoop.sh <single-parameter>
Optional parameters:
--install-init, -i Initialization script To Install Hadoop as Single Node Cluster.
Use the below Options, Once you are logged-in as Hadoop User 'hduser' created in the -i init script above.
--install-ssh, -s Install ssh-keygen -t rsa -P
--install-bashrc, -b Updated '.bashrc' with JAVA_HOME, HADOOP_HOME.
--ipv6-disable, -v IPv6 Support Disable.[ Might Not be required.
Updating 'conf/hadoop-env.sh' with
'HADOOP_OPTS=-Djava.net.preferIPv4Stack=true' option in -e]
--hostname-update, -u Update Hostname for the system.
--config-update, -c Update Configuration with default values
(Single Node) in core-site.xml, mapred-site.xml, hdfs-site.xml.
--update-hadoop-env, -e Update Hadoop Env Script with JAVA_HOME.
--help, -h Display this Message.
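For reference, the --install-ssh step boils down to generating a passwordless RSA key and authorizing it for localhost logins, so that the Hadoop start scripts can ssh in without prompting. A minimal sketch of that idea (written to a temporary directory here so it does not touch an existing ~/.ssh; the real script operates on hduser's ~/.ssh):

```shell
# Generate a passwordless RSA key pair and authorize it, so that
# start-all.sh can ssh to localhost without asking for a password.
keydir=$(mktemp -d)                       # stand-in for ~/.ssh in this sketch
ssh-keygen -q -t rsa -P "" -f "$keydir/id_rsa"
cat "$keydir/id_rsa.pub" >> "$keydir/authorized_keys"
chmod 600 "$keydir/authorized_keys"       # sshd refuses keys with loose permissions
```

After the real version of this, `ssh localhost` as hduser should log in without a password prompt.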
1. First Install prerequisites using -i Option
ahmed@ahmed-on-Edge:~$ ./initScriptHadoop.sh -i
Welcome to the Preconfiguration for Hadoop single node setup wizard
Would you like to install Java 1.6 ? (y/n) y
Would you like to setup user 'hduser' and 'hadoop' group? (y/n) y
Would you like to download Hadoop 1.0.3 and extract to /usr/local? (y/n) y
Would you like to make 'hduser' owner of the /usr/local/hadoop/ directory? (y/n) y
Would you like to login as 'hduser' once done? (y/n) y
Review your choices:
Install Java 1.6 : y
Setup 'hduser' user : y
Download Hadoop 1.0.3 : y
Setup 'hduser' as Owner : y
Login to 'hduser' : y
Proceed with setup? (y/n) y
During the installation you will be asked to set a password for 'hduser'.
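Under the hood, the user setup part of -i amounts to creating a 'hadoop' group and an 'hduser' account inside it (the Noll guide, and most likely the script, use Ubuntu's addgroup/adduser; the portable equivalents are sketched below and need root):

```shell
# Create the hadoop group and the hduser account in it (run as root).
# On Ubuntu the script would do roughly:
#   addgroup hadoop
#   adduser --ingroup hadoop hduser
groupadd -f hadoop                                # -f: succeed if group exists
id -u hduser >/dev/null 2>&1 || \
  useradd -m -g hadoop -s /bin/bash hduser        # -m: create home directory
```

With adduser you would be prompted for the password interactively, which matches the wizard transcript above.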
2. Login as 'hduser' (created by the -i option above).
3. Execute options -s, -b, -c, -e:
./initScriptHadoop.sh -s
./initScriptHadoop.sh -b
./initScriptHadoop.sh -c
./initScriptHadoop.sh -e
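The --config-update (-c) option fills in the three configuration files with the usual single-node defaults from the Noll guide. Assuming the script uses the same values (ports 54310/54311 and a replication factor of 1; check the script itself for its exact choices), the result looks roughly like this:

```xml
<!-- conf/core-site.xml : where HDFS lives -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
</property>

<!-- conf/mapred-site.xml : where the JobTracker lives -->
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
</property>

<!-- conf/hdfs-site.xml : single node, so keep only one replica -->
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
```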
Once you are done installing, log in as 'hduser' and format the NameNode first:
hduser@ubuntu:~$ cd /usr/local/hadoop
hduser@ubuntu:/usr/local/hadoop$ bin/hadoop namenode -format
Formatting the NameNode will generate output similar to this (the sample below, which shows version 0.20.2, is taken from the guide linked above):
hduser@ubuntu:/usr/local/hadoop$ bin/hadoop namenode -format
10/05/08 16:59:56 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = ubuntu/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/..
************************************************************/
10/05/08 16:59:56 INFO namenode.FSNamesystem: fsOwner=hduser,hadoop
10/05/08 16:59:56 INFO namenode.FSNamesystem: supergroup=supergroup
10/05/08 16:59:56 INFO namenode.FSNamesystem: isPermissionEnabled=true
10/05/08 16:59:56 INFO common.Storage: Image file of size 96 saved in 0 seconds.
10/05/08 16:59:57 INFO common.Storage: Storage . . . as been successfully formatted.
10/05/08 16:59:57 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/
hduser@ubuntu:/usr/local/hadoop$
Next, let's start the single-node cluster and then check that all is well with the "jps" command.
hduser@ubuntu:/usr/local/hadoop$ bin/start-all.sh
starting namenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-namenode-ubuntu.out
localhost: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-datanode-ubuntu.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-secondarynamenode-ubuntu.out
starting jobtracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-jobtracker-ubuntu.out
localhost: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-tasktracker-ubuntu.out
hduser@ubuntu:/usr/local/hadoop$
hduser@ubuntu:/usr/local/hadoop$ jps
2287 TaskTracker
2149 JobTracker
1938 DataNode
2085 SecondaryNameNode
2349 Jps
1788 NameNode
This shows that all the processes are running fine. Now we are ready to run some good old WordCount MapReduce code.
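A quick way to script that sanity check is to look for each of the five daemon names in the jps output. This helper is hypothetical (not part of the script), but shows the idea:

```shell
# Verify that all five Hadoop 1.x daemons appear in `jps` output.
# Pass the captured output in as an argument, e.g. check_daemons "$(jps)".
check_daemons() {
  jps_output=$1
  missing=""
  for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # grep -w keeps 'NameNode' from matching inside 'SecondaryNameNode'
    printf '%s\n' "$jps_output" | grep -qw "$d" || missing="$missing $d"
  done
  if [ -z "$missing" ]; then
    echo "all daemons running"
  else
    echo "missing:$missing"
  fi
}
```

On a live cluster you would run `check_daemons "$(jps)"` and expect "all daemons running" before submitting any jobs.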