Skip to main content

Installing Hadoop 1.0.3 on Ubuntu Single Node Cluster using shell script

I was working on setting up Hadoop on Ubuntu as a Single node cluster.

I came across a very nice blog about it here. (Must read to setup your single node cluster).
While I was at it, I was creating / Installing Hadoop multiple time in different system, then I though to create a script of my own, based on the blog above.

Here is the link to my script which is on GITHUB anyone interested can check-out and enhance the script.

https://github.com/zubayr/hadoopscript/blob/master/initScriptHadoop.sh

README : https://github.com/zubayr/hadoopscript/blob/master/README.txt

Requirement.

1. Hadoop 1.0.3

2. Ubuntu 10.04  or above (Tested on 11.04, 11.10 and 12.04 32bit platform)

Here is the details on how to install Hadoop using the script

Please Readme

- hadoop script to setup Single Node Cluster - For Hadoop 1.0.3 Only.

- Tested on Ubuntu 11.10, 12.04 - Fresh Install.

- Scripts assumes nothing is installed for Hadoop and installs Required Components for Hadoop to run.

- This Script was created using the Installation Guide by Micheal Noll.

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

 

Steps For Executing Script: Currently script only takes single option at a time :(

-----------------------------------------------------------------------

Execute Help

]$ sudo ./initScriptHadoop.sh --help

 

usage: ./initScriptHadoop.sh <single-parameter>

 

  Optional parameters:

     --install-init, -i Initialization script To Install Hadoop as Single Node Cluster.

     Use the below Options, Once you are logged-in as Hadoop User 'hduser' created in the -i init script above.

     --install-ssh, -s Install ssh-keygen -t rsa -P

     --install-bashrc, -b Updated '.bashrc' with JAVA_HOME, HADOOP_HOME.

     --ipv6-disable, -v IPv6 Support Disable.[ Might Not be required.

                                Updating 'conf/hadoop-env.sh' with

'HADOOP_OPTS=-Djava.net.preferIPv4Stack=true' option in -e]

     --hostname-update, -u Update Hostname for the system.

     --config-update, -c Update Configuration with default values

(Single Node) in core-site.xml, mapred-site.xml, hdfs-site.xml.

     --update-hadoop-env, -e Update Hadoop Env Script with JAVA_HOME.

     --help, -h Display this Message.

 

1. First Install prerequisites using -i Option

     ahmed@ahmed-on-Edge:~$ ./initScriptHadoop.sh -i

      Welcome to Precofiguration For Hadoop single node setup wizard

    

     Would you like install Java 1.6 ? (y/n) y

     Would you like to setup user 'hduser' and 'hadoop Group'? (y/n) y

     Would you like to download Hadoop 1.0.3 and extract to /usr/local? (y/n) y

     Would you like to make 'hduser' owner /usr/local/hadoop/ directory? (y/n) y

     Would you like to login into 'htuser' once done? (y/n) y

    

      Review your choices:

    

      Install Java 1.6 : y

      Setup 'hduser' user : y

      Download Hadoop 1.0.3 : y

      Setup 'hduser' as Owner : y

      Login to 'hduser' : y

    

     Proceed with setup? (y/n)y

 

During the installation it will ask for password for hduser

 

2. Login to 'hduser' (which will be created in the -i options).

3. Execute options -s, -b, -c, -e

   /initScriptHadoop.sh -s;

   /initScriptHadoop.sh -b;

   /initScriptHadoop.sh -c;

   /initScriptHadoop.sh -e;


 


Once you are done installing. Login as hduser.


First Format the node.



] $ cd /usr/local/hadoop


] $ hduser@ubuntu:/usr/local/hadoop$ bin/hadoop namenode -format


And format the node will generate output similar to this


hduser@ubuntu:/usr/local/hadoop$ bin/hadoop namenode -format
10/05/08 16:59:56 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = ubuntu/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/..
************************************************************/
10/05/08
16:59:56 INFO namenode.FSNamesystem: fsOwner=hduser,hadoop
10/05/08 16:59:56 INFO namenode.FSNamesystem: supergroup=supergroup
10/05/08 16:59:56 INFO namenode.FSNamesystem: isPermissionEnabled=true
10/05/08 16:59:56 INFO common.Storage: Image file of size 96 saved in 0 seconds.
10/05/08 16:59:57 INFO common.Storage: Storage . . . as been successfully formatted.
10/05/08 16:59:57 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/
hduser@ubuntu:/usr/local/hadoop$

Next lets start the Single node Cluster and then check if all is well with “jps” command.


hduser@ubuntu:/usr/local/hadoop$ bin/start-all.sh
starting namenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-namenode-ubuntu.out
localhost: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-datanode-ubuntu.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-secondarynamenode-ubuntu.out
starting jobtracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-jobtracker-ubuntu.out
localhost: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-tasktracker-ubuntu.out
hduser@ubuntu:/usr/local/hadoop$

hduser@ubuntu:/usr/local/hadoop$ jps
2287 TaskTracker
2149 JobTracker
1938 DataNode
2085 SecondaryNameNode
2349 Jps
1788 NameNode

This shows that all the processes are working fine. Now we are ready to run some good old word count MapReduce Code.

Comments

Popular posts from this blog

Zabbix History Table Clean Up

Zabbix history table gets really big, and if you are in a situation where you want to clean it up. Then we can do so, using the below steps. Stop zabbix server. Take table backup - just in case. Create a temporary table. Update the temporary table with data required, upto a specific date using epoch . Move old table to a different table name. Move updated (new temporary) table to original table which needs to be cleaned-up. Drop the old table. (Optional) Restart Zabbix Since this is not offical procedure, but it has worked for me so use it at your own risk. Here is another post which will help is reducing the size of history tables - http://zabbixzone.com/zabbix/history-and-trends/ Zabbix Version : Zabbix v2.4 Make sure MySql 5.1 is set with InnoDB as innodb_file_per_table=ON Step 1 Stop the Zabbix server sudo service zabbix-server stop Script. echo "------------------------------------------" echo " 1. Stopping Zabbix Server ...

Access Filter in SSSD `ldap_access_filter` [SSSD Access denied / Permission denied ]

Access Filter Setup with SSSD ldap_access_filter (string) If using access_provider = ldap , this option is mandatory. It specifies an LDAP search filter criteria that must be met for the user to be granted access on this host. If access_provider = ldap and this option is not set, it will result in all users being denied access. Use access_provider = allow to change this default behaviour. Example: access_provider = ldap ldap_access_filter = memberOf=cn=allowed_user_groups,ou=Groups,dc=example,dc=com Prerequisites yum install sssd Single LDAP Group Under domain/default in /etc/sssd/sssd.conf add: access_provider = ldap ldap_access_filter = memberOf=cn=Group Name,ou=Groups,dc=example,dc=com Multiple LDAP Groups Under domain/default in /etc/sssd/sssd.conf add: access_provider = ldap ldap_access_filter = (|(memberOf=cn=System Adminstrators,ou=Groups,dc=example,dc=com)(memberOf=cn=Database Users,ou=Groups,dc=example,dc=com)) ldap_access_filter accepts standa...

Installing Zabbix Version 2.4 Offline (Zabbix Server without Internet).

There might be situations where you have a remote/zabbix server which does not have internet connectivity, due to security or other reasons. So we create a custom repo on the remote/zabbix server so that we can install zabbix using rpms Here is how we are planning to do this. Download all the dependency rpms on a machine which has internet connection, using yum-downloadonly or repotrack . Transfer all the rpms to the remote server. Create a repo on the remote server. Update yum configuration. Install. NOTE: This method can be used to install any application, but here we have used zabbix as we had this requirement for a zabbix server. Download dependent rpms . On a machine which has internet connection install the package below. And download all the rpms . Make sure the system are similar (not required to be identical - At-least the OS should be of same version) mkdir /zabbix_rpms yum install yum-downloadonly Downloading all the rpms to location /zabbix_rpms/ ,...