
Installing Hadoop 1.0.3 on Ubuntu Single Node Cluster using shell script

I was working on setting up Hadoop on Ubuntu as a single node cluster.

I came across a very nice blog about it here (a must-read to set up your single node cluster).
While I was at it, I found myself installing Hadoop multiple times on different systems, so I decided to create a script of my own, based on the blog above.

Here is the link to my script on GitHub; anyone interested can check it out and enhance it.

https://github.com/zubayr/hadoopscript/blob/master/initScriptHadoop.sh

README : https://github.com/zubayr/hadoopscript/blob/master/README.txt
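To grab a local copy of the script, something like the following should work (the clone URL below is just the standard GitHub form of the repository above):

]$ git clone https://github.com/zubayr/hadoopscript.git
]$ cd hadoopscript
]$ chmod +x initScriptHadoop.sh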

Requirements

1. Hadoop 1.0.3

2. Ubuntu 10.04 or above (tested on 11.04, 11.10 and 12.04, 32-bit platform)

Here are the details on how to install Hadoop using the script.

Please read the following notes first:

- Hadoop script to set up a Single Node Cluster - for Hadoop 1.0.3 only.

- Tested on Ubuntu 11.10 and 12.04 - fresh install.

- The script assumes nothing Hadoop-related is installed and installs the required components for Hadoop to run.

- This script was created using the installation guide by Michael Noll:

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

 

Steps for executing the script. Currently the script only takes a single option at a time :(

-----------------------------------------------------------------------

Execute Help

]$ sudo ./initScriptHadoop.sh --help

usage: ./initScriptHadoop.sh <single-parameter>

  Optional parameters:
     --install-init, -i        Initialization script to install Hadoop as a Single Node Cluster.

     Use the options below once you are logged in as the Hadoop user 'hduser' created by the -i init script above.

     --install-ssh, -s         Install ssh-keygen -t rsa -P
     --install-bashrc, -b      Update '.bashrc' with JAVA_HOME, HADOOP_HOME.
     --ipv6-disable, -v        Disable IPv6 support. [Might not be required.
                               Updates 'conf/hadoop-env.sh' with the
                               'HADOOP_OPTS=-Djava.net.preferIPv4Stack=true' option, as in -e.]
     --hostname-update, -u     Update the hostname for the system.
     --config-update, -c       Update configuration with default values
                               (Single Node) in core-site.xml, mapred-site.xml, hdfs-site.xml.
     --update-hadoop-env, -e   Update the Hadoop env script with JAVA_HOME.
     --help, -h                Display this message.
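For reference, here is a rough sketch of what the -s, -b, -e and -v options boil down to, following the guide above. The Java path (/usr/lib/jvm/java-6-sun) and file locations are assumptions, so check the script itself for the exact commands:

# sketch only - not the exact script contents
# -s : passwordless SSH for 'hduser'
ssh-keygen -t rsa -P ""
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

# -b : environment variables in ~/.bashrc (paths assumed)
echo 'export HADOOP_HOME=/usr/local/hadoop'     >> ~/.bashrc
echo 'export JAVA_HOME=/usr/lib/jvm/java-6-sun' >> ~/.bashrc
echo 'export PATH=$PATH:$HADOOP_HOME/bin'       >> ~/.bashrc

# -e : point Hadoop at the JVM
echo 'export JAVA_HOME=/usr/lib/jvm/java-6-sun' >> /usr/local/hadoop/conf/hadoop-env.sh

# -v : prefer IPv4 (optional)
echo 'export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true' >> /usr/local/hadoop/conf/hadoop-env.sh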

 

1. First, install the prerequisites using the -i option:

     ahmed@ahmed-on-Edge:~$ ./initScriptHadoop.sh -i

      Welcome to the Preconfiguration For Hadoop single node setup wizard

      Would you like to install Java 1.6 ? (y/n) y
      Would you like to setup user 'hduser' and 'hadoop' group? (y/n) y
      Would you like to download Hadoop 1.0.3 and extract it to /usr/local? (y/n) y
      Would you like to make 'hduser' the owner of the /usr/local/hadoop/ directory? (y/n) y
      Would you like to login to 'hduser' once done? (y/n) y

      Review your choices:

      Install Java 1.6       : y
      Setup 'hduser' user    : y
      Download Hadoop 1.0.3  : y
      Setup 'hduser' as owner: y
      Login to 'hduser'      : y

      Proceed with setup? (y/n) y

 

During the installation it will ask for a password for 'hduser'.
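If you want to see (or verify) what the -i option automates, it is roughly equivalent to the following manual steps from the guide the script is based on; the Java package name and the download mirror are assumptions:

# sketch only - manual equivalent of the -i prerequisites
sudo apt-get install openjdk-6-jdk              # or whichever Java 6 package the script picks
sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser            # this is where the password prompt comes from

cd /usr/local
sudo wget http://archive.apache.org/dist/hadoop/core/hadoop-1.0.3/hadoop-1.0.3.tar.gz
sudo tar xzf hadoop-1.0.3.tar.gz
sudo mv hadoop-1.0.3 hadoop
sudo chown -R hduser:hadoop /usr/local/hadoop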

 

2. Log in as 'hduser' (created by the -i option above).

3. Execute the options -s, -b, -c and -e (a sketch of the configuration values that -c writes follows below):

   ./initScriptHadoop.sh -s
   ./initScriptHadoop.sh -b
   ./initScriptHadoop.sh -c
   ./initScriptHadoop.sh -e
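Of these, -c does the most work: it fills in the single-node defaults in the three configuration files. A rough sketch of what that amounts to, assuming the same values as the guide above (hadoop.tmp.dir, the 54310/54311 ports and a replication factor of 1):

# sketch only - assumed single-node defaults written by -c
sudo mkdir -p /app/hadoop/tmp
sudo chown hduser:hadoop /app/hadoop/tmp

# conf/core-site.xml
cat > /usr/local/hadoop/conf/core-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>
EOF

# conf/mapred-site.xml gets mapred.job.tracker = localhost:54311
# conf/hdfs-site.xml  gets dfs.replication = 1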


 


Once you are done installing, log in as 'hduser'.


First, format the NameNode.



] $ cd /usr/local/hadoop

] $ bin/hadoop namenode -format


Formatting the NameNode will generate output similar to this:


hduser@ubuntu:/usr/local/hadoop$ bin/hadoop namenode -format
10/05/08 16:59:56 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = ubuntu/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/..
************************************************************/
10/05/08 16:59:56 INFO namenode.FSNamesystem: fsOwner=hduser,hadoop
10/05/08 16:59:56 INFO namenode.FSNamesystem: supergroup=supergroup
10/05/08 16:59:56 INFO namenode.FSNamesystem: isPermissionEnabled=true
10/05/08 16:59:56 INFO common.Storage: Image file of size 96 saved in 0 seconds.
10/05/08 16:59:57 INFO common.Storage: Storage . . . as been successfully formatted.
10/05/08 16:59:57 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/
hduser@ubuntu:/usr/local/hadoop$

Next, let's start the single node cluster and then check that all is well with the “jps” command.


hduser@ubuntu:/usr/local/hadoop$ bin/start-all.sh
starting namenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-namenode-ubuntu.out
localhost: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-datanode-ubuntu.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-secondarynamenode-ubuntu.out
starting jobtracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-jobtracker-ubuntu.out
localhost: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-tasktracker-ubuntu.out
hduser@ubuntu:/usr/local/hadoop$

hduser@ubuntu:/usr/local/hadoop$ jps
2287 TaskTracker
2149 JobTracker
1938 DataNode
2085 SecondaryNameNode
2349 Jps
1788 NameNode

This shows that all the processes are running fine. Now we are ready to run the good old WordCount MapReduce example.
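A quick way to try that, assuming you have a few plain-text files in /tmp/gutenberg (the input directory and the HDFS paths are just examples):

hduser@ubuntu:/usr/local/hadoop$ bin/hadoop dfs -copyFromLocal /tmp/gutenberg /user/hduser/gutenberg
hduser@ubuntu:/usr/local/hadoop$ bin/hadoop jar hadoop-examples-1.0.3.jar wordcount /user/hduser/gutenberg /user/hduser/gutenberg-output
hduser@ubuntu:/usr/local/hadoop$ bin/hadoop dfs -cat /user/hduser/gutenberg-output/part-r-00000

When you are done, bin/stop-all.sh shuts all the daemons down again.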
