
Posts

Showing posts from January, 2014

Pig Installation on existing cloudera HDFS

Install Pig:

$ sudo apt-get install pig

To start the Grunt shell (MRv1), which connects Pig to the local HDFS:

$ export PIG_CONF_DIR=/usr/lib/pig/conf
$ export PIG_CLASSPATH=/usr/lib/hbase/hbase-0.94.2-cdh4.2.0-security.jar:/usr/lib/zookeeper/zookeeper-3.4.5-cdh4.2.0.jar
$ pig
2012-02-08 23:39:41,819 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/arvind/pig-0.9.2-cdh4b1/bin/pig_1328773181817.log
2012-02-08 23:39:41,994 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost/
...
grunt>

To start Pig in local mode, which runs Pig against the local filesystem instead of HDFS:

$ pig -x local
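A quick way to check that local mode works is to run a tiny word-count script. The sketch below is only an illustration: the file names, the temporary directory, and the sample text are assumptions, not part of the original setup. It writes a small Pig Latin script and runs it with `pig -x local` if `pig` is on the PATH.

```shell
#!/bin/bash
# Smoke test for a local-mode Pig install: count words in a tiny file.
# All paths and sample data here are hypothetical.
workdir=$(mktemp -d)
printf 'hello pig\nhello hdfs\n' > "$workdir/input.txt"

# Write a minimal Pig Latin word-count script.
cat > "$workdir/wordcount.pig" <<EOF
lines   = LOAD '$workdir/input.txt' AS (line:chararray);
words   = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
grouped = GROUP words BY word;
counts  = FOREACH grouped GENERATE group, COUNT(words);
DUMP counts;
EOF

# Run in local mode only if Pig is actually installed.
if command -v pig >/dev/null 2>&1; then
    pig -x local "$workdir/wordcount.pig"
else
    echo "pig not found on PATH; script written to $workdir/wordcount.pig"
fi
```

If the install is healthy, Pig should dump one tuple per distinct word. Dropping `-x local` would run the same script against HDFS, in which case the input file would need to be copied there first.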

Hbase Export script

A basic script to export HBase backups, both incremental and complete. NOTE: the import side is still in progress and needs testing. More information on HBase backups is available at http://hadoop-hbase.blogspot.in/2012/04/timestamp-consistent-backups-in-hbase.html and http://hbase.apache.org/book/ops_mgt.html

#!/bin/bash
#
# Text Formatting
#
BOLD="\033[1m"; NORM="\033[0m";
BLACK_F="\033[30m"; BLACK_B="\033[40m"
RED_F="\033[31m"; RED_B="\033[41m"
GREEN_F="\033[32m"; GREEN_B="\033[42m"
YELLOW_F="\033[33m"; YELLOW_B="\033[43m"
BLUE_F="\033[34m"; BLUE_B="\033[44m"
MAGENTA_F="\033[35m"; MAGENTA_B="\033[45m"
CYAN_F="\033[36m"; CYAN_B="\033[46m"
WHITE_F="\033[37m"; WHITE_B="\033[47m"

CURRENTTIME="$(date +'%Y%m%d%H%M')"
DATE="$(date +'%Y%m%d')"

# This should come from commandline TABLE_NAME
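The excerpt above is cut off before the export step, so the following is not the original script. It is a minimal sketch of how the complete and incremental cases might differ, assuming the stock `org.apache.hadoop.hbase.mapreduce.Export` tool, whose arguments are `<tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]` with times in epoch milliseconds. The function name `build_export_cmd`, the table name, and the output paths are hypothetical. The function only prints the command so it can be reviewed before anything is run.

```shell
#!/bin/bash
# Build (but do not run) an HBase Export command line.
# Usage: build_export_cmd TABLE OUTPUT_DIR [START_MS END_MS]
# Hypothetical helper; not taken from the original script.
build_export_cmd() {
    local table="$1" outdir="$2" start="$3" end="$4"
    local versions=1   # assumption: export only the latest version of each cell
    if [ -n "$start" ] && [ -n "$end" ]; then
        # Incremental: restrict the export to cells in the given time range.
        echo "hbase org.apache.hadoop.hbase.mapreduce.Export $table $outdir $versions $start $end"
    else
        # Complete: export the whole table.
        echo "hbase org.apache.hadoop.hbase.mapreduce.Export $table $outdir"
    fi
}

# Example: print both variants for a hypothetical table.
build_export_cmd mytable /backup/mytable/full
build_export_cmd mytable /backup/mytable/incr 1388534400000 1388620800000
```

Printing the command first makes it easy to log what a backup run would do. Once the output looks right, it can be executed directly or passed to `eval`.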

FTP using wget

Below is a small script snippet that collects data from an FTP server using wget.

# Getting data from FTP server now.
wget --ftp-user='username' --ftp-password='passwd' --no-passive-ftp ftp://ftpserver//FileName.zip -P "$FTP_DOWNLOAD_BASE_PATH/" -q

# Sample code below to do some FTP collection.
# Arguments are dates in YYYYMMDD form: start (inclusive) and end (exclusive).
currentdate=$1
loopenddate=$2
if [ "$currentdate" -lt "$loopenddate" ]
then
    until [ "$currentdate" -eq "$loopenddate" ]
    do
        echo "$currentdate:Started wget"
        # -m mirrors the whole directory for that date
        wget --ftp-user=username --ftp-password='passwd' -m --no-passive-ftp "ftp://ftpserver//$currentdate/" -P "/ftp/$currentdate" -q
        echo "$currentdate:Finished wget"
        # Advance by one day (GNU date)
        currentdate=$(/bin/date --date "$currentdate 1 day" +%Y%m%d)
    done
fi

The complete FTP-using-wget script will be on git. Will update soon.
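The date-stepping logic can be tried on its own before pointing it at a real server. The sketch below pulls it into a hypothetical helper, `date_range`, which prints each date from the start (inclusive) to the end (exclusive). It assumes GNU `date` for the `--date` option and uses made-up dates; the loop at the end is a dry run that prints what would be fetched instead of calling wget.

```shell
#!/bin/bash
# Print every date from start (inclusive) to end (exclusive), one per line,
# in YYYYMMDD form. Requires GNU date for the --date option.
# Hypothetical helper; not part of the original script.
date_range() {
    local current="$1" end="$2"
    while [ "$current" -lt "$end" ]; do
        echo "$current"
        current=$(date --date "$current 1 day" +%Y%m%d)
    done
}

# Dry run: show which directories would be mirrored.
for d in $(date_range 20140130 20140202); do
    echo "would fetch ftp://ftpserver//$d/ into /ftp/$d"
done
```

Using `-lt` as the loop condition means the loop simply does nothing when the start date is not before the end date, so the separate `if` guard is not needed in this form.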