Here are notes from my technical journey installing pseudo-distributed HBase on Hadoop. I use Hadoop 0.20.205 (hadoop-0.20.205.0.tar.gz) and HBase 0.90 (hbase-0.90.5.tar.gz). You can download them from http://hadoop.apache.org/common/releases.html and http://www.apache.org/dyn/closer.cgi/hbase/
Prerequisites:
You need pseudo-distributed Hadoop installed and working first. In my case, I installed Hadoop at /usr/local/hadoop20 and put it on the PATH:
export HADOOP_HOME=/usr/local/hadoop20
export PATH=$PATH:$HADOOP_HOME/bin
Steps to Install HBase
1. Download hbase-0.90.5.tar.gz, place it at /usr/local, and untar it there:
cd /usr/local
tar xvf hbase-0.90.5.tar.gz
ln -s hbase-0.90.5 hbase
2. Configure network settings:
Run the command:
lei:lei$ ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
inet 127.0.0.1 netmask 0xff000000
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
stf0: flags=0<> mtu 1280
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
ether 00:25:4c:e3:b9:7d
media: autoselect
status: inactive
en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
ether 00:25:00:4d:74:9f
inet6 fe80::225:ff:fe4c:749e%en1 prefixlen 64 scopeid 0x5
inet 192.168.5.62 netmask 0xffffff00 broadcast 192.168.5.255
media: autoselect
status: active
fw0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 4078
lladdr 00:25:4c:ff:fe:e3:b9:7d
media: autoselect <full-duplex>
status: inactive
Here is the content of my /etc/hosts:
$ cat /etc/hosts
127.0.0.1 lei.hadoop.local hbase localhost lei
192.168.5.62 home.lei.local
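Before moving on, it is worth confirming that these mappings resolve the way you expect. Here is a small sketch (my own helper, not from the original setup) that looks up a hostname in an inlined copy of the /etc/hosts entries above; on a real machine you would simply ping the hostnames instead:

```shell
# Inlined copy of the /etc/hosts entries above, so the sketch runs standalone.
hosts_entries() {
cat <<'EOF'
127.0.0.1 lei.hadoop.local hbase localhost lei
192.168.5.62 home.lei.local
EOF
}

# Print the IP mapped to a hostname: scan each line's hostname fields
# (field 2 onward) and emit the first column (the IP) on a match.
resolve() {
  hosts_entries | awk -v h="$1" '{ for (i = 2; i <= NF; i++) if ($i == h) { print $1; exit } }'
}

resolve lei.hadoop.local   # prints 127.0.0.1
resolve home.lei.local     # prints 192.168.5.62
```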
3. Configure HBase:
$ vi conf/hbase-env.sh
Add the following line:
export JAVA_HOME=/Library/Java/Home
$ vi conf/hbase-site.xml
Add the following content:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hbase.zookeeper.quorum</name>
<value>home.lei.local</value>
<description></description>
</property>
<property>
<name>hbase.regionserver.dns.nameserver</name>
<value>lei.hadoop.local</value>
<description></description>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/tmp/zookeeper</value>
<description></description>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://lei.hadoop.local:9000/hbase2</value>
<description></description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description></description>
</property>
<property>
<name>hbase.master</name>
<value>lei.hadoop.local:60000</value>
<description></description>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
<description></description>
</property>
</configuration>
The hbase.rootdir entry needs to match fs.default.name from Hadoop's core-site.xml. Here are my settings:
lei:conf lei$ cat ../../hadoop20/conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://lei.hadoop.local:9000</value>
<description></description>
</property>
</configuration>
lei$ cat regionservers
lei.hadoop.local
home.lei.local
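Since a mismatch between hbase.rootdir and fs.default.name is an easy mistake to make, here is a hedged sketch that checks the two values agree. The XML fragments are inlined copies of the properties above so the script runs standalone; the get_prop helper is my own naive extractor and assumes the flat one-tag-per-line layout shown in these files:

```shell
# Inlined copies of the relevant properties from the two config files above.
hbase_site() {
cat <<'EOF'
<property>
<name>hbase.rootdir</name>
<value>hdfs://lei.hadoop.local:9000/hbase2</value>
</property>
EOF
}
core_site() {
cat <<'EOF'
<property>
<name>fs.default.name</name>
<value>hdfs://lei.hadoop.local:9000</value>
</property>
EOF
}

# Print the <value> on the line following a matching <name> (flat layout only).
get_prop() {
  sed -n "/<name>$1<\/name>/{n;s/.*<value>\(.*\)<\/value>.*/\1/p;}"
}

rootdir=$(hbase_site | get_prop hbase.rootdir)
fsname=$(core_site | get_prop fs.default.name)

# hbase.rootdir must live directly under the fs.default.name URI.
case "$rootdir" in
  "$fsname"/*) echo "OK: hbase.rootdir matches fs.default.name" ;;
  *)           echo "MISMATCH: $rootdir vs $fsname" ;;
esac
```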
Start and Run Hadoop and HBase:
1. Start Hadoop:
lei:lei$ start-all.sh
lei:lei$ hadoop fs -ls
lei:lei$ hadoop fs -mkdir /hbase2
lei:lei$ jps
84756 NameNode
84926 SecondaryNameNode
1412
85095 Jps
84841 DataNode
lei:lei$
Note: you also need to copy the *.jar files under hadoop20/share/hadoop/lib/ into hbase/lib, so that HBase runs against the same Hadoop JAR versions as the cluster it talks to; mismatched JARs can prevent the daemons from starting.
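The JAR copy can be sketched as below. To keep the example safe to run anywhere, it performs the copy in a throwaway sandbox with stand-in JAR names; for the real install you would point HADOOP_LIB at /usr/local/hadoop20/share/hadoop/lib and HBASE_LIB at /usr/local/hbase/lib instead:

```shell
# Sandbox stand-ins for the real directories (assumption: swap in the real
# paths from this post when doing the copy for real).
sandbox=$(mktemp -d)
HADOOP_LIB="$sandbox/hadoop/lib"
HBASE_LIB="$sandbox/hbase/lib"
mkdir -p "$HADOOP_LIB" "$HBASE_LIB"
touch "$HADOOP_LIB/hadoop-core-0.20.205.0.jar" \
      "$HADOOP_LIB/commons-configuration-1.6.jar"

# The actual step: copy every Hadoop JAR into HBase's lib directory so the
# HBase-side Hadoop classes match the running cluster's version.
cp "$HADOOP_LIB"/*.jar "$HBASE_LIB"/

ls "$HBASE_LIB"   # both JARs are now in HBase's lib directory
```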
2. Start HBase:
lei:hbase lei$ ./bin/start-hbase.sh
lei:conf lei$ jps
84756 NameNode
84926 SecondaryNameNode
1412
85439 Jps
85388 HRegionServer
85268 HQuorumPeer
84841 DataNode
85298 HMaster
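A quick way to confirm everything came up is to check that each expected daemon appears in the jps listing. The sketch below (my own helper, not from the original setup) runs the check against an inlined copy of the output above; in practice you would pipe live jps output in directly:

```shell
# Inlined copy of the jps listing above, so the check runs standalone.
jps_output() {
cat <<'EOF'
84756 NameNode
84926 SecondaryNameNode
85388 HRegionServer
85268 HQuorumPeer
84841 DataNode
85298 HMaster
EOF
}

# Collect any expected daemon that is absent from the listing.
# grep -w matches whole words, so "NameNode" does not falsely match
# inside "SecondaryNameNode".
missing=""
for d in NameNode DataNode HMaster HRegionServer HQuorumPeer; do
  jps_output | grep -qw "$d" || missing="$missing $d"
done

if [ -z "$missing" ]; then
  echo "all daemons up"
else
  echo "missing:$missing"
fi
```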
3. Verify HBase:
lei:hbase lei$ bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.90.5, r1212209, Fri Dec 9 05:40:36 UTC 2011
hbase(main):001:0> status
1 servers, 0 dead, 2.0000 average load
hbase(main):002:0> create 'test', 'cf'
0 row(s) in 3.9240 seconds
hbase(main):003:0> list 'test'
TABLE
test
1 row(s) in 0.0820 seconds
hbase(main):004:0> put 'test', 'row1', 'cf:a', 'value1'
0 row(s) in 0.3320 seconds
hbase(main):005:0> put 'test', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0490 seconds
hbase(main):006:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf:a, timestamp=1328987613037, value=value1
row2 column=cf:b, timestamp=1328987619457, value=value2
2 row(s) in 0.1020 seconds
hbase(main):007:0> disable 'test'
0 row(s) in 2.0980 seconds
hbase(main):008:0> drop 'test'
0 row(s) in 1.2400 seconds
hbase(main):009:0> exit
lei:hbase lei$
4. Stop HBase and Hadoop:
lei:hbase lei$ ./bin/stop-hbase.sh
stopping hbase.....
home.lei.local: stopping zookeeper.
lei:hbase lei$
lei:hbase lei$ stop-all.sh
no jobtracker to stop
lei.hadoop.local: no tasktracker to stop
stopping namenode
lei.hadoop.local: stopping datanode
lei.hadoop.local: stopping secondarynamenode
lei:hbase lei$
You can find a useful troubleshooting link here: http://wiki.apache.org/hadoop/Hbase/Troubleshooting
Enjoy.
Comments:

"Thank you. The guidelines are perfectly apt."

"I ended up fixing a few errors. First, I was not able to start the HMaster; it was a permission issue, and I had to run chmod -R 777 on my HBase directory. Second, there is a need to copy the Hadoop core JAR and all the other Hadoop JARs into the HBase lib directory. Many of the forums have emphasized the need for maintaining compatibility between the JARs. Thanks a lot!"

"The above setup worked for me at home. However, when I went to university, the region servers were not working. Digging in, I realized I also had to set up forward and reverse DNS resolution for the domains specified. How to? See https://help.ubuntu.com/community/BIND9ServerHowto"