Hive, mail # user - integration issue about hive and hbase


Re: integration issue about hive and hbase
Sanjay Subramanian 2013-07-09, 21:29
I am attaching portions from a document I wrote last year while investigating HBase and Hive. You may have already crossed that bridge… nevertheless…

Please forgive me :-) if some steps seem hacky and are not very well explained… I was on a solo mission to build a Hive data platform from scratch, and QDBW (Quick and Dirty But Works) was my philosophy!!!

Good luck

Sanjay
================================================================================
Hive and Hbase integration on local Fedora desktop guide<https://wizecommerce.atlassian.net/wiki/display/traffic/Hive+and+Hbase+integration+on+local+Fedora+desktop+guide>

Pre-requisites

  *   Hadoop needs to be installed and HDFS needs to be running (Hadoop HDFS setup on local Fedora desktop guide<https://wizecommerce.atlassian.net/wiki/display/traffic/Hadoop+HDFS+setup+on+local+Fedora+desktop+guide>)
  *   Hive needs to be installed (Hive setup on local Fedora desktop guide<https://wizecommerce.atlassian.net/wiki/display/traffic/Hive+setup+on+local+Fedora+desktop+guide>)
  *   HBase needs to be installed and running (Hbase setup on local Fedora desktop guide<https://wizecommerce.atlassian.net/wiki/display/traffic/Hbase+setup+on+local+Fedora+desktop+guide>)
     *   Make sure ZooKeeper is running on port 2181. If not, stop HBase, change $HBASE_HOME/conf/hbase-site.xml, and restart HBase
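For reference, the ZooKeeper client port is controlled by this standard HBase property in $HBASE_HOME/conf/hbase-site.xml (a sketch; 2181 is the default port this guide assumes):

```xml
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
```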

Copying JARS to HADOOP_CLASSPATH

Before you query tables, copy these jars from $HIVE_HOME/lib to $HADOOP_HOME/lib:

  1.  Make sure zookeeper-3.4.3.jar is not already there
     *   ls -latr   $HADOOP_HOME/lib/zookeeper-3.4.3.jar
  2.  Copy zookeeper-3.4.3.jar
     *   sudo cp -av $HIVE_HOME/lib/zookeeper-3.4.3.jar $HADOOP_HOME/lib
  3.  Make sure hive-common-0.9.0.jar is not already there
     *   ls -latr   $HADOOP_HOME/lib/hive-common-0.9.0.jar
  4.  Copy hive-common-0.9.0.jar
     *   sudo cp -av $HIVE_HOME/lib/hive-common-0.9.0.jar $HADOOP_HOME/lib
  5.  Make sure hive-hbase-handler-0.9.0.jar is not already there
     *   ls -latr   $HADOOP_HOME/lib/hive-hbase-handler-0.9.0.jar
  6.  Copy hive-hbase-handler-0.9.0.jar
     *   sudo cp -av $HIVE_HOME/lib/hive-hbase-handler-0.9.0.jar $HADOOP_HOME/lib
  7.  Exit from the Hive shell (type exit;)
  8.  Exit from the HBase shell
  9.  Stop HBase
     *   $HBASE_HOME/bin/stop-hbase.sh
  10. Stop Hadoop/HDFS
     *   $HADOOP_HOME/bin/stop-all.sh
  11. Check that no Java processes related to Hadoop/HDFS/HBase/Hive remain
     *   ps auxw | grep java
  12. Start Hadoop/HDFS
     *   $HADOOP_HOME/bin/start-all.sh
  13. Start HBase
     *   $HBASE_HOME/bin/start-hbase.sh
  14. Check that all Java processes related to Hadoop/HDFS/HBase/Hive are running
     *   ps auxw | grep java
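The check-and-copy steps (1-6) above can be sketched as a single shell loop. This demo uses throwaway temp directories in place of $HIVE_HOME/lib and $HADOOP_HOME/lib so it runs anywhere; substitute the real paths (and sudo) on your machine:

```shell
#!/bin/sh
# Hypothetical stand-ins for $HIVE_HOME/lib and $HADOOP_HOME/lib.
hive_lib=$(mktemp -d)
hadoop_lib=$(mktemp -d)

# Pretend the Hive distribution ships these jars (versions from this guide).
touch "$hive_lib/zookeeper-3.4.3.jar" \
      "$hive_lib/hive-common-0.9.0.jar" \
      "$hive_lib/hive-hbase-handler-0.9.0.jar"

# Copy each jar only if it is not already present on the Hadoop side.
for jar in zookeeper-3.4.3.jar hive-common-0.9.0.jar hive-hbase-handler-0.9.0.jar; do
  if [ ! -e "$hadoop_lib/$jar" ]; then
    cp -v "$hive_lib/$jar" "$hadoop_lib/"
  fi
done

ls "$hadoop_lib"
```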

Create tables in HBase

  *   Refer to the Hbase setup on local Fedora desktop guide<https://wizecommerce.atlassian.net/wiki/display/traffic/Hbase+setup+on+local+Fedora+desktop+guide> and create the tables mentioned there:
     *   hbase_2_hive_food
     *   hbase_2_hive_names
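The linked guide isn't reproduced here, but judging from the column mappings in the Hive DDL later in this document, the two HBase tables would be created roughly like this in the HBase shell (the column-family names id, name, age are inferred from those mappings, not confirmed by the source):

```
create 'hbase_2_hive_names', 'id', 'name', 'age'
create 'hbase_2_hive_food', 'id', 'name'
```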

Create tables in HIVE

To run Hive type

$HIVE_HOME/bin/hive

This will take you to the Hive shell. In the shell, create these two tables:

  *   CREATE EXTERNAL TABLE hbase_hive_names(hbid INT, id INT, fn STRING, ln STRING, age INT) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,id:id,name:fn,name:ln,age:age") TBLPROPERTIES("hbase.table.name" = "hbase_2_hive_names");
     *   This Hive table will map to HBase table hbase_2_hive_names
  *   CREATE EXTERNAL TABLE hbase_hive_food(hbid INT, id INT, name STRING) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,id:id,name:name") TBLPROPERTIES("hbase.table.name" = "hbase_2_hive_food");
     *   This Hive table will map to HBase table hbase_2_hive_food
Creating & Loading tables in HBase through Hive
  *   Make sure there is no table in HBase called 'hive2hbase_names_table'
  *   In Hive shell
     *   CREATE TABLE hive2hbase_names_table (hb_id int, fn string, ln string, age_dnq INT) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,student:fn,student:ln,student:age") TBLPROPERTIES ("hbase.table.name" = "hive2hbase_names_table") ;
  *   Go to the HBase shell
     *   Check that table hive2hbase_names_table has been created
  *   In the Hive shell
     *   Create a Hive table and populate it with data, which we will then use to populate the Hive-HBase table
     *   CREATE TABLE names_tab (hb_id int, fn string, ln string, age_dnq INT) PARTITIONED BY (age INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
     *   LOAD DATA LOCAL INPATH '/data/mycode/impressions/inputfiles/names1.tsv.4fields' OVERWRITE INTO TABLE names_tab PARTITION (age=60);
     *   LOAD DATA LOCAL INPATH '/data/mycode/impressions/inputfiles/names2.tsv.4fields' OVERWRITE INTO TABLE names_tab PARTITION (age=70);
     *   INSERT OVERWRITE TABLE hive2hbase_names_table SELECT hb_id, fn, ln, age_dnq FROM names_tab WHERE age=60;
        *   The data files will look like this (separated by "\t")

1 paul simon 60
2 paul mccartney 60
3 paul anka 60

     *   INSERT OVERWRITE TABLE hive2hbase_names_table SELECT hb_id, fn, ln, age_dnq FROM names_tab WHERE age=70;
        *   The data files will look like this (separated by "\t")

4 brian may 70
5 george harrison 70
6 john glover 70

     *   Now you can query in the Hive shell
        *   select * from hive2hbase_names_table;
        *   select * from hive2hbase_names_table where age_dnq=60;
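The sample rows above are tab-separated on disk even though they render with spaces here. A small sketch for producing a file in the expected 4-field format (the filename follows the guide's naming; printf with \t writes real tab characters):

```shell
#!/bin/sh
# Write a names1.tsv.4fields-style file: hb_id, fn, ln, age_dnq, tab-separated.
out=names1.tsv.4fields
printf '%s\t%s\t%s\t%s\n' 1 paul simon     60  > "$out"
printf '%s\t%s\t%s\t%s\n' 2 paul mccartney 60 >> "$out"
printf '%s\t%s\t%s\t%s\n' 3 paul anka      60 >> "$out"

# Sanity check: every line must have exactly 4 tab-separated fields,
# or LOAD DATA will fill the missing columns with NULLs.
awk -F'\t' 'NF != 4 { exit 1 }' "$out" && echo "OK: 4 fields per line"
```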

To delete the HBase table

  *   Go to the HBase shell
     *   Disable 'hive2hbase_names_table'
  *   Go to the Hive shell
     *   Drop table 'hive2hbase_names_table' (this deletes the table from HBase)
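Concretely, the delete sequence above looks like this (HBase shell first, then Hive shell; two separate sessions, not one script):

```
# In the HBase shell:
disable 'hive2hbase_names_table'

# In the Hive shell:
DROP TABLE hive2hbase_names_table;
```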
To check, go to the HBase shell

  *   You will see that the table 'hive2hbase_names_table' no longer exists