Hive >> mail # user >> integration issue about hive and hbase


bejoy_ks@... 2013-07-08, 10:16
Re: integration issue about hive and hbase
I am attaching portions from a document I had written last year while investigating Hbase and Hive. You may have already crossed that bridge….nevertheless…

Please forgive me :-) if some steps seem hacky and are not very well explained… I was on a solo mission to build a Hive data platform from scratch, and QDBW (Quick and Dirty But Works) was my philosophy!!!

Good luck

Sanjay
================================================================
Hive and Hbase integration on local Fedora desktop guide<https://wizecommerce.atlassian.net/wiki/display/traffic/Hive+and+Hbase+integration+on+local+Fedora+desktop+guide>

Pre-requisites

  *   Hadoop needs to be installed and HDFS needs to be running (Hadoop HDFS setup on local Fedora desktop guide<https://wizecommerce.atlassian.net/wiki/display/traffic/Hadoop+HDFS+setup+on+local+Fedora+desktop+guide>)
  *   Hive needs to be installed (Hive setup on local Fedora desktop guide<https://wizecommerce.atlassian.net/wiki/display/traffic/Hive+setup+on+local+Fedora+desktop+guide>)
  *   HBase needs to be installed and running.(Hbase setup on local Fedora desktop guide<https://wizecommerce.atlassian.net/wiki/display/traffic/Hbase+setup+on+local+Fedora+desktop+guide>)
     *   Make sure ZooKeeper is running on port 2181. If not, stop HBase, change $HBASE_HOME/conf/hbase-site.xml, and restart HBase
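For reference, the ZooKeeper client port is controlled by the hbase.zookeeper.property.clientPort property. A minimal hbase-site.xml fragment pinning it to the default port 2181 would look like this (the property name is standard HBase configuration; adjust the value if your ZooKeeper listens elsewhere):

```xml
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
```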

Copying JARS to HADOOP_CLASSPATH

Before you query tables, copy these JARs from $HIVE_HOME/lib to $HADOOP_HOME/lib:

  1.  Check that zookeeper-3.4.3.jar is not already there
     *   ls -latr   $HADOOP_HOME/lib/zookeeper-3.4.3.jar
  2.  Copy zookeeper-3.4.3.jar
     *   sudo cp -av $HIVE_HOME/lib/zookeeper-3.4.3.jar $HADOOP_HOME/lib
  3.  Check that hive-common-0.9.0.jar is not already there
     *   ls -latr   $HADOOP_HOME/lib/hive-common-0.9.0.jar
  4.  Copy hive-common-0.9.0.jar
     *   sudo cp -av $HIVE_HOME/lib/hive-common-0.9.0.jar $HADOOP_HOME/lib
  5.  Check that hive-hbase-handler-0.9.0.jar is not already there
     *   ls -latr   $HADOOP_HOME/lib/hive-hbase-handler-0.9.0.jar
  6.  Copy hive-hbase-handler-0.9.0.jar
     *   sudo cp -av $HIVE_HOME/lib/hive-hbase-handler-0.9.0.jar $HADOOP_HOME/lib
  7.  Exit from Hive Shell (type exit;)

  8.  Exit from HBase shell

  9.  Stop Hbase
     *   $HBASE_HOME/bin/stop-hbase.sh
  10. Stop Hadoop/HDFS
     *   $HADOOP_HOME/bin/stop-all.sh
  11. Check that NO Java processes related to Hadoop/HDFS/HBase/Hive remain
     *    ps auxw | grep java
  12. Start Hadoop/HDFS
     *   $HADOOP_HOME/bin/start-all.sh
  13. Start Hbase
     *   $HBASE_HOME/bin/start-hbase.sh
  14. Check that ALL Java processes related to Hadoop/HDFS/HBase/Hive are running
     *   ps auxw | grep java

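The copy steps above can be sketched as a small loop. This is a hypothetical dry-run version (it only prints the commands; remove the final `echo` to actually copy, and note the jar versions are the Hive 0.9.0 / ZooKeeper 3.4.3 ones used in this guide):

```shell
# Print the copy command for each required jar, skipping jars already
# present in $HADOOP_HOME/lib. Assumes HIVE_HOME and HADOOP_HOME are set.
for jar in zookeeper-3.4.3.jar hive-common-0.9.0.jar hive-hbase-handler-0.9.0.jar; do
  if [ -f "$HADOOP_HOME/lib/$jar" ]; then
    echo "already present: $HADOOP_HOME/lib/$jar"
  else
    echo sudo cp -av "$HIVE_HOME/lib/$jar" "$HADOOP_HOME/lib"
  fi
done
```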
Create tables in HBase

  *   Refer to the Hbase setup on local Fedora desktop guide<https://wizecommerce.atlassian.net/wiki/display/traffic/Hbase+setup+on+local+Fedora+desktop+guide> and create the tables mentioned there:
     *   hbase_2_hive_food
     *   hbase_2_hive_names
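If you don't have that guide handy, the two tables can be created in the HBase shell along these lines. The column families shown here are inferred from the hbase.columns.mapping strings used in the Hive DDL later in this guide, so treat them as an assumption, not the canonical schema:

```
hbase> create 'hbase_2_hive_names', 'id', 'name', 'age'
hbase> create 'hbase_2_hive_food', 'id', 'name'
```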

Create tables in HIVE

To run Hive type

$HIVE_HOME/bin/hive

This will take you to the Hive shell. In the shell, create these two tables:

  *   CREATE EXTERNAL TABLE hbase_hive_names(hbid INT, id INT,  fn STRING, ln STRING, age INT) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,id:id,name:fn,name:ln,age:age") TBLPROPERTIES("hbase.table.name" = "hbase_2_hive_names");
     *   This HIVE table will map to Hbase table hbase_2_hive_names
  *   CREATE EXTERNAL TABLE hbase_hive_food(hbid INT, id INT,  name STRING) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,id:id,name:name") TBLPROPERTIES("hbase.table.name" = "hbase_2_hive_food");
     *   This HIVE table will map to Hbase table hbase_2_hive_food
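To read the hbase.columns.mapping string: its comma-separated entries pair up, in order, with the Hive columns, where :key denotes the HBase row key and the other entries are family:qualifier pairs. For the names table above, the correspondence is:

```
:key     -> hbid  (the HBase row key)
id:id    -> id    (column family "id",   qualifier "id")
name:fn  -> fn    (column family "name", qualifier "fn")
name:ln  -> ln    (column family "name", qualifier "ln")
age:age  -> age   (column family "age",  qualifier "age")
```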

Creating & Loading tables in HBase through Hive

  *   Make sure there is no table in Hbase called 'hive2hbase_names_table'
  *   In Hive shell
     *   CREATE TABLE hive2hbase_names_table (hb_id int, fn string, ln string, age_dnq INT) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,student:fn,student:ln,student:age") TBLPROPERTIES ("hbase.table.name" = "hive2hbase_names_table") ;
  *   Go to HBase shell
     *   check that table hive2hbase_names_table is created.
  *   In Hive Shell
     *   Create a Hive table and populate it with data, which we will then use to populate the Hive-HBase table
     *   CREATE TABLE names_tab (hb_id int, fn string, ln string, age_dnq INT) PARTITIONED BY (age INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
     *   LOAD DATA LOCAL INPATH '/data/mycode/impressions/inputfiles/names1.tsv.4fields' OVERWRITE INTO TABLE names_tab PARTITION (age=60);
     *   LOAD DATA LOCAL INPATH '/data/mycode/impressions/inputfiles/names2.tsv.4fields' OVERWRITE INTO TABLE names_tab PARTITION (age=70);
     *   INSERT OVERWRITE TABLE hive2hbase_names_table SELECT hb_id, fn, ln, age_dnq FROM names_tab WHERE age=60;
        *   The data files will look like this (separated by "\t")

1 paul simon 60
2 paul mccartney 60
3 paul anka 60

     *   INSERT OVERWRITE TABLE hive2hbase_names_table SELECT hb_id, fn, ln, age_dnq FROM names_tab WHERE age=70;
        *   The data files will look like this (separated by "\t")

4 brian may 70
5 george harrison 70
6 john glover 70

     *   Now you can query in the Hive shell
        *   select * from hive2hbase_names_table;
        *   select * from hive2hbase_names_table where age_dnq=60;
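For completeness, the two tab-separated input files shown above could be generated like this. The /tmp paths are stand-ins for illustration; the guide's actual paths under /data/mycode/impressions/inputfiles are unchanged:

```shell
# Create the two 4-field, tab-separated files used by LOAD DATA above.
printf '1\tpaul\tsimon\t60\n2\tpaul\tmccartney\t60\n3\tpaul\tanka\t60\n' > /tmp/names1.tsv.4fields
printf '4\tbrian\tmay\t70\n5\tgeorge\tharrison\t70\n6\tjohn\tglover\t70\n' > /tmp/names2.tsv.4fields

# Sanity check: every line should have exactly 4 tab-separated fields.
awk -F'\t' '{print NF}' /tmp/names1.tsv.4fields
```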

To delete the HBase table

  *   Go to the HBase shell.
     *   Disable 'hive2hbase_names_table'
  *   Go to Hive shell.
     *   Drop table 'hive2hbase_names_table'  (This deletes the table from Hbase)
To check, go to the HBase shell

  *   you will see that the table no longer exists
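Put together, the disable-then-drop sequence looks like this (table name taken from the steps above; the `list` at the end is just to confirm the table is gone):

```
hbase> disable 'hive2hbase_names_table'
hive>  DROP TABLE hive2hbase_names_table;
hbase> list
```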