-
Hive-Hbase integration
Mohammad Tariq 2011-12-01, 12:19
Hello list,
Could anyone tell me the basic (and mandatory) requirements for integrating Hive and HBase? I have followed the "Hive HBase Integration" page on cwiki, but I have not been able to get it working. I need some urgent help.

Many thanks in advance.

Regards,
Mohammad Tariq
-
Hive-Hbase integration
Mohammad Tariq 2011-12-17, 18:02
Hello list,
I have a small demo table in HBase and I want to operate on it through Hive. Here is my table in HBase -

hbase(main):021:0> scan 'employee'
ROW     COLUMN+CELL
 emp1   column=address:, timestamp=1324119715536, value=#12-bangalore
 emp1   column=name:, timestamp=1324119698581, value=tariq
 emp1   column=no:, timestamp=1324119688511, value=001
 emp2   column=address:, timestamp=1324120893996, value=#13-bangalore
 emp2   column=name:, timestamp=1324120883612, value=vishal
 emp2   column=no:, timestamp=1324120866981, value=002
2 row(s) in 0.0260 seconds

I have 2 rows in the employee table, each corresponding to a particular user. And I have 3 column families (each having only 1 column) - no, name and address.

For this table I have created an external table in Hive using the following command -

hive> CREATE EXTERNAL TABLE employee(key string,no string,name string,address string)
    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > WITH SERDEPROPERTIES ("hbase.columns.mapping" "no:,name:,address:")
    > TBLPROPERTIES("hbase.table.name" = "employee");

But I am getting the following error -

FAILED: Error in metadata: java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException org.apache.hadoop.hive.hbase.HBaseSerDe: hbase column family 'no' should be mapped to Map<String,?> but is mapped to string)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

Could someone point out my mistake? Also, I would like to know whether the field "key" corresponds to each row in the HBase table, i.e. emp1 and emp2, or am I getting the concept wrong? I was going through the wiki, but could not find a proper explanation there. Sorry if my question seems childish.

Many thanks.

Regards,
Mohammad Tariq
-
Re: Hive-Hbase integration
Bejoy Ks 2011-12-17, 19:05
Hi Tariq
From the stack trace, I believe the issue is that you are providing only column families, with no qualifiers, in the hbase.columns.mapping. If you don't specify a qualifier for a column family, the Hive column is mapped to all the qualifiers in that HBase column family. So what happens here is that all the qualifiers of each column family are collected into a map, and that map is what Hive expects to store in the column. In your query you are mapping these maps to primitives, which results in the exception. The Hive wiki lists such a mapping as illegal, please refer to

https://cwiki.apache.org/Hive/hbaseintegration.html#HBaseIntegration-ColumnMapping
https://cwiki.apache.org/Hive/hbaseintegration.html#HBaseIntegration-Illegal%253AHivePrimitivetoHBaseColumnFamily

You can get your query working by changing the data type of the columns mapped to HBase column families. It is also better to add the key to your mapping:

CREATE EXTERNAL TABLE employee(key string,no map<string,string>,name map<string,string>,address map<string,string>)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,no:,name:,address:")
TBLPROPERTIES("hbase.table.name"= "employee");

For your second question: in HBase every row is uniquely identified by its row key, and with :key in the mapping we are mapping this row key to one of the Hive table columns. From your output, the two row key values in the HBase employee table are emp1 and emp2.

I believe your confusion comes from the HBase CLI output. In an RDBMS or Hive query we see one record per line, but in the HBase shell one line represents a single cell (one column of one column family), not an entire record. If your HBase table has 10 column families with one column each, a scan prints 10 lines for one record (by record in HBase I mean all the attributes corresponding to a row key). Here, since you have 3 column families with one column each, 3 lines together represent one record (the attributes of emp*).

Hope it helps!
Regards,
Bejoy.K.S
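A short sketch of how the reply above plays out in queries. With the family-only mapping, each Hive column is a map keyed by qualifier. Assuming the cells were written with an empty qualifier (the scan output shows `no:`, `name:` and `address:`), the values should be reachable under the empty-string key. The second statement shows the other option, mapping explicit qualifiers to primitives; the table `employee_q`, the family `info` and its qualifiers are hypothetical and do not appear in this thread.

```sql
-- Read single values out of the map-typed columns produced by the
-- family-only mapping (":key,no:,name:,address:").
-- Assumption: the qualifier is empty, so the map key is the empty string.
SELECT key, no[''], name[''], address['']
FROM employee;

-- Alternative layout (hypothetical table and family names): when each value
-- lives under an explicit family:qualifier pair, a Hive primitive column can
-- be mapped to it directly, and no map type is needed.
CREATE EXTERNAL TABLE employee_q(key string, no string, name string, address string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:no,info:name,info:address")
TBLPROPERTIES ("hbase.table.name" = "employee_q");
```

The design trade-off is that a family-only mapping exposes whatever qualifiers exist at read time, while an explicit family:qualifier mapping fixes the schema on the Hive side.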
-
Re: Hive-Hbase integration
Mohammad Tariq 2011-12-17, 20:39
Hi Bejoy,
Thank you so much for your help again. Your way of explaining things is really great, and the query you provided is working absolutely fine.

Regards,
Mohammad Tariq
-
Hive | HBase Integration
Savant, Keshav 2012-02-28, 12:28
Hi All,
We have successfully set up hadoop-0.20.203.0 and hive-0.7.1. As our next step we are looking at integrating HBase with Hive. As far as we understand from articles available on the internet and the Apache site, we can use HBase instead of Derby as the metastore of Hive, which gives us more flexibility when handling very large data.

We are using hbase-0.92.0 for the integration. So far HBase has been set up, and we can create a sample table in it and insert sample data, but we are not able to integrate it with Hive. When we issue the command to create a Hive table backed by HBase (shown below), the command does not execute completely: a new command line is shown with an asterisk (*), and the table does not get created.

CREATE TABLE hive_hbasetable_k(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "hivehbasek");

Please provide us with some pointers (steps to follow) for doing this integration, or tell us what we are doing incorrectly. So far we have found the URLs below; any help is appreciated.

http://mevivs.wordpress.com/2010/11/24/hivehbase-integration/
https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration

Kind regards,
Keshav C Savant
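A minimal sketch of a sanity check that could follow the statement above, not a diagnosis of the stalled prompt. It assumes the storage handler and HBase client jars are already visible to the Hive session, and the ZooKeeper host name `zk1.example.com` is a hypothetical placeholder.

```sql
-- Assumption: point this Hive session at the ZooKeeper quorum used by HBase.
-- The host name is a placeholder, not taken from this thread.
SET hbase.zookeeper.quorum=zk1.example.com;

-- Once the CREATE TABLE from the message above has completed, confirm that
-- the Hive-side schema is (key int, value string).
DESCRIBE hive_hbasetable_k;

-- Read a few rows back through the storage handler. An empty result is
-- expected until data has been written to the HBase table 'hivehbasek'.
SELECT * FROM hive_hbasetable_k LIMIT 5;
```

If `DESCRIBE` succeeds but the `SELECT` hangs, connectivity between Hive and the HBase cluster is a reasonable first thing to check.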