Nikhil Gupta 2009-08-19, 19:49
Alan Gates 2009-09-09, 20:34
Liu Xianglong 2009-09-10, 01:20
Nikhil Gupta 2009-09-10, 04:51
Alan Gates 2009-09-18, 22:10
Re: Storing Pig output into HBase tables
Vincent BARAT 2009-09-19, 17:13
Thanks Alan,
I also definitely need this functionality, and I plan to write it soon. I was actually in the process of doing what you explained, but I was blocked on the best way to specify the name of the HBase table in which to store the data (and also the associated storage schema) using the "store A into B using C;" paradigm.

Do you have any recommendation about that?

Alan Gates wrote:
> In order to store information in HBase, you will need to use an
> OutputFormat that is HBase compatible. There exists a TableOutputFormat
> in HBase that will write data. The trick is to get Pig to use that
> OutputFormat. It is possible, but Pig does not yet do a good job of
> making it easy.
>
> You will need to write a StoreFunc that returns TableOutputFormat from
> getStoragePreparationClass. You will then need to have the putNext call
> in StoreFunc write to TableOutputFormat's RecordWriter. For an example
> of how to do this, see
> contrib/zebra/src/java/org/apache/hadoop/zebra/pig/TableStorer.java in
> Pig's contrib directory.
>
> Alan.
>
> On Sep 9, 2009, at 6:20 PM, Liu Xianglong wrote:
>
>> Hi, Alan. I am interested in this store function; would you mind sending
>> me some details?
>>
>> --------------------------------------------------
>> From: "Alan Gates" <[EMAIL PROTECTED]>
>> Sent: Thursday, September 10, 2009 4:34 AM
>> To: <[EMAIL PROTECTED]>
>> Subject: Re: Storing Pig output into HBase tables
>>
>>> I do not know if there is a general hbase load/import tool. That
>>> would be a good question for the hbase-user list.
>>>
>>> Right now Pig does not have a store function to write data into
>>> hbase. It is possible to write such a function. If you are
>>> interested I can send you specific details on how to do it.
>>>
>>> Alan.
>>>
>>> On Aug 19, 2009, at 12:49 PM, Nikhil Gupta wrote:
>>>
>>>> Hi all,
>>>>
>>>> I am working on building an analytics kind of engine which takes
>>>> daily server
>>>> logs, crunches the data using Pig scripts and (for now) outputs
>>>> data to
>>>> HDFS. Later, this data is to be stored on HBase to enable efficient
>>>> querying
>>>> from the front-end.
>>>>
>>>> Currently, I am searching for efficient ways of moving the Pig
>>>> output on
>>>> HDFS to the HBase tables. Though this seems to be a very basic
>>>> task, I could
>>>> not find any easy way of doing that, except for writing some Java
>>>> code. The
>>>> problem is I'll have many different kinds of output formats, and
>>>> writing Java
>>>> code for loading each such file seems wrong. Probably I am missing
>>>> something.
>>>>
>>>> Is there any way of storing Pig output directly in an HBase table
>>>> [loading is
>>>> possible by HBaseStorage, but that doesn't talk of storing]. Or is
>>>> there any
>>>> general data load/import tool for HBase?
>>>>
>>>> Thanks!
>>>> Nikhil Gupta
>>>> Graduate Student,
>>>> Stanford University
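
On the question of where the table name and storage schema could go in "store A into B using C;", one possible convention (an assumption, not something Pig or HBase defines in this thread) is to put the table name in B as a URI-like string and pass the column list as a constructor argument of C. The sketch below uses only the Java standard library; `HBaseStoreSpec`, the `hbase://` prefix, and the space-separated `family:qualifier` list are all hypothetical names chosen for illustration. It also assumes the first tuple field is the row key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; not part of Pig or HBase. */
final class HBaseStoreSpec {
    // Assumed convention: the store location B looks like "hbase://<table>".
    private static final String SCHEME = "hbase://";

    private final String tableName;
    private final List<String> columns; // each entry is "family:qualifier"

    private HBaseStoreSpec(String tableName, List<String> columns) {
        this.tableName = tableName;
        this.columns = Collections.unmodifiableList(columns);
    }

    /** Parses the location (B) and the column list passed to the storer (C). */
    static HBaseStoreSpec parse(String location, String columnSpec) {
        if (location == null || !location.startsWith(SCHEME)
                || location.length() == SCHEME.length()) {
            throw new IllegalArgumentException("expected hbase://<table>, got: " + location);
        }
        List<String> cols = new ArrayList<>();
        for (String c : columnSpec.trim().split("\\s+")) {
            int colon = c.indexOf(':');
            if (colon <= 0 || colon == c.length() - 1) {
                throw new IllegalArgumentException("expected family:qualifier, got: " + c);
            }
            cols.add(c);
        }
        return new HBaseStoreSpec(location.substring(SCHEME.length()), cols);
    }

    String tableName() { return tableName; }

    List<String> columns() { return columns; }

    /** Assumption: field 0 of each tuple is the row key. */
    String rowKeyFor(List<?> tuple) {
        return String.valueOf(tuple.get(0));
    }

    /** Maps fields 1..n of the tuple onto the configured columns, in order. */
    Map<String, String> cellsFor(List<?> tuple) {
        if (tuple.size() != columns.size() + 1) {
            throw new IllegalArgumentException(
                "expected " + (columns.size() + 1) + " fields, got " + tuple.size());
        }
        Map<String, String> cells = new LinkedHashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            cells.put(columns.get(i), String.valueOf(tuple.get(i + 1)));
        }
        return cells;
    }
}
```

With such a helper, a script line might read `store A into 'hbase://daily_stats' using MyHBaseStorer('stats:hits stats:bytes');`, where `MyHBaseStorer` is a hypothetical StoreFunc whose putNext would turn each tuple into a row key plus cells and hand them to TableOutputFormat's RecordWriter as Alan describes. Whether Pig passes the location string through unchanged to the StoreFunc would need to be checked against the Pig version in use.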
Alan Gates 2009-09-21, 20:12