HBase, mail # user - adding data


Re: adding data
anil gupta 2012-08-04, 05:39
Hi Rita,

The HBase bulk loader is a viable way to load a data set this large. Even
if your import file uses a separator other than tab, you can use ImportTsv
as long as the separator is a single character. If you want to apply your
own business logic while writing the data to HBase, you can write your own
mapper class and use it with the bulk loader, so the bulk loader can be
customized heavily to fit your needs.
These links might be helpful for you:
http://hbase.apache.org/book.html#arch.bulk.load
http://bigdatanoob.blogspot.com/2012/03/bulk-load-csv-file-into-hbase.html
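
To make the custom-mapper idea a bit more concrete, here is a minimal sketch
of the per-line logic such a mapper might contain. It uses plain Java only,
so it runs without a cluster. The class name KeyValueLineParser, the '|'
separator and the "trim the value" rule are all made up for illustration;
they are not part of HBase or ImportTsv. In a real job this logic would sit
inside the mapper's map() method, and the result would be wrapped in a Put
before being emitted (not shown here).

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical helper, not an HBase class: splits one input line into
// (row key, value) at the first occurrence of a single-character separator.
public class KeyValueLineParser {

    // Returns null for lines the mapper should skip (null input, no
    // separator, or an empty row key).
    public static Map.Entry<String, String> parse(String line, char separator) {
        if (line == null) {
            return null;
        }
        int idx = line.indexOf(separator);
        if (idx <= 0) {
            // idx == -1: separator missing; idx == 0: row key is empty
            return null;
        }
        String rowKey = line.substring(0, idx);
        String value = line.substring(idx + 1);
        // Placeholder for business logic (an assumption): trim the value
        return new SimpleImmutableEntry<>(rowKey, value.trim());
    }

    public static void main(String[] args) {
        Map.Entry<String, String> kv = parse("row1|hello ", '|');
        System.out.println(kv.getKey() + " -> " + kv.getValue());
    }
}
```

Splitting at the first separator only means the value itself may contain the
separator character; whether that is what you want depends on your data.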

HTH,
Anil Gupta

On Fri, Aug 3, 2012 at 9:54 PM, Bijeet Singh <[EMAIL PROTECTED]> wrote:

> Well, if the file that you have contains TSV, you can directly use the
> ImportTsv utility of HBase to do a bulk load.
> More details can be found here:
>
> http://hbase.apache.org/book/ops_mgt.html#importtsv
>
> The other option is to run an MR job on your file to generate HFiles,
> which you can later load into HBase using completebulkload. HFiles are
> created using the HFileOutputFormat class. The output of the map should
> be Put or KeyValue. For the reduce side you need to use
> configureIncrementalLoad, which sets up the reduce tasks.
>
> Bijeet
>
>
> On Sat, Aug 4, 2012 at 8:13 AM, Rita <[EMAIL PROTECTED]> wrote:
>
> > I have a file with 13 billion rows of keys and values that I would
> > like to place in HBase. I was wondering if anyone has a good MapReduce
> > example for this kind of work.
> >
> >
> > tia
> >
> >
> > --
> > --- Get your facts first, then you can distort them as you please.--
> >
>

--
Thanks & Regards,
Anil Gupta