Creating a Table using HFileOutputFormat
Hi,

We are trying to create an HBase table from scratch using map-reduce and
HFileOutputFormat. However, we haven't really found examples or tutorials
on how to do this, and there are some aspects which are still unclear to
us. We are using HBase 0.20.x.

First, what is the correct way to use HFileOutputFormat and to create the
HFiles?
We are simply using a map function which outputs <ImmutableBytesWritable
(key), Put (value)> pairs, an identity reducer, and we configure the job
to use HFileOutputFormat as its output format class.
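
To make this concrete, here is roughly what our job looks like at the
moment (stripped down; the input parsing in the mapper is only a
placeholder, and the table/column names are made up — the classes are the
ones from org.apache.hadoop.hbase.mapreduce):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TableCreationJob {

  // Maps one input line to a (row key, Put) pair; the parsing below is a
  // placeholder, our real mapper builds the Put from our own data format.
  static class ImportMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      String[] fields = line.toString().split("\t");
      byte[] row = Bytes.toBytes(fields[0]);
      Put put = new Put(row);
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("value"),
              Bytes.toBytes(fields[1]));
      context.write(new ImmutableBytesWritable(row), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new HBaseConfiguration();
    Job job = new Job(conf, "create-table-hfiles");
    job.setJarByClass(TableCreationJob.class);
    job.setMapperClass(ImportMapper.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(Put.class);
    // we set no reducer class, so the default (identity) Reducer is used
    job.setOutputKeyClass(ImmutableBytesWritable.class);
    job.setOutputValueClass(Put.class);
    // this is the part we are unsure about: is it enough, or does
    // HFileOutputFormat expect KeyValue values and totally ordered input
    // (hence the sorting reducer and partitioner in 0.89.x)?
    job.setOutputFormatClass(HFileOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
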
However, we have seen that HBase 0.89.x does this in a more complex way,
involving a sorting reducer (KeyValueSortReducer or PutSortReducer) and a
partitioner (TotalOrderPartitioner); there, HFileOutputFormat provides a
convenience method, configureIncrementalLoad, to configure the Hadoop job
automatically. Is this method needed in our case? Or is it only necessary
when the table already exists (incremental bulk load)?
Do we have to reimplement this for 0.20.x?
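
For reference, this is the kind of setup we have seen in the 0.89.x
examples (same imports and classes as above, plus
org.apache.hadoop.hbase.client.HTable). We have only read the code, not
run it, and it appears to need an HTable handle, i.e. an existing table;
the table name here is made up:

    // 0.89.x-style job configuration (sketch, not tested on our side)
    Configuration conf = new HBaseConfiguration();
    Job job = new Job(conf, "bulk-load-prepare");
    job.setJarByClass(TableCreationJob.class);
    job.setMapperClass(ImportMapper.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(Put.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    // configureIncrementalLoad() looks at the map output value class
    // (Put here, so PutSortReducer), sets TotalOrderPartitioner, and uses
    // the table's region boundaries to decide the reduce partitions.
    HTable table = new HTable(conf, "mytable");
    HFileOutputFormat.configureIncrementalLoad(job, table);

    job.waitForCompletion(true);
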

Then, once the table creation job has completed successfully, how do we
import the HFiles into HBase? Is it by using the HBase CLI import command?
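
In case it helps to show what we are guessing at: in the same 0.89.x tree
we noticed a LoadIncrementalHFiles class
(org.apache.hadoop.hbase.mapreduce), which looks like it would be invoked
along these lines, but this is completely unverified on our side and we
do not know whether something equivalent exists for 0.20.x:

    // our guess at the load step; table name and path are made up
    Configuration conf = new HBaseConfiguration();
    HTable table = new HTable(conf, "mytable");
    Path hfileDir = new Path("/user/renaud/hfile-output"); // MR job output
    new LoadIncrementalHFiles(conf).doBulkLoad(hfileDir, table);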

Thanks in advance for your answers,
Regards
--
Renaud Delbru