HBase, mail # user - Presplitting regions + Bulk import data into table


Re: Presplitting regions + Bulk import data into table
Bryan Beaudreault 2012-07-25, 01:55
Change the output of your job (or whatever you are using to seed this
reducer -- the mapper, for example) so that it emits ImmutableBytesWritable
as the key, then wrap your bytes in that writable.  Bytes.toBytes() only
returns a raw byte[], but the key has to be an object that implements
WritableComparable, and ImmutableBytesWritable is what you should use.  Use
it like this:

ImmutableBytesWritable outKey = new
ImmutableBytesWritable(Bytes.toBytes(String.valueOf(#somenumber)));

or use its setter:

ImmutableBytesWritable outKey = new ImmutableBytesWritable();
outKey.set(Bytes.toBytes(String.valueOf(#somenumber)));
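
On how the comparison itself is done: as far as I know, HBase orders row
keys and split points as raw bytes, byte by byte and unsigned, so numbers
encoded through String.valueOf() sort like strings ("10" before "2") rather
than numerically.  Below is a minimal sketch of that ordering using only the
JDK (Java 9+).  The class and the helpers encode() and regionFor() are
made-up names for illustration, not HBase API; encode() just stands in for
Bytes.toBytes(String.valueOf(n)), and the split values are invented.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RowKeyOrderSketch {

    // Hypothetical stand-in for Bytes.toBytes(String.valueOf(n)):
    // the UTF-8 bytes of the number's decimal string.
    static byte[] encode(long n) {
        return String.valueOf(n).getBytes(StandardCharsets.UTF_8);
    }

    // Unsigned, byte-by-byte lexicographic comparison of two keys.
    static int compareKeys(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    // Index of the region a key falls into, given split points that are
    // already sorted with compareKeys. Region 0 holds everything below the
    // first split; a key equal to a split point belongs to the next region.
    static int regionFor(byte[] key, byte[][] sortedSplits) {
        int region = 0;
        for (byte[] split : sortedSplits) {
            if (compareKeys(key, split) < 0) {
                break;
            }
            region++;
        }
        return region;
    }

    public static void main(String[] args) {
        // Made-up split points, encoded the same way as the keys.
        byte[][] splits = { encode(100), encode(200), encode(300) };
        for (long n : new long[] {5, 30, 99, 150, 250, 1000}) {
            System.out.println(n + " -> region " + regionFor(encode(n), splits));
        }
    }
}
```

If the output shows keys landing in regions you did not expect, the usual
remedies are to choose split points that follow the same string ordering, or
to use a fixed-width encoding (zero-padded strings, or Bytes.toBytes(long)
for non-negative values) for both the keys and the splits.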

On Tue, Jul 24, 2012 at 3:40 PM, Ioakim Perros <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I am bulk importing data through code and presplitting the regions of a
> table, but I see all the data going to the first server.
>
> The byte arrays to compare against (to decide which region each reducer's
> output should go to) are of the form:
> Bytes.toBytes(String.valueOf(#somenumber))
>
> and the reducer's output key is an ImmutableBytesWritable whose bytes are
> formed like this:
> byte[] ckBytes = Bytes.toBytes(String.valueOf(#reducer_task_id));
>
> The thing is that the reducer (KeyValueSortReducer) class allows only
> ImmutableBytesWritable objects to be the key of each table's record.
>
> Does anyone have an idea how this comparison (between
> ImmutableBytesWritable and Bytes) is done, and what I should do to make
> the comparison work?
>
> Thanks in advance!
> IP
>