-
How to retrieve all columns of a CF and adding it in a put call
Shrijeet Paliwal 2011-09-29, 18:58
Hello,
I am trying to create a new table with exactly the same data as an existing table, BUT with the row key in the new table set to a column value from the old table.
Following is my map method (using a map-only MR job):
    public void map(ImmutableBytesWritable key, Result value, Context context)
        throws IOException, InterruptedException {
      // Value of the column that becomes the row key in the new table.
      byte[] mod_key = value.getValue(INPUT_FAMILY, INPUT_COLUMN);
      if (mod_key != null) {
        Map<byte[], NavigableMap<byte[], byte[]>> cf = value.getNoVersionMap();
        Put put = new Put(mod_key);
        // Copy every family/qualifier/value into the new Put.
        for (byte[] c : cf.keySet()) {
          for (byte[] q : cf.get(c).keySet()) {
            put.add(c, q, cf.get(c).get(q));
          }
        }
        context.write(key, put);
      }
    }
I am inclined to think there has to be a more efficient way to do this. By that I mean not having to iterate through all the columns. Thoughts?
Browsing the code, I found some usages like this:
    outval.add(INPUT_FAMILY, null, value.getValue(INPUT_FAMILY, null));
What does the above mean? Does it get the bytes representing all columns of INPUT_FAMILY and add them to the Put object?
-
Re: How to retrieve all columns of a CF and adding it in a put call
Jean-Daniel Cryans 2011-10-06, 18:48
Well, you need to insert all the columns, so yes, you need to iterate over them all. There's a shorter way to do it though; look at the Import class in the HBase code:
    private static Put resultToPut(ImmutableBytesWritable key, Result result)
        throws IOException {
      Put put = new Put(key.get());
      for (KeyValue kv : result.raw()) {
        put.add(kv);
      }
      return put;
    }
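[Editor's note] The helper above keeps the original row key, whereas the question needs a new one. One thing worth checking in your HBase version: as far as I know, Put.add(KeyValue) in releases of this era rejects a KeyValue whose row differs from the Put's row, so the cells may have to be copied field by field, for example with put.add(family, qualifier, timestamp, value). The sketch below only models that per-cell copy with plain Java collections. Cell, rekey and bytes are hypothetical stand-ins invented for illustration, not HBase classes or methods.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: Cell is a hypothetical stand-in for HBase's KeyValue,
// and a List<Cell> stands in for both Result.raw() and the outgoing Put.
class RekeySketch {

    static final class Cell {
        final byte[] row;
        final byte[] family;
        final byte[] qualifier;
        final long timestamp;
        final byte[] value;

        Cell(byte[] row, byte[] family, byte[] qualifier, long timestamp, byte[] value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Copies every cell of one row under a new row key, taken from the value
    // stored at (keyFamily, keyQualifier). Returns an empty list when that
    // column is missing, mirroring the null check in the original map method.
    static List<Cell> rekey(List<Cell> cells, byte[] keyFamily, byte[] keyQualifier) {
        byte[] newRow = null;
        for (Cell c : cells) {
            if (Arrays.equals(c.family, keyFamily) && Arrays.equals(c.qualifier, keyQualifier)) {
                newRow = c.value;
                break;
            }
        }
        List<Cell> out = new ArrayList<>();
        if (newRow == null) {
            return out;
        }
        for (Cell c : cells) {
            // Family, qualifier, timestamp and value are preserved; only the row changes.
            out.add(new Cell(newRow, c.family, c.qualifier, c.timestamp, c.value));
        }
        return out;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<Cell> row = Arrays.asList(
            new Cell(bytes("r1"), bytes("cf"), bytes("id"), 1L, bytes("user42")),
            new Cell(bytes("r1"), bytes("cf"), bytes("name"), 2L, bytes("alice")));
        for (Cell c : rekey(row, bytes("cf"), bytes("id"))) {
            System.out.println(new String(c.row, StandardCharsets.UTF_8) + " "
                + new String(c.qualifier, StandardCharsets.UTF_8));
        }
    }
}
```

The work is still one pass over all cells, which matches the point made above: every column has to be visited, and the only saving is in how much code it takes.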
Regarding your last question, that line just sets the value of the single column in the input family that has an empty qualifier. It does not copy the whole family.
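[Editor's note] To make the empty-qualifier point concrete, here is a small model in plain Java, assuming (as described above) that a null qualifier is treated as a zero-length one. A TreeMap with a byte-wise comparator stands in for one column family. EmptyQualifierSketch, newFamily and getValue are hypothetical names for this illustration and are not HBase's API.

```java
import java.nio.charset.StandardCharsets;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustration only: models one column family as qualifier -> value.
class EmptyQualifierSketch {

    // byte[] has no content-based ordering, so compare unsigned bytes lexicographically.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    static NavigableMap<byte[], byte[]> newFamily() {
        return new TreeMap<>(EmptyQualifierSketch::compare);
    }

    // A null qualifier is looked up as the empty qualifier: it addresses
    // exactly one column, never the whole family.
    static byte[] getValue(NavigableMap<byte[], byte[]> family, byte[] qualifier) {
        byte[] q = (qualifier == null) ? new byte[0] : qualifier;
        return family.get(q);
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        NavigableMap<byte[], byte[]> family = newFamily();
        family.put(new byte[0], bytes("v0"));       // column with an empty qualifier
        family.put(bytes("name"), bytes("alice"));  // ordinary named column
        byte[] v = getValue(family, null);
        System.out.println(new String(v, StandardCharsets.UTF_8));
    }
}
```

If the family holds no column with an empty qualifier, the lookup in this model returns null even though other columns exist, so it cannot be used to copy a whole family.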
J-D
-
Re: How to retrieve all columns of a CF and adding it in a put call
Shrijeet Paliwal 2011-10-06, 21:02
Understood. Thank you J-D.