Re: MapReduce to load data in HBase
Mohammad Tariq 2013-02-07, 11:28
Hello Panshul,
My answers:

1- You can serialize the entire JSON into a byte[] and store it in a cell. (Is it important for you to extract individual values from your JSON and then put them into the table?)
2- You can write your own datatype to pass your object to the reducer. But it must be a Writable (and also Comparable if you use it as a key). Alternatively, you can use Avro.
3- For generating unique keys, you can use MR counters.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com

On Thu, Feb 7, 2013 at 4:52 PM, Panshul Whisper <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I am trying to write MapReduce jobs to read data from JSON files and load
> it into HBase tables.
> Please suggest me an efficient way to do it. I am trying to do it using
> Spring Data Hbase Template to make it thread safe and enable table locking.
>
> I use the Map methods to read and parse the JSON files. I use the Reduce
> methods to call the HBase Template and store the data into the HBase tables.
>
> My questions:
> 1. Is this the right approach or should I do all of the above the Map
> method?
> 2. How can I pass the Java Object I create holding the data read from the
> Json file to the Reduce method, which needs to be saved to the HBase table?
> I can only pass the inbuilt data types to the reduce method from my mapper.
> 3. I thought of using the distributed cache for the above problem, to
> store the object in the cache and pass only the key to the reduce method.
> But how do I generate the unique key for all the objects I store in the
> distributed cache.
>
> Please help me with the above. Please tell me if I am missing some detail
> or over looking some important detail.
>
> Thanking You,
>
>
> --
> Regards,
> Ouch Whisper
> 010101010101
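
To make points 1 and 2 of the reply a bit more concrete, here is a rough sketch of what such a custom datatype might look like. The class name JsonRecord, its fields, and the serialize/deserialize helpers are made up for illustration and do not come from the thread. To stay self-contained it uses only java.io; in an actual job the class would presumably declare `implements WritableComparable<JsonRecord>` from org.apache.hadoop.io, whose write/readFields/compareTo contract the methods below mirror.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical type carrying one JSON document from the mapper to the reducer.
// Assumption: in a real job this would implement Hadoop's WritableComparable.
final class JsonRecord implements Comparable<JsonRecord> {
    private String rowKey = "";
    private byte[] json = new byte[0];   // the whole JSON document as UTF-8 bytes (point 1)

    // Hadoop creates Writables reflectively, so a no-arg constructor is kept.
    JsonRecord() {}

    JsonRecord(String rowKey, String jsonText) {
        this.rowKey = rowKey;
        this.json = jsonText.getBytes(StandardCharsets.UTF_8);
    }

    String rowKey()    { return rowKey; }
    String jsonText()  { return new String(json, StandardCharsets.UTF_8); }
    byte[] jsonBytes() { return json.clone(); }  // bytes that could be stored in a single cell

    // Same shape as Writable.write(DataOutput).
    public void write(DataOutput out) throws IOException {
        out.writeUTF(rowKey);
        out.writeInt(json.length);
        out.write(json);
    }

    // Same shape as Writable.readFields(DataInput); reads fields in the order written.
    public void readFields(DataInput in) throws IOException {
        rowKey = in.readUTF();
        json = new byte[in.readInt()];
        in.readFully(json);
    }

    // Ordering by row key; only needed when the type is used as a key.
    @Override
    public int compareTo(JsonRecord other) {
        return rowKey.compareTo(other.rowKey);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof JsonRecord && rowKey.equals(((JsonRecord) o).rowKey);
    }

    @Override
    public int hashCode() {
        return rowKey.hashCode();
    }

    // Illustration-only helpers for a local round trip without a cluster.
    static byte[] serialize(JsonRecord record) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            record.write(new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static JsonRecord deserialize(byte[] bytes) {
        try {
            JsonRecord record = new JsonRecord();
            record.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return record;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A quick local check would be JsonRecord.deserialize(JsonRecord.serialize(record)), which should give back a record with the same row key and JSON text. Storing the JSON as a single byte[] keeps the type simple; if individual fields need to be queried in HBase, they would have to be pulled out into separate columns instead.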