Panshul Whisper 2013-02-07, 14:24
I am trying to write MapReduce jobs to read data from JSON files and load
it into HBase tables.
Please suggest an efficient way to do this. I am trying to do it using the
Spring Data HBase Template to make it thread safe and enable table locking.
I use the Map methods to read and parse the JSON files. I use the Reduce
methods to call the HBase Template and store the data into the HBase tables.
1. Is this the right approach, or should I do all of the above in the Map
method?
2. How can I pass the Java object that holds the data read from the JSON
file to the Reduce method, so that it can be saved to the HBase table?
I can only pass the built-in data types to the Reduce method from my mapper.
3. I thought of using the distributed cache for the above problem: store
the object in the cache and pass only the key to the Reduce method. But how
do I generate a unique key for each object I store in the distributed cache?
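On question 2, one common option is to give the data object its own binary serialization, which is what Hadoop's Writable interface asks of a value type. The sketch below is not taken from this thread: the class name JsonRecord and its three fields are hypothetical placeholders for whatever the JSON actually holds. It uses only java.io so that it runs without Hadoop on the classpath; the two instance methods have the same signatures as Writable.write and Writable.readFields, so in a real job the class would presumably just add "implements org.apache.hadoop.io.Writable" and be used as the mapper's output value class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical record type: the fields stand in for the parsed JSON content.
class JsonRecord {
    String id;
    long timestamp;
    double value;

    // No-arg constructor so an empty instance can be created and then filled by readFields.
    JsonRecord() {}

    JsonRecord(String id, long timestamp, double value) {
        this.id = id;
        this.timestamp = timestamp;
        this.value = value;
    }

    // Same signature as Writable.write: serialize the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(id);
        out.writeLong(timestamp);
        out.writeDouble(value);
    }

    // Same signature as Writable.readFields: read the fields back in the same order.
    public void readFields(DataInput in) throws IOException {
        id = in.readUTF();
        timestamp = in.readLong();
        value = in.readDouble();
    }

    // Helper for trying the round trip locally, outside of any MapReduce job.
    static byte[] toBytes(JsonRecord record) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            record.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Helper that rebuilds a record from the bytes produced by toBytes.
    static JsonRecord fromBytes(byte[] bytes) {
        JsonRecord record = new JsonRecord();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            record.readFields(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return record;
    }
}
```

Calling JsonRecord.fromBytes(JsonRecord.toBytes(r)) should give back a record with the same field values as r, which is the property the framework relies on when it moves values from the map side to the reduce side.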
Please help me with the above, and please tell me if I am missing or
overlooking some important detail.
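On the unique-key part of question 3, one possible approach is to derive the key from the record itself instead of generating it separately, so that a task that gets re-run produces the same key for the same record. The sketch below is an assumption and not something proposed in this thread: the class RecordKeys and its method names are hypothetical. It shows a name-based UUID computed from the raw JSON text, and a key built from the input file name plus the record's byte offset (the offset being what a line-oriented input format hands to the mapper as its input key). Note also that the distributed cache is meant for shipping read-only files to tasks before they start, so it is probably not a suitable channel for objects created inside a mapper.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical helper class for building row keys.
class RecordKeys {

    // Name-based (version 3) UUID: the same input text always yields the same key.
    static String fromContent(String json) {
        byte[] bytes = json.getBytes(StandardCharsets.UTF_8);
        return UUID.nameUUIDFromBytes(bytes).toString();
    }

    // Key built from where the record came from: file name plus byte offset in that file.
    static String fromPosition(String fileName, long byteOffset) {
        return fileName + ":" + byteOffset;
    }
}
```

A content-based key makes identical JSON records collapse into one row, while a position-based key keeps them apart; which behaviour is wanted depends on the data.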
Damien Hardy 2013-02-07, 11:55
Mohammad Tariq 2013-02-07, 11:28
Panshul Whisper 2013-02-07, 11:35
Mohammad Tariq 2013-02-07, 11:40
Mohammad Tariq 2013-02-07, 11:34