Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
MapReduce >> mail # user >> MapReduce to load data in HBase

Copy link to this message
MapReduce to load data in HBase

I am trying to write MapReduce jobs to read data from JSON files and load
it into HBase tables.
Please suggest me an efficient way to do it. I am trying to do it using
Spring Data Hbase Template to make it thread safe and enable table locking.

I use the Map methods to read and parse the JSON files. I use the Reduce
methods to call the HBase Template and store the data into the HBase tables.

My questions:
1. Is this the right approach or should I do all of the above the Map
2. How can I pass the Java Object I create holding the data read from the
Json file to the Reduce method, which needs to be saved to the HBase table?
I can only pass the inbuilt data types to the reduce method from my mapper.
3. I thought of using the distributed cache for the above problem, to store
the object in the cache and pass only the key to the reduce method. But how
do I generate the unique key for all the objects I store in the distributed

Please help me with the above. Please tell me if I am missing some detail
or over looking some important detail.

Thanking You,
Ouch Whisper