MapReduce >> mail # user >> MapReduce to load data in HBase


Re: MapReduce to load data in HBase
Hello,

Thank you for the reply.
1. I cannot serialize the JSON and store it as a whole. I need to extract
individual values and store them separately, as I later need to query the
stored values in various aggregation algorithms.
2. Can you please point me in the direction of how to write a data type
that is Writable+Comparable? I will look into Avro, but I prefer to write
my own data type.
3. I will look into MR counters.
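[Editor's note] As a rough sketch of what a custom Writable+Comparable type involves (the class and field names below are made up for illustration, not from this thread): the pattern is three methods: write(DataOutput), readFields(DataInput), and compareTo. The version below uses only java.io so it runs standalone; in a real job the class would instead declare `implements org.apache.hadoop.io.WritableComparable<JsonRecord>`, whose contract is exactly these methods.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical record holding values extracted from a JSON file.
// In a real Hadoop job this would declare
// `implements org.apache.hadoop.io.WritableComparable<JsonRecord>`;
// the three methods below are what that interface requires.
public class JsonRecord implements Comparable<JsonRecord> {
    private String userId = "";
    private long timestamp;

    public JsonRecord() { }  // Hadoop instantiates Writables via a no-arg constructor

    public JsonRecord(String userId, long timestamp) {
        this.userId = userId;
        this.timestamp = timestamp;
    }

    // Serialize the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(userId);
        out.writeLong(timestamp);
    }

    // Deserialize the fields in the same order they were written.
    public void readFields(DataInput in) throws IOException {
        userId = in.readUTF();
        timestamp = in.readLong();
    }

    // Sort order used when the framework compares keys during the shuffle.
    @Override
    public int compareTo(JsonRecord other) {
        int c = userId.compareTo(other.userId);
        return c != 0 ? c : Long.compare(timestamp, other.timestamp);
    }

    public String getUserId() { return userId; }
    public long getTimestamp() { return timestamp; }

    public static void main(String[] args) throws IOException {
        // Round-trip: write the record to bytes, read it back into a fresh object.
        JsonRecord original = new JsonRecord("user-42", 1360224000L);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buf));

        JsonRecord copy = new JsonRecord();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));

        System.out.println(copy.getUserId() + " " + copy.getTimestamp());
    }
}
```

With such a type as the map output value, the mapper can emit the parsed object directly instead of flattening it into built-in types.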

Regards,
On Thu, Feb 7, 2013 at 12:28 PM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:

> Hello Panshul,
>
>     My answers :
> 1- You can serialize the entire JSON into a byte[] and store it in a
> cell. (Is it important for you to extract individual values from your JSON
> and then put them into the table?)
> 2- You can write your own datatype to pass your object to the reducer.
> But, it must be a Writable+Comparable. Alternatively, you can use Avro.
> 3- For generating unique keys, you can use MR counters.
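[Editor's note] One common way to realize point 3 (names below are illustrative, not from this thread): pair an identifier that is already unique per task, such as the task attempt ID available from the job context as `context.getTaskAttemptID().toString()`, with a task-local running counter. No two tasks share a prefix, and no single task repeats a counter value, so the combined key is unique across the job. A standalone sketch:

```java
// Sketch of generating job-wide unique keys: combine a task-unique
// prefix (in a real mapper, context.getTaskAttemptID().toString())
// with a monotonically increasing task-local counter.
public class UniqueKeyGenerator {
    private final String taskPrefix;  // e.g. the task attempt ID
    private long counter = 0;

    public UniqueKeyGenerator(String taskPrefix) {
        this.taskPrefix = taskPrefix;
    }

    // Each call returns a key no other task (different prefix) and no
    // earlier call in this task (different counter) can have produced.
    public String nextKey() {
        return taskPrefix + "_" + (counter++);
    }

    public static void main(String[] args) {
        UniqueKeyGenerator task0 = new UniqueKeyGenerator("attempt_0000_m_000000");
        UniqueKeyGenerator task1 = new UniqueKeyGenerator("attempt_0000_m_000001");
        System.out.println(task0.nextKey());  // attempt_0000_m_000000_0
        System.out.println(task0.nextKey());  // attempt_0000_m_000000_1
        System.out.println(task1.nextKey());  // attempt_0000_m_000001_0
    }
}
```

Note that sequential keys like these can hotspot a single HBase region; hashing or reversing the key is a common mitigation when write throughput matters.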
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Thu, Feb 7, 2013 at 4:52 PM, Panshul Whisper <[EMAIL PROTECTED]> wrote:
>
>> Hello,
>>
>> I am trying to write MapReduce jobs to read data from JSON files and load
>> it into HBase tables.
>> Please suggest an efficient way to do it. I am trying to do it using the
>> Spring Data HBase Template to make it thread safe and enable table locking.
>>
>> I use the Map methods to read and parse the JSON files. I use the Reduce
>> methods to call the HBase Template and store the data in the HBase tables.
>>
>> My questions:
>> 1. Is this the right approach, or should I do all of the above in the Map
>> method?
>> 2. How can I pass the Java object I create holding the data read from the
>> JSON file to the Reduce method, where it needs to be saved to the HBase
>> table? I can only pass the built-in data types from my mapper to the
>> reduce method.
>> 3. I thought of using the distributed cache for the above problem, to
>> store the object in the cache and pass only the key to the reduce method.
>> But how do I generate a unique key for all the objects I store in the
>> distributed cache?
>>
>> Please help me with the above, and tell me if I am missing or overlooking
>> some important detail.
>>
>> Thanking You,
>>
>>
>> --
>> Regards,
>> Ouch Whisper
>> 010101010101
>>
>
>
--
Regards,
Ouch Whisper
010101010101