MapReduce, mail # user - Re: MapReduce to load data in HBase


Re: MapReduce to load data in HBase
Panshul Whisper 2013-02-07, 11:35
Hello,

Thank you for the reply.
1. I cannot serialize the JSON and store it as a whole. I need to extract
individual values and store them, because later I need to query the stored
values in various aggregation algorithms.
2. Could you please point me to where I can find out how to write a data
type that is Writable+Comparable? I will look into Avro, but I would prefer
to write my own data type.
3. I will look into MR counters.
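
[Editor's sketch for point 2] A minimal example of what a custom Writable+Comparable type could look like. The class name SensorReading and its three fields are hypothetical, invented for illustration. To keep the sketch runnable without Hadoop on the classpath, the class only declares Comparable; in a real job it would be declared as "implements org.apache.hadoop.io.WritableComparable<SensorReading>", whose write(DataOutput) and readFields(DataInput) methods have the signatures used here. Note that Hadoop only requires the Comparable half for map output keys; a map output value only needs to be Writable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Objects;

// Hypothetical type holding the values extracted from one JSON record.
// Assumption: in a real job, replace "Comparable<SensorReading>" with
// org.apache.hadoop.io.WritableComparable<SensorReading>.
class SensorReading implements Comparable<SensorReading> {
    private String sensorId = "";
    private long timestamp;
    private double value;

    // Hadoop creates instances reflectively, so a no-arg constructor is needed.
    SensorReading() {}

    SensorReading(String sensorId, long timestamp, double value) {
        this.sensorId = sensorId;
        this.timestamp = timestamp;
        this.value = value;
    }

    // Serialize every field in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(sensorId);
        out.writeLong(timestamp);
        out.writeDouble(value);
    }

    // Read the fields back in the same order. Instances may be reused,
    // so every field is overwritten.
    public void readFields(DataInput in) throws IOException {
        sensorId = in.readUTF();
        timestamp = in.readLong();
        value = in.readDouble();
    }

    // Sort by sensor, then time, then value (kept consistent with equals).
    @Override
    public int compareTo(SensorReading other) {
        int c = sensorId.compareTo(other.sensorId);
        if (c != 0) return c;
        c = Long.compare(timestamp, other.timestamp);
        return c != 0 ? c : Double.compare(value, other.value);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof SensorReading && compareTo((SensorReading) o) == 0;
    }

    // A stable hashCode matters when the type is used as a key, because the
    // default partitioner assigns keys to reducers by hash.
    @Override
    public int hashCode() {
        return Objects.hash(sensorId, timestamp, value);
    }

    // Round-trip check: serialize to bytes, then deserialize into a fresh object.
    public static void main(String[] args) throws IOException {
        SensorReading original = new SensorReading("s1", 1360236120000L, 21.5);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buffer));

        SensorReading copy = new SensorReading();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        System.out.println("round trip equal: " + original.equals(copy));
    }
}
```

The mapper would then emit this type as its output value (or key), and the job configuration would name it as the map output value (or key) class.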

Regards,
On Thu, Feb 7, 2013 at 12:28 PM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:

> Hello Panshul,
>
>     My answers :
> 1- You can serialize the entire JSON into a byte[] and store it in a
> cell. (Is it important for you to extract individual values from your JSON
> and then put them into the table?)
> 2- You can write your own datatype to pass your object to the reducer.
> But it must be a Writable+Comparable. Alternatively, you can use Avro.
> 3- For generating unique keys, you can use MR counters.
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Thu, Feb 7, 2013 at 4:52 PM, Panshul Whisper <[EMAIL PROTECTED]> wrote:
>
>> Hello,
>>
>> I am trying to write MapReduce jobs to read data from JSON files and load
>> it into HBase tables.
>> Please suggest an efficient way to do it. I am trying to do it using the
>> Spring Data HBase template to make it thread safe and enable table locking.
>>
>> I use the Map methods to read and parse the JSON files. I use the Reduce
>> methods to call the HBase Template and store the data into the HBase tables.
>>
>> My questions:
>> 1. Is this the right approach, or should I do all of the above in the Map
>> method?
>> 2. How can I pass the Java object I create, holding the data read from the
>> JSON file, to the Reduce method, which needs to save it to the HBase table?
>> I can only pass the built-in data types to the reduce method from my mapper.
>> 3. I thought of using the distributed cache for the above problem, to
>> store the object in the cache and pass only the key to the reduce method.
>> But how do I generate a unique key for each object I store in the
>> distributed cache?
>>
>> Please help me with the above. Please tell me if I am missing or
>> overlooking some important detail.
>>
>> Thanking You,
>>
>>
>> --
>> Regards,
>> Ouch Whisper
>> 010101010101
>>
>
>
--
Regards,
Ouch Whisper
010101010101
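
[Editor's sketch for point 1 of the reply above] Storing each extracted value in its own cell, rather than one serialized JSON blob, amounts to turning each field into a (qualifier, value) pair of byte arrays. The sketch below assumes the JSON has already been parsed into a Map by some JSON library; the class and method names (JsonToColumns, ColumnValue, toColumns) are hypothetical, and values are encoded as UTF-8 strings purely for simplicity. In a real job each pair would be added to an HBase Put for the row, together with a column family.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: flattens parsed JSON fields into per-column byte pairs.
class JsonToColumns {

    // Hypothetical holder for one column's qualifier and value bytes.
    static final class ColumnValue {
        final byte[] qualifier;
        final byte[] value;

        ColumnValue(byte[] qualifier, byte[] value) {
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    // One ColumnValue per non-null field, in the map's iteration order.
    static List<ColumnValue> toColumns(Map<String, Object> fields) {
        List<ColumnValue> columns = new ArrayList<>();
        for (Map.Entry<String, Object> field : fields.entrySet()) {
            if (field.getValue() == null) {
                continue; // a missing value is simply not written as a cell
            }
            byte[] qualifier = field.getKey().getBytes(StandardCharsets.UTF_8);
            // Assumption: string encoding for every value; a binary encoding
            // for numbers may suit aggregation queries better.
            byte[] value = String.valueOf(field.getValue()).getBytes(StandardCharsets.UTF_8);
            columns.add(new ColumnValue(qualifier, value));
        }
        return columns;
    }

    public static void main(String[] args) {
        Map<String, Object> parsed = new LinkedHashMap<>();
        parsed.put("sensorId", "s1");
        parsed.put("temperature", 21.5);
        parsed.put("comment", null);

        for (ColumnValue c : toColumns(parsed)) {
            System.out.println(new String(c.qualifier, StandardCharsets.UTF_8)
                    + " = " + new String(c.value, StandardCharsets.UTF_8));
        }
    }
}
```

Keeping each value in its own column means later aggregation jobs can read only the columns they need instead of deserializing the whole record.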