HBase >> mail # user >> multiple puts in reducer?


T Vinod Gupta 2012-02-28, 14:34
Tim Robertson 2012-02-28, 14:44
T Vinod Gupta 2012-02-28, 14:50
Ben Snively 2012-02-28, 14:45
T Vinod Gupta 2012-02-28, 14:51
Tim Robertson 2012-02-28, 15:02
T Vinod Gupta 2012-02-28, 15:06
Ben Snively 2012-02-28, 15:22
T Vinod Gupta 2012-02-28, 15:25
Jacques 2012-02-28, 16:15
Jacques 2012-02-28, 16:21
Michel Segel 2012-02-28, 15:44
T Vinod Gupta 2012-02-28, 16:14
Michael Segel 2012-02-28, 16:20
Ben Snively 2012-02-28, 17:40
Jacques 2012-02-29, 05:16
Michel Segel 2012-02-29, 13:18
Ben Snively 2012-02-29, 13:21
Re: multiple puts in reducer?
The assertion is that in most cases you shouldn't need one; the rule of thumb is that you should have to defend your use of one.

Reducers are expensive. Running multiple mappers in a job can be cheaper.

All I am saying is that you need to rethink your solution if you insist on using a reducer.

Sent from a remote device. Please excuse any typos...

Mike Segel
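
To make the map-only pattern Mike is advocating concrete, here is a hedged sketch (the table name "my_table", column family "cf", and input handling are illustrative, not from the thread) of a job that emits Puts straight from the mapper, with the reduce phase disabled via setNumReduceTasks(0):

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class MapOnlyHBaseWrite {

  static class WriteMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      // One input line -> one row; a mapper can emit as many Puts as it
      // likes, so "multiple rows" does not by itself require a reducer.
      Put put = new Put(Bytes.toBytes(value.toString()));
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("seen"), Bytes.toBytes(1L));
      context.write(new ImmutableBytesWritable(put.getRow()), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    conf.set(TableOutputFormat.OUTPUT_TABLE, "my_table"); // illustrative name
    Job job = new Job(conf, "map-only hbase write");
    job.setJarByClass(MapOnlyHBaseWrite.class);
    job.setMapperClass(WriteMapper.class);
    job.setOutputFormatClass(TableOutputFormat.class);
    job.setOutputKeyClass(ImmutableBytesWritable.class);
    job.setOutputValueClass(Put.class);
    job.setNumReduceTasks(0); // no reducer: mappers write directly to HBase
    // Input format and paths omitted; configure them for your source data.
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

With zero reduce tasks there is no shuffle or sort, which is where the "reducers are expensive" cost mostly lives.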

On Feb 28, 2012, at 11:40 AM, Ben Snively <[EMAIL PROTECTED]> wrote:

> Is there an assertion that you would never need to run a reducer when
> writing to the DB?
>
> It seems that there are cases when you would not need one, but the general
> statement doesn't apply to all use cases.
>
> If you were trying to process data where a map task (or set of map tasks)
> may output the same key, you could have a case where you need to reduce the
> data for that key prior to inserting the result into HBase.
>
> Am I missing something? To me, that would be the deciding factor: whether
> the key/values output by the map tasks are the exact values that need to be
> inserted into HBase, versus multiple values that must be aggregated together
> before the result is put into the HBase entry.
>
> Thanks,
> Ben
>
>
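
The aggregate-then-write case Ben describes can be sketched with a TableReducer, which lets reduce() emit Puts directly to the output table. This is an illustrative sketch; the column family "cf" and qualifier "count" are made-up names, and it assumes mappers emit (Text key, LongWritable count) pairs:

```java
import java.io.IOException;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

public class SumReducer
    extends TableReducer<Text, LongWritable, ImmutableBytesWritable> {

  @Override
  protected void reduce(Text key, Iterable<LongWritable> values,
      Context context) throws IOException, InterruptedException {
    long total = 0;
    for (LongWritable v : values) {
      total += v.get(); // combine everything the mappers emitted for this key
    }
    // Nothing prevents writing several rows per key here: build and
    // context.write() additional Put objects the same way.
    Put put = new Put(Bytes.toBytes(key.toString()));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("count"), Bytes.toBytes(total));
    context.write(new ImmutableBytesWritable(put.getRow()), put);
  }
}
```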
> On Tue, Feb 28, 2012 at 11:20 AM, Michael Segel
> <[EMAIL PROTECTED]>wrote:
>
>> The better question is why you would need a reducer.
>>
>> That's a bit cryptic, I understand, but you have to ask yourself when you
>> need to use a reducer when you are writing to a database... ;-)
>>
>>
>> Sent from my iPhone
>>
>> On Feb 28, 2012, at 10:14 AM, "T Vinod Gupta" <[EMAIL PROTECTED]>
>> wrote:
>>
>>> Mike,
>>> I didn't understand - why would I not need a reducer in an HBase M/R job?
>>> There can be cases, right?
>>> My use case is very similar to Sujee's blog on frequency counting -
>>> http://sujee.net/tech/articles/hadoop/hbase-map-reduce-freq-counter/
>>> So in the reducer, I can do all the aggregations. Is there a better way?
>>> I can think of another way: to use increments in the map job itself. I
>>> have to figure out if that's possible, though.
>>>
>>> thanks
>>>
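
Vinod's increment idea can be sketched as follows, using the HBase client API of that era (HTable.incrementColumnValue performs an atomic server-side add, so concurrent mappers hitting the same row stay correct). The table "freq_counts", family "cf", and qualifier "count" are illustrative names, not from the thread:

```java
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class IncrementMapper
    extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

  private HTable table;

  @Override
  protected void setup(Context context) throws IOException {
    // One client connection per map task, reused across map() calls.
    table = new HTable(HBaseConfiguration.create(), "freq_counts");
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException {
    // Atomic server-side add: HBase does the aggregation, so no reducer
    // (and no shuffle/sort) is needed for simple frequency counting.
    table.incrementColumnValue(Bytes.toBytes(value.toString()),
        Bytes.toBytes("cf"), Bytes.toBytes("count"), 1L);
  }

  @Override
  protected void cleanup(Context context) throws IOException {
    table.close();
  }
}
```

The trade-off is one RPC-backed increment per record instead of a batched reduce-side write, so this sketch favors simplicity over raw throughput.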
>>> On Tue, Feb 28, 2012 at 7:44 AM, Michel Segel <[EMAIL PROTECTED]
>>> wrote:
>>>
>>>> Yes you can do it.
>>>> But why do you have a reducer when running a m/r job against HBase?
>>>>
>>>> The trick in writing multiple rows... You do it independently of the
>>>> output from the map() method.
>>>>
>>>>
>>>> Sent from a remote device. Please excuse any typos...
>>>>
>>>> Mike Segel
>>>>
>>>> On Feb 28, 2012, at 8:34 AM, T Vinod Gupta <[EMAIL PROTECTED]>
>> wrote:
>>>>
>>>>> While doing map/reduce on HBase tables, is it possible to do multiple
>>>>> puts in the reducer? What I want is a way to be able to write multiple
>>>>> rows. If it's not possible, then what are the other alternatives? I
>>>>> mean like creating a wider table in that case.
>>>>>
>>>>> thanks
>>>>
>>
Jacques 2012-03-01, 17:28