Re: Increment Counters in HBase during MapReduce
As the thread J-D pointed to suggests, the best approach if you want
to avoid aggregations later on is to aggregate in the MR job and write
a file containing each ad id and the number of impressions found for
that ad. Then run a separate client application, likely single-threaded
if the number of ads is not too big, and increment each counter in that
run. That's your best option.
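
To make the second step concrete, here is a rough sketch of the client side. It assumes the MR job wrote one "adId<TAB>count" line per ad; that file layout, the class name, and the CounterStore interface are all made up for illustration and are not part of HBase. The in-memory store only stands in for the table so the sketch runs without a cluster; a real client would call incrementColumnValue on the table at the marked spot.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: replay aggregated MR output ("adId<TAB>count" lines) as counter increments. */
class ImpressionCounterLoader {

    /** Hypothetical abstraction over the counter table; not an HBase type. */
    interface CounterStore {
        long increment(String adId, long delta);
    }

    /** In-memory stand-in so the sketch runs without a cluster. */
    static class InMemoryCounterStore implements CounterStore {
        final Map<String, Long> counters = new HashMap<>();

        @Override
        public long increment(String adId, long delta) {
            // An HBase-backed version would instead call something like
            // table.incrementColumnValue(row, family, qualifier, delta).
            return counters.merge(adId, delta, Long::sum);
        }
    }

    /** Applies one increment per non-blank line, single-threaded; returns lines applied. */
    static int applyCounts(List<String> lines, CounterStore store) {
        int applied = 0;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = line.split("\t");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed line: " + line);
            }
            store.increment(parts[0], Long.parseLong(parts[1].trim()));
            applied++;
        }
        return applied;
    }
}
```

Because increments are not idempotent, replaying the same output file twice would double-count, so the client should process each file exactly once (for example by moving or renaming it only after the whole run succeeds).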

On Jun 19, 2012, at 7:41 PM, Sid Kumar <[EMAIL PROTECTED]> wrote:

> Thanks for the info. It seems safer to do the aggregations in the MR code.
> Do you guys think of any better alternative?
> Sid
> On Tue, Jun 19, 2012 at 9:55 AM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
>> This question was answered here already:
>> http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/%[EMAIL PROTECTED]%3E
>> Counters are not idempotent, this can be hard to manage.
>> J-D
>> On Mon, Jun 18, 2012 at 5:49 PM, Sid Kumar <[EMAIL PROTECTED]> wrote:
>>> Hi everyone,
>>>   I have a use case in HBase that I was wondering if someone may have
>>> stumbled upon. I am maintaining an ad impressions table with columns that
>>> are counters for certain metrics. I started using the incrementColumnValue
>>> method of the HTable API to update these metrics and that works great.
>>>   I was wondering if this function could be used from a MapReduce job.
>>> The TableOutputFormat supports only Delete and Put operations. Using the
>>> Incremental counters saves me from doing any aggregations in my MapReduce
>>> code. Ideally I would like to just call this function in my mapper and
>>> wouldn't even need a Reducer.
>>>   Has anyone run into this use case? I would also love to know if there
>>> are any better alternatives for solving this too. Any info would be great.
>>> Thanks
>>> Sid