HBase, mail # user - Coprocessor end point vs MapReduce?


Coprocessor end point vs MapReduce?
Jean-Marc Spaggiari 2012-10-18, 00:11
Hi,

Can someone please help me understand the pros and cons of these
two options for the following use case?

I need to move all the rows between two timestamps to another table.

My first idea was to run a MapReduce job to map the rows and store them in
another table, and then delete them using an endpoint coprocessor.
But the more I look into it, the more I think MapReduce is not a
good idea and I should use a coprocessor instead.

BUT... The MapReduce framework guarantees that it will run against
all the regions. I tried stopping a regionserver while the job was
running. The region moved, and MapReduce restarted the job from
the new location. Will a coprocessor do the same thing?

Also, I found the web console for MapReduce, which shows the number of
jobs, their status, etc. Is there something similar for coprocessors?

Do all coprocessors run at the same time on all regions, which
would mean we could have 100 of them running on a regionserver at once? Or
do they run like MapReduce jobs, based on some configured
values?

Thanks,

JM