Accumulo, mail # user - Mappers for Accumulo


Aji Janis 2013-03-08, 18:11
Mike Hugo 2013-03-08, 18:17
Aji Janis 2013-03-08, 21:17
Keith Turner 2013-03-09, 00:02
Re: Mappers for Accumulo
Aji Janis 2013-03-11, 20:47
So it looks like passing a List<Range> is what I need in order to get one
mapper per range. However, a more interesting scenario is when, given one
big range, I want to split it into multiple ranges. In other words, suppose
my row IDs were 1_hello, 2_hello, ..., 9_hello, 10_hello and the range given
was 2 to 5, but I want one mapper per integer, so 4 mappers in this case.
Any ideas on how I can accomplish that?
Thanks, all, for the suggestions.
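
A minimal sketch of the per-integer split described above, in plain Java with no Accumulo dependency. It assumes each row ID starts with the integer followed by "_", and it relies on the fact that Accumulo sorts row IDs lexicographically, so every row for integer i falls in the half-open interval from "i_" up to (but not including) "i`", the next string after the prefix. The names PerIntegerRanges, RowBounds, split and followingPrefix are hypothetical, introduced only for illustration. Turning each pair of bounds into a Range and passing the list to AccumuloInputFormat.setRanges(job, ranges), as suggested earlier in the thread, is left as a comment because the exact signatures depend on the Accumulo version in use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: computes one row interval per integer prefix.
final class PerIntegerRanges {

    /** Start (inclusive) and end (exclusive) row of one sub-range. */
    static final class RowBounds {
        final String startInclusive;
        final String endExclusive;

        RowBounds(String startInclusive, String endExclusive) {
            this.startInclusive = startInclusive;
            this.endExclusive = endExclusive;
        }
    }

    /**
     * Smallest string that sorts after every string starting with prefix.
     * Assumes prefix is non-empty and its last char is not Character.MAX_VALUE.
     */
    static String followingPrefix(String prefix) {
        int last = prefix.length() - 1;
        char next = (char) (prefix.charAt(last) + 1);
        return prefix.substring(0, last) + next;
    }

    /** One sub-range per integer in [first, last], e.g. 2..5 gives 4 sub-ranges. */
    static List<RowBounds> split(int first, int last) {
        List<RowBounds> out = new ArrayList<>();
        for (int i = first; i <= last; i++) {
            String prefix = i + "_"; // assumed row ID layout: <integer>_<rest>
            out.add(new RowBounds(prefix, followingPrefix(prefix)));
        }
        return out;
    }

    public static void main(String[] args) {
        for (RowBounds b : split(2, 5)) {
            // In a real job, each pair of bounds would become one Accumulo Range
            // (start inclusive, end exclusive) added to the list that is handed
            // to AccumuloInputFormat.setRanges(job, ranges).
            System.out.println(b.startInclusive + " .. " + b.endExclusive);
        }
    }
}
```

Whether each of these sub-ranges ends up with exactly one mapper still depends on the auto-adjust setting Keith mentions below: with auto-adjust left on, a sub-range that spans several tablets may be divided further.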

On Fri, Mar 8, 2013 at 7:02 PM, Keith Turner <[EMAIL PROTECTED]> wrote:

> On Fri, Mar 8, 2013 at 4:17 PM, Aji Janis <[EMAIL PROTECTED]> wrote:
> > Thank you. Follow up question.
> >
> > Would this enforce one mapper per range even if all the data (from the
> > three ranges) is on one node/tablet?
>
> Look at disableAutoAdjustRanges(). This determines whether it creates a
> mapper per tablet per range or one mapper per range.
>
>
> >
> >
> >
> > On Fri, Mar 8, 2013 at 1:17 PM, Mike Hugo <[EMAIL PROTECTED]> wrote:
> >>
> >> See AccumuloInputFormat
> >>
> >> ArrayList<Range> ranges = new ArrayList<Range>();
> >> // populate array list of row ranges ...
> >> AccumuloInputFormat.setRanges(job, ranges);
> >>
> >>
> >> You should get one mapper per range.
> >>
> >>
> >>
> >>
> >> On Fri, Mar 8, 2013 at 12:11 PM, Aji Janis <[EMAIL PROTECTED]> wrote:
> >>>
> >>> Hello,
> >>>
> >>> I am trying to figure out how I can configure the number of mappers (if
> >>> it's even possible) based on an Accumulo row range. My Accumulo row ID
> >>> uses the format:
> >>>
> >>> abc/1
> >>> abc/2
> >>> ...
> >>> def/3
> >>> ....
> >>> xyz/13...
> >>>
> >>> If I want to specify three ranges, [abc/1 to abc/3], [def/1 to def/5],
> >>> and [jkl/13 to klm/15], and have one mapper work on each range, is there
> >>> a way I can do this? How do I even set up my MapReduce job to accept
> >>> these ranges? Thank you for all feedback.
> >>>
> >>>
> >>
> >
>
William Slacum 2013-03-11, 21:13
Aji Janis 2013-03-11, 21:39
William Slacum 2013-03-12, 17:21