Accumulo >> mail # user >> Mappers for Accumulo


+
Aji Janis 2013-03-08, 18:11
+
Mike Hugo 2013-03-08, 18:17
+
Aji Janis 2013-03-08, 21:17
+
Keith Turner 2013-03-09, 00:02
+
Aji Janis 2013-03-11, 20:47
Re: Mappers for Accumulo
So you want both auto-adjusting and non-auto-adjusting behavior, depending on
the size of a range? I suppose you could lift the code that does the adjusting,
do some introspection on the ranges (such as "how many tablets do I have
in this range?"), and apply it as necessary.
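
For the "one mapper per integer" case in the quoted message below, a minimal sketch of the splitting step might look like the following. It is plain Java with no Accumulo dependency: RowRange and splitByIntegerPrefix are hypothetical stand-ins invented for illustration, not Accumulo classes. In a real job each pair would presumably become an Accumulo Range (Accumulo's Range.prefix helper, if available in your version, should express the same thing), and the list would go to AccumuloInputFormat.setRanges with auto-adjusting disabled.

```java
import java.util.ArrayList;
import java.util.List;

class PrefixRangeSplitter {

    /** Hypothetical stand-in for an Accumulo Range covering rows in [startRow, endRow). */
    static final class RowRange {
        final String startRow; // inclusive
        final String endRow;   // exclusive

        RowRange(String startRow, String endRow) {
            this.startRow = startRow;
            this.endRow = endRow;
        }

        boolean contains(String row) {
            return row.compareTo(startRow) >= 0 && row.compareTo(endRow) < 0;
        }

        @Override
        public String toString() {
            return "[" + startRow + ", " + endRow + ")";
        }
    }

    /** Builds one range per integer n in [first, last], covering every row that starts with "n_". */
    static List<RowRange> splitByIntegerPrefix(int first, int last) {
        List<RowRange> ranges = new ArrayList<>();
        for (int n = first; n <= last; n++) {
            String start = n + "_";
            // Bumping the last character of the prefix ('_' becomes '`') gives the
            // smallest string that sorts after every row beginning with "n_".
            String end = n + String.valueOf((char) ('_' + 1));
            ranges.add(new RowRange(start, end));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Prints four ranges, one each for the prefixes 2_, 3_, 4_ and 5_.
        for (RowRange r : splitByIntegerPrefix(2, 5)) {
            System.out.println(r);
        }
    }
}
```

Note that row ids sort as strings, so 10_hello sorts before 2_hello. Building one prefix range per integer avoids depending on numeric order.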

On Mon, Mar 11, 2013 at 4:47 PM, Aji Janis <[EMAIL PROTECTED]> wrote:

> So it looks like a List<Range> is what I need so that I can have one
> mapper per range. However, a more interesting scenario is when, given a
> big range, I want to split it into multiple ranges. In other words, say my
> rowids were 1_hello, 2_hello, ..., 9_hello, 10_hello, and the range given
> was 2 to 5, but I want one mapper per integer, so 4 mappers in this case.
> Any ideas on how I can accomplish that?
>
>
> Thanks all for suggestions.
>
>
> On Fri, Mar 8, 2013 at 7:02 PM, Keith Turner <[EMAIL PROTECTED]> wrote:
>
>> On Fri, Mar 8, 2013 at 4:17 PM, Aji Janis <[EMAIL PROTECTED]> wrote:
>> > Thank you. Follow up question.
>> >
>> > Would this enforce one mapper per range even if all the data (from the
>> > three ranges) is on one node/tablet?
>>
>> Look at disableAutoAdjustRanges(). This determines whether it creates a
>> mapper per tablet per range OR per range.
>>
>>
>> >
>> >
>> >
>> > On Fri, Mar 8, 2013 at 1:17 PM, Mike Hugo <[EMAIL PROTECTED]> wrote:
>> >>
>> >> See AccumuloInputFormat
>> >>
>> >> ArrayList<Range> ranges = new ArrayList<Range>();
>> >> // populate array list of row ranges ...
>> >> AccumuloInputFormat.setRanges(job, ranges);
>> >>
>> >>
>> >> You should get one mapper per range.
>> >>
>> >>
>> >>
>> >>
>> >> On Fri, Mar 8, 2013 at 12:11 PM, Aji Janis <[EMAIL PROTECTED]> wrote:
>> >>>
>> >>> Hello,
>> >>>
>> >>> I am trying to figure out how I can configure the number of mappers (if
>> >>> it's even possible) based on an Accumulo row range. My Accumulo rowid
>> >>> uses the format:
>> >>>
>> >>> abc/1
>> >>> abc/2
>> >>> ...
>> >>> def/3
>> >>> ....
>> >>> xyz/13...
>> >>>
>> >>> If I want to specify three ranges: [abc/1 to abc/3], [def/1 to def/5],
>> >>> [jkl/13 to klm/15], and have one mapper work on one range, is there a
>> >>> way I can do this? How do I even set up my MapReduce job to accept these
>> >>> ranges? Thank you for all feedback.
>> >>>
>> >>>
>> >>
>> >
>>
>
>
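
To make Keith Turner's "mapper per tablet per range OR per range" distinction concrete, here is a simplified model in plain Java. It is not Accumulo's code: AutoAdjustModel, tabletsSpanned and countSplits are hypothetical names, the tablet layout is just a list of split points, and the model assumes non-empty, non-overlapping ranges of the form [start, end). It only counts how many input splits (and so mappers) each setting would produce.

```java
import java.util.Arrays;
import java.util.List;

class AutoAdjustModel {

    /**
     * Counts how many tablets the row range [start, end) touches. Each split
     * point s ends one tablet (rows up to and including s) and begins the next.
     */
    static int tabletsSpanned(List<String> splitPoints, String start, String end) {
        int count = 1;
        for (String s : splitPoints) {
            boolean rangeHasRowAtOrBelow = start.compareTo(s) <= 0;
            // s + "\0" is the smallest row that sorts strictly after s.
            boolean rangeHasRowAbove = (s + "\0").compareTo(end) < 0;
            if (rangeHasRowAtOrBelow && rangeHasRowAbove) {
                count++;
            }
        }
        return count;
    }

    /** With auto-adjust on: one split per tablet per range. With it off: one per range. */
    static int countSplits(List<String> splitPoints, String[][] ranges, boolean autoAdjust) {
        int total = 0;
        for (String[] r : ranges) {
            total += autoAdjust ? tabletsSpanned(splitPoints, r[0], r[1]) : 1;
        }
        return total;
    }

    public static void main(String[] args) {
        // The three ranges from the original question, written as [start, end) pairs.
        String[][] ranges = {{"abc/1", "abc/3"}, {"def/1", "def/5"}, {"jkl/13", "klm/15"}};
        List<String> splitPoints = Arrays.asList("def/3"); // assumed tablet boundary
        System.out.println("auto-adjust on:  " + countSplits(splitPoints, ranges, true));
        System.out.println("auto-adjust off: " + countSplits(splitPoints, ranges, false));
    }
}
```

Under this model, if all three ranges sit on a single tablet, both settings give three mappers. The counts only diverge when a range crosses a tablet boundary.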
+
Aji Janis 2013-03-11, 21:39
+
William Slacum 2013-03-12, 17:21