Accumulo, mail # user - MapReduce mapper not seeing all rows


Re: MapReduce mapper not seeing all rows
Billie Rinaldi 2013-02-26, 20:28
Have you noticed any pattern in the rows it seems to be missing?  E.g.
every other row, the last row in each tablet, etc.?  When you set a range,
what range did you set?

Billie
On Tue, Feb 26, 2013 at 12:17 PM, Mike Hugo <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I'm running a map reduce job over a table using AccumuloRowInputFormat.
>  For debugging purposes I'm logging the key.getRow() so I can see what rows
> it's finding as it progresses.
>
> If I don't specify any ranges on the input format, it skips a significant
> number of rows - that is, I don't see any logging indicating that it
> traversed them.
>
> To see if it was a visibility issue, I tried explicitly setting a range,
> like this:
>
>         AccumuloRowInputFormat.setRanges(job.getConfiguration(), ranges);
>
> When doing that it does process the rows that it otherwise skips.
>
> The same TimestampFilter is being applied in both scenarios; no other
> filters / iterators are being used.
>
> Any thoughts on why, when run without the ranges specified, it isn't
> seeing a significant portion of the data?
>
> Thanks,
>
> Mike
>
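
To look for the kind of pattern asked about above (every other row, the last row in each tablet, and so on), one option is to compare the rows logged by the mapper against a list of rows known to be in the table. The following is only a minimal sketch in plain Java: the class and method names are hypothetical, it does not use any Accumulo API, and it assumes both row lists have already been collected as strings (for example from the mapper logs and from a shell scan).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of Accumulo: finds which expected rows
// never appeared in the mapper's log output and checks for a regular gap.
final class MissingRowPattern {

    // Positions (0-based, in sorted row order) of expected rows that were not seen.
    static List<Integer> missingPositions(SortedSet<String> expected, Set<String> seen) {
        List<Integer> positions = new ArrayList<>();
        int index = 0;
        for (String row : expected) {
            if (!seen.contains(row)) {
                positions.add(index);
            }
            index++;
        }
        return positions;
    }

    // Returns the constant gap between consecutive missing positions,
    // or -1 if there are fewer than two positions or the gap varies.
    static int commonStride(List<Integer> positions) {
        if (positions.size() < 2) {
            return -1;
        }
        int stride = positions.get(1) - positions.get(0);
        for (int i = 2; i < positions.size(); i++) {
            if (positions.get(i) - positions.get(i - 1) != stride) {
                return -1;
            }
        }
        return stride;
    }

    public static void main(String[] args) {
        // Illustrative row IDs only; substitute the real ones.
        SortedSet<String> expected = new TreeSet<>(
                Arrays.asList("row0", "row1", "row2", "row3", "row4", "row5"));
        Set<String> seen = new HashSet<>(Arrays.asList("row0", "row2", "row4"));

        List<Integer> missing = missingPositions(expected, seen);
        System.out.println("missing positions: " + missing);
        System.out.println("common stride: " + commonStride(missing));
    }
}
```

A stride of 2 would point at an "every other row" pattern, while -1 means the gaps are irregular and the missing positions are better compared against the table's split points by hand.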