Accumulo >> mail # user >> MapReduce mapper not seeing all rows

MapReduce mapper not seeing all rows

I'm running a MapReduce job over a table using AccumuloRowInputFormat.
For debugging purposes I'm logging key.getRow() so I can see which rows
it finds as it progresses.

If I don't specify any ranges on the input format, it skips a significant
number of rows; that is, I don't see any log output indicating that it
traversed them.

To see if it was a visibility issue, I tried explicitly setting a range,
like this:

        AccumuloRowInputFormat.setRanges(job.getConfiguration(), ranges);

With the ranges set, it does process the rows that it otherwise skips.

The same TimestampFilter is being applied in both scenarios, no other
filters / iterators are being used.

Any thoughts on why, when run without the ranges specified, it isn't seeing
a significant portion of the data?