MapReduce, mail # user - Predicting how many values will I see in a call to reduce?


Re: Predicting how many values will I see in a call to reduce?
Owen O'Malley 2010-11-09, 16:28
On Sun, Nov 7, 2010 at 5:38 AM, Anthony Urso <[EMAIL PROTECTED]> wrote:

> Is there any way to know how many values I will see in a call to
> reduce without first counting through them all with the iterator?
No, there currently isn't. The framework doesn't have that information until
the iterator is exhausted. The values are not held in memory; the iterator is
synthesized as the result of an N-way merge sort from disk and memory. If
your application needs the count, it has to compute it itself.
If your value sets are small enough to fit in memory, the easiest thing to
do is to read them into a list from the iterator (cloning each value, because
the framework reuses the same object for every value).
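
To illustrate the copy-into-a-list approach and why the cloning matters, here is a minimal sketch in plain Java. It does not use the Hadoop API: MutableValue and reusingIterable are hypothetical stand-ins that only imitate the way the framework hands back the same object on every step of the values iterator. In a real reducer the copy would be made with whatever copy mechanism the value type offers (for Text, its copy constructor).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReduceValueCount {

    /** Hypothetical stand-in for a mutable, reusable value type. */
    static final class MutableValue {
        private String content;
        MutableValue(String content) { this.content = content; }
        MutableValue(MutableValue other) { this.content = other.content; } // copy constructor
        void set(String content) { this.content = content; }
        String get() { return content; }
    }

    /** Imitates object reuse: every call to next() returns the SAME instance, refilled. */
    static Iterable<MutableValue> reusingIterable(final String... contents) {
        return new Iterable<MutableValue>() {
            @Override
            public Iterator<MutableValue> iterator() {
                return new Iterator<MutableValue>() {
                    private final MutableValue shared = new MutableValue("");
                    private int next = 0;

                    @Override
                    public boolean hasNext() { return next < contents.length; }

                    @Override
                    public MutableValue next() {
                        if (!hasNext()) throw new NoSuchElementException();
                        shared.set(contents[next++]);
                        return shared;
                    }
                };
            }
        };
    }

    /** Correct: copy each value before storing it, so the list size is the value count. */
    static List<MutableValue> copyAll(Iterable<MutableValue> values) {
        List<MutableValue> copies = new ArrayList<MutableValue>();
        for (MutableValue v : values) {
            copies.add(new MutableValue(v));
        }
        return copies;
    }

    /** Buggy on purpose: stores the reused reference, so every entry ends up identical. */
    static List<MutableValue> keepReferences(Iterable<MutableValue> values) {
        List<MutableValue> refs = new ArrayList<MutableValue>();
        for (MutableValue v : values) {
            refs.add(v);
        }
        return refs;
    }

    /** Helper for printing and checking list contents. */
    static List<String> contentsOf(List<MutableValue> values) {
        List<String> out = new ArrayList<String>();
        for (MutableValue v : values) {
            out.add(v.get());
        }
        return out;
    }

    public static void main(String[] args) {
        List<MutableValue> copied = copyAll(reusingIterable("a", "b", "c"));
        System.out.println("count  = " + copied.size());                 // count  = 3
        System.out.println("copied = " + contentsOf(copied));            // copied = [a, b, c]

        List<MutableValue> refs = keepReferences(reusingIterable("a", "b", "c"));
        System.out.println("refs   = " + contentsOf(refs));              // refs   = [c, c, c]
    }
}
```

Once the values are in the list, its size gives the count and the list can be traversed as often as needed, at the cost of holding every value for the key in memory.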

You could try using the resettable iterators, but I don't know how reliable
they are.

-- Owen