MapReduce >> mail # user >> Predicting how many values will I see in a call to reduce?


Re: Predicting how many values will I see in a call to reduce?
On Sun, Nov 7, 2010 at 5:38 AM, Anthony Urso <[EMAIL PROTECTED]> wrote:

> Is there any way to know how many values I will see in a call to
> reduce without first counting through them all with the iterator?
No, there currently isn't. The framework doesn't have that information until
the iterator is exhausted. The values are not held in memory; the iterator is
synthesized as the result of an N-way merge sort from disk and memory. If
your application needs the count, you could compute it in the application.
If your value sets are small enough to fit in memory, the easiest thing to
do is just read them into a list from the iterator (cloning the values to
avoid the object reuse!).
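
To make the object-reuse point concrete, here is a minimal plain-Java sketch that does not use Hadoop itself. MutableValue and reusingIterator are hypothetical stand-ins, assumed here to mimic a mutable Writable and a values iterator that hands back the same instance on every call. It only illustrates why the values have to be copied before they go into the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ReduceValueBuffering {

    // Hypothetical stand-in for a mutable value type such as a Writable.
    static final class MutableValue {
        private int v;
        MutableValue(int v) { this.v = v; }
        MutableValue(MutableValue other) { this.v = other.v; } // copy constructor
        int get() { return v; }
        void set(int v) { this.v = v; }
    }

    // Assumed model of the reuse behaviour: every next() overwrites and
    // returns the SAME instance instead of allocating a new one.
    static Iterator<MutableValue> reusingIterator(List<Integer> data) {
        final Iterator<Integer> src = data.iterator();
        final MutableValue shared = new MutableValue(0);
        return new Iterator<MutableValue>() {
            @Override public boolean hasNext() { return src.hasNext(); }
            @Override public MutableValue next() {
                shared.set(src.next());
                return shared;
            }
        };
    }

    // Buffers the values, copying each one; size() of the result is the count.
    static List<MutableValue> bufferCopies(Iterator<MutableValue> values) {
        List<MutableValue> out = new ArrayList<>();
        while (values.hasNext()) {
            out.add(new MutableValue(values.next()));
        }
        return out;
    }

    // The buggy variant: stores references, so every slot aliases one object.
    static List<MutableValue> bufferReferences(Iterator<MutableValue> values) {
        List<MutableValue> out = new ArrayList<>();
        while (values.hasNext()) {
            out.add(values.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(3, 1, 4);
        List<MutableValue> good = bufferCopies(reusingIterator(data));
        List<MutableValue> bad = bufferReferences(reusingIterator(data));
        System.out.println(good.size() + " values, first = " + good.get(0).get()); // 3 values, first = 3
        System.out.println(bad.size() + " values, first = " + bad.get(0).get());   // 3 values, first = 4
    }
}
```

In an actual reducer the copy step would depend on the value type: for Text, the copy constructor new Text(value) should do, and WritableUtils.clone(value, conf) is a more general option. Once the values are buffered, the list's size() gives the count before you process them.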

You could try using the resettable iterators, but I don't know how reliable
they are.

-- Owen