Hadoop >> mail # user >> Reducer MapFileOutputFormat


Mike S 2012-07-23, 20:09
Harsh J 2012-07-27, 22:06
syed kather 2012-07-27, 01:15
Bertrand Dechoux 2012-07-27, 05:54
Re: Reducer MapFileOutputFormat
Hi Bertrand,

I believe he is talking about MapFile's index files, explained here:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/MapFile.html
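
To make the per-reducer layout concrete, here is a minimal in-memory sketch in plain Java. It is a model, not the Hadoop API: the class `MapFileSketch`, its `Part` type, and the tiny `INDEX_INTERVAL` are all hypothetical names chosen for illustration. It assumes the job uses default hash partitioning, so each reducer ends up with its own sorted data plus its own sparse index, and a lookup first picks the part by the key's hash and then searches only that part's index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified model (assumption, not Hadoop code): one Part per reducer,
// each holding sorted records and a sparse index over them.
class MapFileSketch {
    // Hypothetical small interval so the sparse index is visible in a demo.
    static final int INDEX_INTERVAL = 2;

    // One reducer's output: sorted keys/values plus a sparse key -> position index.
    static final class Part {
        final List<String> keys = new ArrayList<>();
        final List<String> values = new ArrayList<>();
        final TreeMap<String, Integer> index = new TreeMap<>();
    }

    // Same rule as default hash partitioning: non-negative hash modulo part count.
    static int partitionFor(String key, int numParts) {
        return (key.hashCode() & Integer.MAX_VALUE) % numParts;
    }

    // Distribute records over numReducers parts; each part builds its own index.
    static Part[] write(Map<String, String> records, int numReducers) {
        List<TreeMap<String, String>> groups = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            groups.add(new TreeMap<>());
        }
        for (Map.Entry<String, String> e : records.entrySet()) {
            groups.get(partitionFor(e.getKey(), numReducers)).put(e.getKey(), e.getValue());
        }
        Part[] parts = new Part[numReducers];
        for (int i = 0; i < numReducers; i++) {
            Part p = new Part();
            for (Map.Entry<String, String> e : groups.get(i).entrySet()) {
                // Index every INDEX_INTERVAL-th key, starting with the first.
                if (p.keys.size() % INDEX_INTERVAL == 0) {
                    p.index.put(e.getKey(), p.keys.size());
                }
                p.keys.add(e.getKey());
                p.values.add(e.getValue());
            }
            parts[i] = p;
        }
        return parts;
    }

    // Look up a key: choose the part by hash, seek via its index, then scan forward.
    static String get(Part[] parts, String key) {
        Part p = parts[partitionFor(key, parts.length)];
        Map.Entry<String, Integer> start = p.index.floorEntry(key);
        if (start == null) {
            return null; // key sorts before everything in this part (or part is empty)
        }
        for (int i = start.getValue(); i < p.keys.size(); i++) {
            int cmp = p.keys.get(i).compareTo(key);
            if (cmp == 0) return p.values.get(i);
            if (cmp > 0) break; // passed the slot where the key would be
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> records = new HashMap<>();
        for (int i = 0; i < 10; i++) {
            records.put("key" + i, "val" + i);
        }
        Part[] parts = write(records, 3);
        System.out.println("parts: " + parts.length);
        System.out.println("key7 -> " + get(parts, "key7"));
    }
}
```

With real job output, `MapFileOutputFormat.getReaders` and `MapFileOutputFormat.getEntry` play a similar role: they open one reader per part and use the job's partitioner to choose which part to search, so a single merged index file is not needed for lookups.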

On Fri, Jul 27, 2012 at 11:24 AM, Bertrand Dechoux <[EMAIL PROTECTED]> wrote:
> Your use of 'index' is indeed not clear. Are you talking about Hive or
> HBase?
>
> I can confirm that you will have one result file per reducer. Of course,
> for efficiency reasons, you need to limit the number of files. But if you
> are using multiple reducers, it should mean that one reducer isn't fast
> enough, so it can be assumed that the output of each reducer is big
> enough. If that is not the case, you can limit the number of reducers to one.
>
> In general, the 'fragmentation' of the results is dealt with by the next job.
> You should provide more information about your real problem and its context.
>
> Bertrand
>
> On Fri, Jul 27, 2012 at 3:15 AM, syed kather <[EMAIL PROTECTED]> wrote:
>
>> Mike,
>> Can you please give more details? The context is not clear. Can you share
>> your use case if possible?
>> On Jul 24, 2012 1:40 AM, "Mike S" <[EMAIL PROTECTED]> wrote:
>>
>> > If I set my reducer output to the MapFile output format and the job has,
>> > say, 100 reducers, will the output contain 100 different index
>> > files (one for each reducer) or one index file for all the reducers
>> > (basically one index file per job)?
>> >
>> > If it is one index file per reducer, can I rely on HDFS append to change
>> > the index write behavior and build one index file from all the
>> > reducers, basically by making all the parallel reducers append to
>> > one index file? Data files do not matter.
>> >
>>
>
>
>
> --
> Bertrand Dechoux

--
Harsh J