Hive >> mail # user >> set mapred.map.tasks=1 not work


Re: set mapred.map.tasks=1 not work
I've tried JVM reuse, but it didn't help either.

Total time is about 130s. The data is only about 10 MB, all in small files, on 2 nodes.

Hive/Hadoop still runs 350+ map tasks ...

2010/6/10 Edward Capriolo <[EMAIL PROTECTED]>

> Also consider setting up JVM reuse; this will deal with some of the mapper
> startup penalty.
>
> How long is your query taking? How much data is there? How many nodes?
>
> On Thursday, June 10, 2010, wd <[EMAIL PROTECTED]> wrote:
> > set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
> >
> > and
> >
> > set hive.merge.size.per.task=1000000;
> > set hive.merge.mapfiles=true;
> >
> > seem to make no difference here; the time taken to execute
> > 'select a, count(1) from t1 group by a' is almost the same.
> >
> > Have I missed some other settings?
> >
> > 2010/6/10 wd <[EMAIL PROTECTED]>
> >
> > Thanks everyone, I'll try CombineHiveInputFormat. :)
> >
> > 2010/6/10 Namit Jain <[EMAIL PROTECTED]>
> >
> >
> > CombineHiveInputFormat
> >
> >
>
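
For anyone reading this thread later: mapred.map.tasks is only a hint, and the number of map tasks follows the number of input splits. The rough Python sketch below is not Hive or Hadoop code; it is a simplified model of why about 10 MB in small files can still produce 350+ maps, and what combining splits could do in the best case. The file count (350), the per-file size (30,000 bytes), the 64 MB block size, and the function names are all assumptions made up for illustration, and the packing ignores node and rack locality, which a real combining input format takes into account.

```python
import math


def maps_without_combining(file_sizes, block_size):
    """Estimate map tasks when every file is split on its own.

    Simplified model: one split per block, and at least one split per file.
    """
    return sum(max(1, math.ceil(size / block_size)) for size in file_sizes)


def maps_with_combining(file_sizes, max_split_size):
    """Estimate map tasks when small files are packed into shared splits.

    Greedy packing in input order; assumes each file is smaller than
    max_split_size and ignores data locality entirely.
    """
    splits = 0
    current = 0
    for size in file_sizes:
        # Close the current split if adding this file would overflow it.
        if current and current + size > max_split_size:
            splits += 1
            current = 0
        current += size
    # Count the last, partially filled split.
    return splits + (1 if current else 0)


if __name__ == "__main__":
    # Hypothetical input: 350 files of 30,000 bytes each (about 10 MB total).
    files = [30_000] * 350
    print(maps_without_combining(files, block_size=64 * 1024 * 1024))
    print(maps_with_combining(files, max_split_size=256_000_000))
```

Under these assumptions the first estimate is one map per file and the second is a single map, so per-task startup cost rather than data volume would dominate the 130s. If the real job still shows 350+ maps after switching the input format, that suggests the setting is not taking effect on that Hive/Hadoop version rather than a split-size problem.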