Re: Making Pig run faster in local mode
Hi Malc,

Unless I am mistaken, all operations happen serially in local mode, so a
group by will always be performed by a single reducer.

You can either use MR mode to take advantage of parallelism, or reduce the
size of the data to be grouped, if possible.
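
A minimal Pig Latin sketch of both options follows. The file name, schema, and
aliases are hypothetical, since the original script was not posted. It trims
the data before the group by, and adds a PARALLEL clause, which requests more
reducers in MR mode and, as far as I know, has no effect in local mode.

```pig
-- Hypothetical input file and schema; substitute your own.
raw = LOAD 'input1.txt' USING PigStorage('\t')
      AS (key:chararray, category:chararray, value:long);

-- Option 2: shrink the data before grouping.
-- Keep only the columns the aggregation needs...
slim = FOREACH raw GENERATE key, value;
-- ...and drop rows that cannot contribute to the result.
kept = FILTER slim BY value IS NOT NULL;

-- Option 1: PARALLEL asks for 4 reducers when the script runs in MR mode.
grouped = GROUP kept BY key PARALLEL 4;
totals  = FOREACH grouped GENERATE group AS key, SUM(kept.value) AS total;

STORE totals INTO 'output_totals';
```

The same script can be run either way: `pig -x local script.pig` for local
mode, or `pig -x mapreduce script.pig` for MR mode (script.pig is a
placeholder name).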

Hope this is helpful.

Thanks,
Cheolsoo
On Fri, Jan 4, 2013 at 9:35 AM, Malcolm Tye <[EMAIL PROTECTED]> wrote:

> Hi,
>
> Any ideas on how to make Pig run quicker when running it in local mode?
>
> I'm processing 3 files of about 13MB each, with 3 group by statements in my
> script which seem to suck up the time. There are no joins.
>
> Increasing the heap size has made no difference, and it doesn't use all of
> that anyway.
>
> I'm on default settings apart from that.
>
> Thanks
>
> Malc