Malcolm Tye 2013-01-04, 17:35
Jonathan Coveney 2013-01-04, 19:07
Russell Jurney 2013-01-04, 22:04
Malcolm Tye 2013-01-07, 11:16
Cheolsoo Park 2013-01-07, 19:55
Cheolsoo Park 2013-01-07, 19:56
Dmitriy Ryaboy 2013-01-08, 07:36
Malcolm Tye 2013-01-21, 14:01

Re: Making Pig run faster in local mode
Hi Malc,

Unless I am mistaken, all operations run serially in local mode, so a
group by will always be performed by a single reducer.

You can either use MapReduce mode to take advantage of parallelism, or
reduce the size of the data to be grouped, if possible.
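
As a rough sketch of both options, something along these lines might
work. The file paths, field names, and reducer count are made up for
illustration, since I have not seen your script, so treat them as
placeholders rather than a recommendation.

```pig
-- Hypothetical input and schema, for illustration only.
raw = LOAD 'input/file1.csv' USING PigStorage(',')
      AS (id:chararray, category:chararray, value:double, note:chararray);

-- Reduce the data before grouping: keep only the columns the
-- aggregation needs and drop rows that cannot contribute to it.
projected = FOREACH raw GENERATE category, value;
slim      = FILTER projected BY value IS NOT NULL;

-- In MapReduce mode, PARALLEL requests a number of reducers for this
-- GROUP. The value 4 is an arbitrary example. As far as I know it
-- makes no difference in local mode.
grouped = GROUP slim BY category PARALLEL 4;
totals  = FOREACH grouped GENERATE group AS category,
                                   SUM(slim.value) AS total;

STORE totals INTO 'output/totals';
```

Running the script with `pig -x mapreduce` instead of `pig -x local`
selects MapReduce mode. With inputs of about 13MB each, the job startup
overhead on a cluster may outweigh any gain from parallelism, so it is
worth timing both.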

Hope this is helpful.

Thanks,
Cheolsoo
On Fri, Jan 4, 2013 at 9:35 AM, Malcolm Tye <[EMAIL PROTECTED]> wrote:

> Hi,
>
> Any ideas on how to make Pig run quicker when running it in local mode?
>
> I'm processing 3 files of about 13MB each with 3 group by statements in my
> script, which seem to suck up the time. There are no joins.
>
> Increasing the heap size has made no difference, and it doesn't use all of
> that anyway.
>
> I'm on default settings apart from that.
>
> Thanks
>
> Malc