Pig user mailing list: Join, Filter on the same line and optimization


Mohit Anchlia 2012-04-11, 22:39
Re: Join,Filter on the same line and optimization
Please take a look at http://pig.apache.org/docs/r0.9.1/perf.html#filter
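
As far as I know, Pig Latin has no combined "JOIN ... AND FILTER" statement, so the filter has to be its own statement. The section linked above recommends filtering as early as possible, which here means filtering A before the join. A minimal sketch, assuming made-up input paths and field types (only the relation and field names come from the original question):

```pig
-- Hypothetical inputs: the paths and the field types are assumptions.
A = LOAD 'input_a' AS (FILE_NAME:chararray, CREATED_DATE:chararray,
                       FORM_ID:int, FORM_ID_ROOT:int);
B = LOAD 'input_b' AS (FILE_NAME:chararray, CREATED_DATE:chararray,
                       FORM_ID:int, FORM_ID_ROOT:int);

-- Filter first, so fewer records reach the join.
A_FILTERED = FILTER A BY FORM_ID == 0;

-- Join the filtered relation with B on the same four keys.
F = JOIN A_FILTERED BY (FILE_NAME, CREATED_DATE, FORM_ID, FORM_ID_ROOT),
         B BY (FILE_NAME, CREATED_DATE, FORM_ID, FORM_ID_ROOT);

-- Print the logical, physical and MapReduce plans to see how many jobs run.
EXPLAIN F;
```

If the filter is written after the join instead, the field has to be disambiguated (A::FORM_ID), because both inputs carry a FORM_ID column. The optimizer has a PushUpFilter rule that may move such a filter ahead of the join anyway, and a plain join followed by a filter should normally compile into a single MapReduce job, but the EXPLAIN output is the way to confirm that for a given script and Pig version.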

On Wed, Apr 11, 2012 at 3:39 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:

> Is it possible to say something like:
>
> F = JOIN A BY (FILE_NAME,CREATED_DATE,FORM_ID,FORM_ID_ROOT), B BY
> (FILE_NAME,CREATED_DATE,FORM_ID,FORM_ID_ROOT) AND FILTER A BY FORM_ID == 0;
>
> Also, how far does Pig go in optimizing the job if I instead write the
> line above as two statements, for instance:
>
> F = JOIN A BY (FILE_NAME,CREATED_DATE,FORM_ID,FORM_ID_ROOT), B BY
> (FILE_NAME,CREATED_DATE,FORM_ID,FORM_ID_ROOT);
>
> G = FILTER F BY FORM_ID == 0;
>
> Would Pig run only one reduce job or multiple jobs in the case above?
>
Dmitriy Ryaboy 2012-04-12, 17:03