Ambastha, Abhishek 2013-09-10, 05:07
The cogroup operator and a filter should get you what you want:
-- First, filter table1
t1_filtered = filter table1 by reporting_dt <= '01APR2013';
-- Cogroup, this results in 'bags' you can check
grouped = cogroup
t1_filtered by serial_num,
table2 by serial_num,
table3 by serial_num;
-- Do outer join and max in same projection
maxdate = foreach (filter grouped by not IsEmpty(table2) and IsEmpty(table3)) generate
group as serial_num,
MAX(t1_filtered.reporting_dt) as reporting_dt;
Note that I haven't actually ran this yet but it's pretty straightforward pig.
On Sep 10, 2013, at 12:07 AM, "Ambastha, Abhishek" <[EMAIL PROTECTED]> wrote:
> I want to write a pig query for the below SQL:
> CREATE TABLE MAXDATE AS
> SELECT SERIAL_NUM, MAX(REPORTING_DT) AS REPORTING_DT
> FROM Table_1
> WHERE REPORTING_DT <= '01APR2013' AND
> SERIAL_NUM IN (SELECT SERIAL_NUM FROM TABLE_2) AND
> SERIAL_NUM NOT IN (SELECT SERIAL_NUM FROM TABLE_3)
> GROUP BY SERIAL_NUM;
> REPORTING_DT is a string but I want Maximum Date. Probably, I could use CustomFormatToISO for Date conversion and get the Maximum Value.
> Join is on Table_1, Table_2 and Table_3.
> Please suggest how to do this.