Pig >> mail # user >> Wrong output with Multiquery optimizer


Wrong output with Multiquery optimizer
Hi,
      I have a script that runs multiple jobs, and a considerable amount
of multiquery optimization is applied to it.
However, the script appears to generate wrong output when multiquery is
enabled. The output is fine when multiquery is disabled with the -M option.

Attached is a trimmed-down version of the actual script. The data is
getting messed up in the nested foreach, which is defined inside a macro.
The UDF aaa.RANKING() adds a simple rank over the ordered data.
A sample of the expected output (without multiquery) is below:

1,3,1,1378339200,9779,http:///www.abc12345.com/JQueryAddUserControl.aspx,68445,3333,6,99999,6,0
1,3,2,1378339200,9779,http:///www.abc12345.com/EN/IN/Home.aspx,113961,3333,3,99999,0,0
1,3,3,1378339200,9779,http:///images.abc12345.com/Img/Tabs/servicestab_expandshadow.gif,2686,3333,2,99999,0,0
1,3,4,1378339200,9779,http:///www.abc12345.com/Images/Rent_a_Car_414x207.jpg,30616,3333,2,99999,0,0
1,3,5,1378339200,9779,http:///images.abc12345.com/Img/Tabs/servicestabon_linehide.gif,2203,3333,2,99999,0,0
1,3,6,1378339200,9779,http:///images.abc12345.com/Img/Common/dottedlinehr.gif,2108,3333,2,99999,0,0
1,3,7,1378339200,9779,http:///www.abc12345.com/WebResource.axd,2688,3333,2,99999,0,0
1,3,8,1378339200,9779,http:///www.abc12345.com/Scripts/Button/mouseoverbutton.js,2526,3333,2,99999,0,0
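
For readers without the attachment, the intended behaviour (order the rows within each group, then number them from 1) can be sketched in Python. This is only an illustration, not the Pig script or the aaa.RANKING() UDF itself: the (group, url, hits) row layout and the function name rank_within_groups are hypothetical, and sorting by a descending hit count is a guess based on the ninth column of the sample above.

```python
from itertools import groupby
from operator import itemgetter


def rank_within_groups(rows):
    """Rank rows inside each group, highest hit count first.

    rows: iterable of (group_key, url, hits) tuples. This layout is a
    hypothetical simplification of the real records.
    Returns a list of (group_key, rank, url, hits) tuples.
    """
    # groupby only merges adjacent items, so sort by group key first.
    by_group = sorted(rows, key=itemgetter(0))
    ranked = []
    for group_key, members in groupby(by_group, key=itemgetter(0)):
        # Assumed ordering: descending hit count. The sort is stable,
        # so ties keep their input order.
        ordered = sorted(members, key=itemgetter(2), reverse=True)
        for rank, (_, url, hits) in enumerate(ordered, start=1):
            ranked.append((group_key, rank, url, hits))
    return ranked


sample = [
    ("1,3", "/EN/IN/Home.aspx", 3),
    ("1,3", "/JQueryAddUserControl.aspx", 6),
    ("1,3", "/WebResource.axd", 2),
]
ranked = rank_within_groups(sample)
```

With this model every rank in a group is unique and consecutive, which is the property the expected output above shows.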

But with multiquery on, the output looks like this:

1,3,52,1378339200,9779,http:///www.abc12345.com/Images/UAE_Visa_Marhaba_Services_382x208_New.jpg,1228,3333,1,99999,0,0
1,3,18,1378339200,9779,http:///images.abc12345.com/Img/TooltipYellow/TooltipYellowArrowBottom.png,1695,3333,1,99999,0,0
1,3,56,1378339200,9779,http:///www.abc12345.com/App_Themes/Default/Img/Common/arrowblue_right.gif,1226,3333,1,99999,0,0
1,3,90,1378339200,9779,http:///www.abc12345.com/Scripts/PNRStatus.js,1205,3333,1,99999,0,0
1,3,51,1378339200,9779,http:///images.abc12345.com/Img/Obe/obe_bg.gif,1081,3333,1,99999,0,0
1,3,52,1378339200,9779,http:///static.abc12345.com/Scripts/Obe/Obe.js,1076,3333,1,99999,0,0

Note: the ordering is lost, and two rows end up with the same key
(1,3,52). This happens in both 0.11.1 and 0.10.
Running with -t All did not help either.
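
The two symptoms noted above (duplicate keys and lost ordering) can be checked mechanically when comparing runs with and without multiquery. The following is a minimal Python sketch under stated assumptions: the function name check_ranked_output is hypothetical, the key is assumed to be the first two columns plus the rank in the third column, rows are assumed to contain no commas inside fields, and the output is assumed to be sorted by that key when correct.

```python
from collections import Counter


def check_ranked_output(lines, group_cols=(0, 1), rank_col=2):
    """Return (duplicate_keys, in_order) for comma-separated output rows.

    Assumed key: the group columns plus the integer rank column.
    in_order is True when rows already appear sorted by that key.
    """
    rows = [ln.strip().split(",") for ln in lines if ln.strip()]
    keys = [
        tuple(r[c] for c in group_cols) + (int(r[rank_col]),)
        for r in rows
    ]
    duplicates = sorted(k for k, n in Counter(keys).items() if n > 1)
    in_order = keys == sorted(keys)
    return duplicates, in_order


# Rows abbreviated from the two samples in this message.
good = [
    "1,3,1,1378339200,9779",
    "1,3,2,1378339200,9779",
    "1,3,3,1378339200,9779",
]
bad = [
    "1,3,52,1378339200,9779",
    "1,3,18,1378339200,9779",
    "1,3,52,1378339200,9779",
]
good_result = check_ranked_output(good)
bad_result = check_ranked_output(bad)
```

On the abbreviated rows, the run without multiquery reports no duplicates and sorted output, while the multiquery run reports the repeated key ('1', '3', 52) and unsorted output.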

I would like to understand whether I am doing something wrong in the
script that causes this behavior. So far I haven't found a workaround
other than disabling multiquery.

Thanks
Vivek