Pig user mailing list


Different results when passing the -M parameter
Hi all,

Why do I get different results when I pass the -M (-no_multiquery) parameter?

If I do not pass the -M parameter, the result is not correct.

Thanks.

Pig script:

set job.name '$jobname'
rawLog = load '$input' as (line);

schemeData = stream rawLog through `/data/hadoop/loganalyze/cut.php` as (platForm, userKey, reqType, catId);
listDataSet = FILTER schemeData by reqType == 1;
itemDataSet = FILTER schemeData by reqType == 2;
taokeDataSet = FILTER schemeData by reqType == 3;

-- list: distinct userKey count and row count per (platForm, catId)
listDataSetGroup = group listDataSet by (platForm, catId);

listDataUvPv = FOREACH listDataSetGroup {
    D = DISTINCT listDataSet.userKey;
    GENERATE FLATTEN(group), COUNT(D), COUNT($1);
};

outputSet1 = stream listDataUvPv through `/data/hadoop/loganalyze/test.php 1`;

store outputSet1 INTO '$output/list';

-- item: distinct userKey count and row count per (platForm, catId)
itemDataSetGroup = group itemDataSet by (platForm, catId);

itemDataUvPv = FOREACH itemDataSetGroup {
    D = DISTINCT itemDataSet.userKey;
    GENERATE FLATTEN(group), COUNT(D), COUNT($1);
};

outputSet2 = stream itemDataUvPv through `/data/hadoop/loganalyze/test.php 2`;

store outputSet2 INTO '$output/item';

-- taoke: distinct userKey count and row count per (platForm, catId)
taokeDataSetGroup = group taokeDataSet by (platForm, catId);

taokeDataUvPv = FOREACH taokeDataSetGroup {
    D = DISTINCT taokeDataSet.userKey;
    GENERATE FLATTEN(group), COUNT(D), COUNT($1);
};

outputSet3 = stream taokeDataUvPv through `/data/hadoop/loganalyze/test.php 3`;

store outputSet3 INTO '$output/taoke';
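
To tell which of the two runs is actually correct, it may help to compute the same counts independently on a small sample. Below is a minimal Python sketch of what each FOREACH block above computes before the result is streamed through test.php. It rests on assumptions not confirmed by the script: that cut.php emits four tab-separated fields per line (platForm, userKey, reqType, catId), and that no field is empty. The names `parse`, `uv_pv`, and the sample rows are hypothetical.

```python
from collections import defaultdict


def parse(lines):
    """Yield (platForm, userKey, reqType, catId) tuples.

    Assumption: cut.php writes four tab-separated fields per line;
    lines with a different field count are skipped.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 4:
            yield tuple(fields)


def uv_pv(rows, req_type):
    """Return {(platForm, catId): (distinct userKey count, row count)}
    for the rows whose reqType equals req_type."""
    users = defaultdict(set)   # distinct userKey values per group
    views = defaultdict(int)   # number of rows per group
    for plat_form, user_key, row_type, cat_id in rows:
        if int(row_type) != req_type:
            continue
        key = (plat_form, cat_id)
        users[key].add(user_key)
        views[key] += 1
    return {key: (len(users[key]), views[key]) for key in views}


# Hypothetical sample standing in for the output of cut.php.
sample = [
    "web\tu1\t1\tc10",
    "web\tu1\t1\tc10",
    "web\tu2\t1\tc10",
    "app\tu3\t2\tc20",
]

list_counts = uv_pv(parse(sample), 1)
item_counts = uv_pv(parse(sample), 2)
taoke_counts = uv_pv(parse(sample), 3)
```

Running this on a small slice of the real cut.php output and comparing it with the rows that reach test.php, once with -M and once without, should show which run diverges.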