Pig >> mail # user >> When passing the -M parameter to get different results


When passing the -M parameter to get different results
Hi all,

Why do I get different results depending on whether I pass the -M parameter?

When I pass -M, I get one set of results. When I do not pass -M, the result is not correct.

Thanks.

Pig script:

set job.name '$jobname'
rawLog = load '$input' as (line);

schemeData = stream rawLog through `/data/hadoop/loganalyze/cut.php` as (platForm, userKey, reqType, catId);
listDataSet = FILTER schemeData by reqType == 1;
itemDataSet = FILTER schemeData by reqType == 2;
taokeDataSet = FILTER schemeData by reqType == 3;

--list
listDataSetGroup = group listDataSet by (platForm, catId);

listDataUvPv = FOREACH listDataSetGroup {
    D = DISTINCT listDataSet.userKey;
    GENERATE FLATTEN(group), COUNT(D), COUNT($1);
};

outputSet1 = stream listDataUvPv through `/data/hadoop/loganalyze/test.php 1`;

store outputSet1 INTO '$output/list';

itemDataSetGroup = group itemDataSet by (platForm, catId);

itemDataUvPv = FOREACH itemDataSetGroup {
    D = DISTINCT itemDataSet.userKey;
    GENERATE FLATTEN(group), COUNT(D), COUNT($1);
};

outputSet2 = stream itemDataUvPv through `/data/hadoop/loganalyze/test.php 2`;
store outputSet2 INTO '$output/item';

taokeDataSetGroup = group taokeDataSet by (platForm, catId);

taokeDataUvPv = FOREACH taokeDataSetGroup {
    D = DISTINCT taokeDataSet.userKey;
    GENERATE FLATTEN(group), COUNT(D), COUNT($1);
};

outputSet3 = stream taokeDataUvPv through `/data/hadoop/loganalyze/test.php 3`;
store outputSet3 INTO '$output/taoke';
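
To tell which of the two runs (with or without -M) is the correct one, it may help to recompute the per-group counts for a small sample outside Pig. The following is only a rough sketch in Python, not part of the job above. It assumes the rows are already in the four-field form that cut.php is declared to emit (platForm, userKey, reqType, catId), it ignores Pig's null handling in COUNT, and it leaves out the test.php post-processing step. The function name uv_pv and the sample rows are hypothetical.

```python
from collections import defaultdict


def uv_pv(rows, req_type):
    """Return {(platForm, catId): (distinct userKey count, row count)}.

    Mirrors FILTER by reqType, GROUP by (platForm, catId), then
    COUNT(DISTINCT userKey) and COUNT(rows) from the script.
    Assumption: no null fields, so Pig's null-skipping in COUNT is ignored.
    """
    users = defaultdict(set)   # group key -> set of distinct userKey values
    views = defaultdict(int)   # group key -> number of rows in the group
    for plat_form, user_key, row_type, cat_id in rows:
        if int(row_type) != req_type:
            continue
        key = (plat_form, cat_id)
        users[key].add(user_key)
        views[key] += 1
    return {key: (len(users[key]), views[key]) for key in views}


# Hypothetical sample rows: (platForm, userKey, reqType, catId)
sample = [
    ("web", "u1", "1", "c1"),
    ("web", "u1", "1", "c1"),
    ("web", "u2", "1", "c1"),
    ("wap", "u3", "2", "c2"),
]

if __name__ == "__main__":
    for req_type in (1, 2, 3):
        print(req_type, uv_pv(sample, req_type))
```

Comparing these numbers with the input to test.php in each run should show whether the counts themselves differ between the two modes or whether the difference appears only after the final streaming step.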