Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Pig >> mail # dev >> [jira] [Created] (PIG-3749) PigPerformance - data in the map gets lost during parsing


Copy link to this message
-
[jira] [Created] (PIG-3749) PigPerformance - data in the map gets lost during parsing
Keren Ouaknine created PIG-3749:
-----------------------------------

             Summary: PigPerformance - data in the map gets lost during parsing
                 Key: PIG-3749
                 URL: https://issues.apache.org/jira/browse/PIG-3749
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.12.0
            Reporter: Keren Ouaknine
Create a Pigmix sample dataset which looks as follow:
keren 1 2 qt 3 4 5.0 a�aaa�b�bbb mc�ccc�d�ddd�e�eee�d�mf�fff�g�ggg�h�hhh

Launch the following query:
A = load 'page_views_sample.txt' using org.apache.pig.test.pigmix.udf.PigPerformanceLoader()
    as (user, action, timespent, query_term, ip_addr, timestamp, estimated_revenue, page_info, page_links);
store A into 'L1out_A';

B = foreach A generate user, (int)action as action, (map[])page_info as page_info, flatten((bag{tuple(map[])})page_links) as page_links;
store B into 'L1out_B';

The result looks like this:
keren 1 [b#bbb,a#aaa] [d#,e#eee,c#ccc]
keren 1 [b#bbb,a#aaa] [f#fff,g#ggg,h#hhh

It is missing the 'ddd' value and a closing bracket.

Thanks,
Keren

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

 
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB