Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Pig >> mail # dev >> [jira] [Created] (PIG-3749) PigPerformance - data in the map gets lost during parsing


Copy link to this message
-
[jira] [Created] (PIG-3749) PigPerformance - data in the map gets lost during parsing
Keren Ouaknine created PIG-3749:
-----------------------------------

             Summary: PigPerformance - data in the map gets lost during parsing
                 Key: PIG-3749
                 URL: https://issues.apache.org/jira/browse/PIG-3749
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.12.0
            Reporter: Keren Ouaknine
Create a Pigmix sample dataset which looks as follow:
keren 1 2 qt 3 4 5.0 a�aaa�b�bbb mc�ccc�d�ddd�e�eee�d�mf�fff�g�ggg�h�hhh

Launch the following query:
A = load 'page_views_sample.txt' using org.apache.pig.test.pigmix.udf.PigPerformanceLoader()
    as (user, action, timespent, query_term, ip_addr, timestamp, estimated_revenue, page_info, page_links);
store A into 'L1out_A';

B = foreach A generate user, (int)action as action, (map[])page_info as page_info, flatten((bag{tuple(map[])})page_links) as page_links;
store B into 'L1out_B';

The result looks like this:
keren 1 [b#bbb,a#aaa] [d#,e#eee,c#ccc]
keren 1 [b#bbb,a#aaa] [f#fff,g#ggg,h#hhh

It is missing the 'ddd' value and a closing bracket.

Thanks,
Keren

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)