Pig, mail # user - a simple logic causes very long compiling time on pig 0.10.0


Re: a simple logic causes very long compiling time on pig 0.10.0
Jonathan Coveney 2012-06-26, 23:01
This is a great find. Please file a ticket.

My guess is that there is some backtracking in the parser, which blows up
as the conditionals nest more deeply.
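
To illustrate the kind of blow-up that guess refers to, here is a minimal Python sketch. It is not Pig's parser: the toy grammar, the class names (NaiveParser, MemoParser) and the helper functions are all made up for illustration. The naive parser tries "term '+' expr" before "term", so every time the '+' is missing it throws the term away and parses it again. With nested conditionals that re-parsing repeats at every level, so the work roughly doubles per level, while caching term results per position keeps it linear.

```python
import re


def tokenize(text):
    # Numbers, identifiers, '==', and single-character punctuation.
    return re.findall(r"\d+|\w+|==|[()?:+]", text)


def nested_bincond(depth):
    # Build "(m == 1 ? 1 : (m == 2 ? 2 : ... 0))" with `depth` nested levels.
    expr = "0"
    for k in range(depth, 0, -1):
        expr = f"(m == {k} ? {k} : {expr})"
    return expr


class NaiveParser:
    """Hypothetical toy grammar (an assumption, not Pig's real grammar):

        expr := term '+' expr | term
        term := NUM | '(' ID '==' NUM '?' expr ':' expr ')'

    Alternatives are tried in order with full backtracking.
    Each parse method returns the position after the match, or None.
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.calls = 0  # number of parse_expr invocations

    def peek(self, pos):
        return self.tokens[pos] if pos < len(self.tokens) else None

    def parse_expr(self, pos):
        self.calls += 1
        # Alternative 1: term '+' expr
        end = self.parse_term(pos)
        if end is not None and self.peek(end) == "+":
            rest = self.parse_expr(end + 1)
            if rest is not None:
                return rest
        # Alternative 2: term (parses the same term again from scratch)
        return self.parse_term(pos)

    def parse_term(self, pos):
        tok = self.peek(pos)
        if tok is not None and tok.isdigit():
            return pos + 1
        if tok != "(":
            return None
        # The toy only checks the '==' and '?' of the condition.
        if self.peek(pos + 2) != "==" or self.peek(pos + 4) != "?":
            return None
        pos = self.parse_expr(pos + 5)  # "then" branch
        if pos is None or self.peek(pos) != ":":
            return None
        pos = self.parse_expr(pos + 1)  # "else" branch (the nested part)
        if pos is None or self.peek(pos) != ")":
            return None
        return pos + 1


class MemoParser(NaiveParser):
    """Same grammar, but parse_term results are cached per position."""

    def __init__(self, tokens):
        super().__init__(tokens)
        self.memo = {}

    def parse_term(self, pos):
        if pos not in self.memo:
            self.memo[pos] = super().parse_term(pos)
        return self.memo[pos]


def count_calls(parser_cls, depth):
    tokens = tokenize(nested_bincond(depth))
    parser = parser_cls(tokens)
    assert parser.parse_expr(0) == len(tokens)  # the whole input must parse
    return parser.calls


if __name__ == "__main__":
    # Columns: nesting depth, naive call count, memoized call count.
    for depth in (1, 2, 4, 8, 11):
        print(depth, count_calls(NaiveParser, depth), count_calls(MemoParser, depth))
```

Whether Pig 0.10.0 actually behaves this way is still only a guess; the sketch just shows why a small number of extra nesting levels can turn an instant parse into a very slow one.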

2012/6/26 Clay B. <[EMAIL PROTECTED]>

> It's worth pointing out that Pig 0.9.2 also runs quickly; we only see the
> degradation with Pig 0.10.0.
>
> The degradation in performance seems to have a knee: 4 or 5 conditionals
> work as expected, but the script as presented takes about 6 minutes at the
> GRUNT> prompt after hitting enter, before any Hadoop execution.
>
> -Clay
>
>
> On Tue, 26 Jun 2012, Danfeng Li wrote:
>
>
>> We found that the following simple logic causes a very long compile time
>> in Pig 0.10.0, while with Pig 0.8.1 everything is fine.
>>
>> A = load 'A.txt' using PigStorage() AS (m: int);
>>
>> B = FOREACH A {
>>     days_str = (chararray)
>>         (m == 1 ? 31:
>>         (m == 2 ? 28:
>>         (m == 3 ? 31:
>>         (m == 4 ? 30:
>>         (m == 5 ? 31:
>>         (m == 6 ? 30:
>>         (m == 7 ? 31:
>>         (m == 8 ? 31:
>>         (m == 9 ? 30:
>>         (m == 10 ? 31:
>>         (m == 11 ? 30:31)))))))))));
>>     GENERATE
>>         days_str as days_str;
>> }
>>
>> store B into 'B';
>>
>> Here is the Pig version we used in the test:
>>
>> Apache Pig version 0.10.0-SNAPSHOT (rexported)
>>
>> Attached is the pig code and an example input file.
>>
>> Dan