Pig >> mail # user >> a simple logic causes very long compiling time on pig 0.10.0


Re: a simple logic causes very long compiling time on pig 0.10.0
This is a great find. Please file a ticket.

My guess is that there is some backtracking in the parser, which explodes
as the number of nested conditionals grows.
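
To illustrate the kind of blow-up guessed at above, here is a minimal Python sketch of a naive backtracking parser. It is not Pig's grammar or parser: the toy grammar (`expr := 'a' | '(' expr ')' '!' | '(' expr ')'`) and the name `count_calls` are made up for the example. The point is only that when two alternatives share a parenthesized prefix and the first one fails late, the inner expression gets parsed twice at every nesting level, so the work doubles with each extra level.

```python
def count_calls(depth):
    """Parse '(' * depth + 'a' + ')' * depth with naive backtracking.

    Returns how many times the expr() rule was entered.
    The grammar is hypothetical and exists only for this illustration.
    """
    text = "(" * depth + "a" + ")" * depth
    calls = 0

    def expr(pos):
        # Returns the position just past a parsed expr, or None on failure.
        nonlocal calls
        calls += 1
        if pos < len(text) and text[pos] == "a":
            return pos + 1
        if pos < len(text) and text[pos] == "(":
            # Alternative 1: '(' expr ')' '!'
            end = expr(pos + 1)
            if end is not None and text[end:end + 2] == ")!":
                return end + 2
            # Alternative 2: '(' expr ')'
            # Backtrack and parse the same inner expr a second time.
            end = expr(pos + 1)
            if end is not None and text[end:end + 1] == ")":
                return end + 1
        return None

    if expr(0) != len(text):
        raise ValueError("input did not parse")
    return calls


if __name__ == "__main__":
    for depth in (4, 5, 11, 20):
        print(depth, count_calls(depth))
```

In this toy, each extra level of nesting costs twice the calls of the level below plus one, which would produce the sort of sharp knee described in the thread. Memoizing `expr` by position (packrat-style) would make the same grammar linear.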

2012/6/26 Clay B. <[EMAIL PROTECTED]>

> It's worth pointing out that Pig 0.9.2 also runs quickly; we only see the
> degradation with Pig 0.10.0.
>
> The performance degradation seems to have a knee: 4 or 5 conditionals
> work as expected, but the script as presented takes about 6 minutes at the
> GRUNT> prompt after hitting enter, before any Hadoop execution.
>
> -Clay
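
To locate the knee more precisely for a ticket, one could generate variants of the script with 1 to 11 nested conditionals and time each one at the prompt. The following Python sketch does the generation only; the helper names `nested_bincond` and `pig_script` are invented here, and the script text is copied from the original report below.

```python
# Days per month, January..December, as used in the reported script.
DAYS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def nested_bincond(n):
    """Build a chain of n nested conditionals on field m (1 <= n <= 11)."""
    if not 1 <= n <= 11:
        raise ValueError("n must be between 1 and 11")
    # The innermost "else" value is the day count of month n + 1.
    expr = str(DAYS[n])
    for month in range(n, 0, -1):
        expr = "(m == %d ? %d: %s)" % (month, DAYS[month - 1], expr)
    return expr


def pig_script(n):
    """Return the reported Pig script with n nested conditionals."""
    lines = [
        "A = load 'A.txt' using PigStorage() AS (m: int);",
        "B = FOREACH A {",
        "    days_str = (chararray) " + nested_bincond(n) + ";",
        "    GENERATE days_str as days_str;",
        "}",
        "store B into 'B';",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for n in range(1, 12):
        with open("bincond_%02d.pig" % n, "w") as handle:
            handle.write(pig_script(n))
```

With n = 11 the generated expression matches the one in the report, so the largest file should reproduce the slow case and the smaller ones show where the compile time starts to climb.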
>
>
> On Tue, 26 Jun 2012, Danfeng Li wrote:
>
>
>> We found that the following simple logic causes a very long compile time
>> in Pig 0.10.0, while with Pig 0.8.1 everything is fine.
>>
>>
>>
>> A = load 'A.txt' using PigStorage() AS (m: int);
>>
>> B = FOREACH A {
>>     days_str = (chararray)
>>         (m == 1 ? 31:
>>         (m == 2 ? 28:
>>         (m == 3 ? 31:
>>         (m == 4 ? 30:
>>         (m == 5 ? 31:
>>         (m == 6 ? 30:
>>         (m == 7 ? 31:
>>         (m == 8 ? 31:
>>         (m == 9 ? 30:
>>         (m == 10 ? 31:
>>         (m == 11 ? 30:31)))))))))));
>>     GENERATE
>>         days_str as days_str;
>> }
>>
>> store B into 'B';
>>
>>
>>
>> Here's the Pig version we used in the test:
>>
>> Apache Pig version 0.10.0-SNAPSHOT (rexported)
>>
>>
>>
>>
>>
>> Attached is the pig code and an example input file.
>>
>>
>>
>> Dan