Pig >> mail # user >> LIMIT Issue


Matthew Smith 2010-08-04, 22:07
Ashutosh Chauhan 2010-08-05, 16:53
Matthew Smith 2010-08-05, 17:57
Ashutosh Chauhan 2010-08-05, 19:10
Matthew Smith 2010-08-05, 21:54
Ashutosh Chauhan 2010-08-06, 06:43
Matthew Smith 2010-08-06, 14:16
It looks like a bug then. Do you have a script and a small enough
dataset that reproduces the issue and that you can upload to JIRA? If
so, go ahead and create a JIRA ticket with the script and data. Are you
using local mode or mapreduce mode?

Ashutosh
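
A reproduction script of the kind requested above might look like the sketch below. It reuses the statements quoted later in this thread. The file name sample.txt and the script name repro.pig are hypothetical, and the filter value is assumed from the dump of B shown in the quoted message. It is a starting point for a ticket, not a confirmed reproduction of the bug.

```
-- repro.pig (hypothetical name); sample.txt is a hypothetical
-- pipe-delimited extract small enough to attach to the ticket.
A = LOAD 'sample.txt' USING PigStorage('|')
    AS (sIP:chararray, dIP:chararray, sPort:int, dPort:int,
        protocol:int, bytes:int, flags:chararray);

-- Filter value assumed from the dump of B in the quoted message.
B = FILTER A BY sIP matches '58.72.19.26';

-- The reported InvalidInputException appears when C is dumped.
C = ORDER B BY bytes DESC;

-- The PigServer run reportedly produced ORDERed but unLIMITed output.
D = LIMIT C 10;
DUMP D;
```

Running the same script once with `pig -x local repro.pig` and once with `pig -x mapreduce repro.pig` would show whether the failure depends on the execution mode.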
On Fri, Aug 6, 2010 at 07:16, Matthew Smith <[EMAIL PROTECTED]> wrote:
> B is not empty:
> (58.72.19.26, 58.72.19.26,38627,22196,6,512, FS PA)
> (58.72.19.26, 36.65.53.83,44133,10957,6,646, FS PA)
> (58.72.19.26, 68.99.24.4,43951,11023,6,364, FS PA)
> (58.72.19.26, 9.7.68.69,18644,20524,17,228, FS PA)
> (58.72.19.26, 73.77.82.19,25,1024,6,194, FS PA)
> (58.72.19.26, 36.65.53.83,56380,71718,6,1003, FS PA)
> (58.72.19.26, 58.72.19.26,10221,44938,6,277, FS PA)
> (58.72.19.26, 77.52.5.64,69247,11023,6,389, FS PA)
> (58.72.19.26, 93.6.87.73,38149,1024,6,138, FS PA)
> (58.72.19.26, 58.72.19.26,11558,24292,6,812, FS PA)
> (58.72.19.26, 58.72.19.26,65668,71318,6,175, FS PA)
> (58.72.19.26, 68.99.24.4,61923,1024,6,1598, FS PA)
> (58.72.19.26, 60.41.59.65,22421,65796,6,1402, FS PA)
> (58.72.19.26, 58.72.19.26,69740,21873,6,322, S A)
> (58.72.19.26, 95.70.58.21,11058,1024,6,1453, FS PA)
> (58.72.19.26, 42.10.50.36,44863,11023,6,251, FS PA)
> (58.72.19.26, 57.6.91.5,25857,1024,6,1546, FS PA)
> (58.72.19.26, 68.99.24.4,54756,11023,6,219, FS PA)
> (58.72.19.26, 36.65.53.83,73335,43857,6,9, FS PA)
> (58.72.19.26, 95.70.58.21,32204,11023,6,1635, S A)
> (58.72.19.26, 76.48.82.73,46483,1024,6,127, FS PA)
> (58.72.19.26, 81.88.14.14,55609,1024,6,507, FS PA)
> (58.72.19.26, 1.54.61.21,65763,1024,6,370, FS PA)
>
>
> But after I do:
>> grunt> C = ORDER B BY bytes DESC;
>> grunt> Dump C;
>
> I get the same error as before: java.lang.RuntimeException: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/user/matt/pigsample_19823722_1281044888160
>
>
> Which would lead me to believe my ORDER is broken. Is there a conf I need to change?
>
>
> -----Original Message-----
> From: Ashutosh Chauhan [mailto:[EMAIL PROTECTED]]
> Sent: Friday, August 06, 2010 2:43 AM
> To: Matthew Smith
> Cc: [EMAIL PROTECTED]
> Subject: Re: LIMIT Issue
>
> This is most likely because B is empty. Do:
>
> grunt> dump A; -- to verify data is getting loaded as you are expecting.
> grunt> dump B; -- to verify that B is non-empty.
>
> Ashutosh
>
> On Thu, Aug 5, 2010 at 14:54, Matthew Smith <[EMAIL PROTECTED]> wrote:
>> While running grunt I ran into another error. I see it is looking for another file, but I have never run into this problem with grunt before. This environment was freshly installed this morning before the grunt shell was executed.
>>
>> I also checked my PigServer() Java code on the new install, and it still produces a 699-line file which is ORDERed but not LIMITed.
>>
>> Thoughts?
>>
>>
>> grunt> A = LOAD '0' USING PigStorage('|') as (sIP:chararray,dIP:chararray,sPort:int, dPort:int,protocol:int, bytes:int, flags:chararray);
>> grunt> B = FILTER A BY sIP matches '61.81.46.45';
>> grunt> C = ORDER B BY bytes DESC;
>> grunt> D = LIMIT C 10;
>> grunt> DUMP D;
>>
>>
>>
>>
>> 2010-08-05 14:47:52,622 [main] INFO  org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No column pruned for A
>> 2010-08-05 14:47:52,622 [main] INFO  org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No map keys pruned for A
>> 2010-08-05 14:47:52,681 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId
>> 2010-08-05 14:47:52,819 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: Store(file:/tmp/temp1184504472/tmp-1623830760:org.apache.pig.builtin.BinStorage) - 1-54 Operator Key: 1-54)
>> 2010-08-05 14:47:52,895 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
>> 2010-08-05 14:47:52,895 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 3