Pig >> mail # user >> LIMIT Issue


To narrow down the problem, can you try your query in grunt? If it
works there, the problem has something to do with PigServer; otherwise
it's related to Pig core itself.
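
For the comparison to be meaningful, both runs should execute exactly the same statements. Below is a minimal sketch of one way to do that: build the statement text once in Java, then either print it for pasting into grunt or feed it to PigServer. The class name `LimitScriptSketch` and the method `buildStatements` are hypothetical helpers, not part of Pig, and the sketch assumes the alias `data` is already defined by a LOAD statement that was not shown in the original message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Pig): builds the Pig Latin statements once
// so the identical text can be run in grunt and through PigServer.
final class LimitScriptSketch {

    // Assumes the alias `data` was defined earlier by a LOAD statement,
    // which the original snippet does not show.
    static List<String> buildStatements(String sourceIpPattern, int limit) {
        List<String> statements = new ArrayList<>();
        statements.add("flow_firstcut = FOREACH data GENERATE sIP, dIP, sPort, dPort, protocol, bytes, flags;");
        statements.add("filtered = FILTER flow_firstcut BY sIP matches '" + sourceIpPattern + "';");
        statements.add("O = ORDER filtered BY bytes DESC;");
        statements.add("topTen = LIMIT O " + limit + ";");
        return statements;
    }

    public static void main(String[] args) {
        // 'someIP' is the placeholder pattern from the original message.
        for (String statement : buildStatements("someIP", 10)) {
            System.out.println(statement);
        }
    }
}
```

In grunt, the printed lines can be pasted after your own LOAD statement, followed by `DUMP topTen;`. In Java, the same list can be passed line by line to `myServer.registerQuery(...)`, so any difference in the result points at PigServer rather than at the script text.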

Ashutosh
On Thu, Aug 5, 2010 at 10:57, Matthew Smith <[EMAIL PROTECTED]> wrote:
> No, I have not tried it in grunt. I want to use PigServer because of the parameter passing that is doable through Java. I am using Pig 0.7.0.
>
> -----Original Message-----
> From: Ashutosh Chauhan [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, August 05, 2010 12:54 PM
> To: [EMAIL PROTECTED]
> Subject: Re: LIMIT Issue
>
> Matt,
>
> Which version are you on? What happens if you run your query through
> grunt instead of PigServer?
> I tried a load-order-limit sequence on a small dataset in grunt and
> got the expected results.
>
> Ashutosh
> On Wed, Aug 4, 2010 at 15:07, Matthew Smith <[EMAIL PROTECTED]> wrote:
>> Hey,
>>
>>
>>
>> While running in Java a LIMIT statement is not getting executed.
>>
>>
>>
>> /code
>>
>>     myServer.registerQuery("flow_firstcut = FOREACH data GENERATE sIP, dIP, sPort, dPort, protocol, bytes, flags;");
>>     myServer.registerQuery("filtered = FILTER flow_firstcut BY sIP matches 'someIP';");
>>     myServer.registerQuery("O = ORDER filtered BY bytes DESC;");
>>     myServer.registerQuery("topTen = LIMIT O 10;");
>>     myServer.store("topTen", outputFilePath);
>>
>> /code
>>
>>
>>
>> This produces a 699 line file. It should produce a 10 line file.
>>
>>
>>
>> /code
>>
>>     myServer.registerQuery("flow_firstcut = FOREACH data GENERATE sIP, dIP, sPort, dPort, protocol, bytes, flags;");
>>     myServer.registerQuery("filtered = FILTER flow_firstcut BY sIP matches '"+parameters[1]+"';");
>>     //myServer.registerQuery("O = ORDER filtered BY bytes DESC;");
>>     myServer.registerQuery("topTen = LIMIT filtered 10;");
>>     myServer.store("topTen", outputFilePath);
>>
>> /code
>>
>>
>>
>> This produces a 10 line file.
>>
>>
>>
>> Is there a known bug I am unaware of or can you not order then limit?
>>
>> http://hadoop.apache.org/pig/docs/r0.7.0/piglatin_ref2.html#LIMIT
>>
>> indicates that this is a valid sequence of calls.
>>
>>
>>
>> Help?
>>
>>
>>
>> Matt
>>
>>
>