Pig >> mail # user >> How to use TOP?


Re: How to use TOP?
Hi Ruslan,

    Thanks for the response. I think I made a mistake. Actually, I
just want the top 5 records each time. I don't have any sorting
requirements.

Regards,
    Mohammad Tariq
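
Note: if no ordering is needed, Pig's LIMIT operator may be a simpler fit than TOP. The following is a minimal sketch, assuming the relation `a` is loaded as in the original question quoted below; the alias `first5` is a hypothetical name used only for illustration. It does not implement the sliding 1-5, 2-6 windows described in the question.

```pig
-- Sketch only: keep 5 records from 'a' with no sort involved.
-- 'first5' is an illustrative alias, not from the original thread.
first5 = LIMIT a 5;
DUMP first5;
```

Without a preceding ORDER BY, LIMIT does not guarantee which 5 records are returned.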
On Mon, May 21, 2012 at 9:31 PM, Ruslan Al-fakikh
<[EMAIL PROTECTED]> wrote:
> Hey Mohammad,
>
> Here
> c = TOP(5,3,a);
> you say: take 5 records out of a that have the biggest values in the third
> column. Do you really need that sorting by the third column?
>
> -----Original Message-----
> From: Mohammad Tariq [mailto:[EMAIL PROTECTED]]
> Sent: Monday, May 21, 2012 3:54 PM
> To: [EMAIL PROTECTED]
> Subject: How to use TOP?
>
> Hello list,
>
>  I have an HDFS file that has 6 columns containing data stored in an
> HBase table. The data looks like this -
>
> 18.98   2000    1.21   193.46  2.64  58.17
> 52.49   2000.5  4.32   947.11  2.74  64.45
> 115.24  2001    16.8   878.58  2.66  94.49
> 55.55   2001.5  33.03  656.56  2.82  60.76
> 156.14  2002    35.52  83.75   2.6   59.57
> 138.77  2002.5  21.51  105.76  2.62  85.89
> 71.89   2003    27.79  709.01  2.63  85.44
> 59.84   2003.5  32.1   444.82  2.72  70.8
> 103.18  2004    4.09   413.15  2.8   54.37
>
> Now I have to take each record along with its next 4 records and do some
> processing (for example, in the first shot I have to take records 1-5, in the
> next shot I have to take 2-6, and so on). I am trying to use TOP for this,
> but I am getting the following error -
>
> 2012-05-21 17:04:30,328 [main] ERROR org.apache.pig.tools.grunt.Grunt
> - ERROR 1200: Pig script failed to parse:
> <line 6, column 37> Invalid scalar projection: parameters : A column needs
> to be projected from a relation for it to be used as a scalar Details at
> logfile: /home/mohammad/pig-0.9.2/logs/pig_1337599211281.log
>
> I am using the following commands -
>
> grunt> a = load 'hbase://logdata'
>>> using org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>>> 'cf:DGR cf:HD cf:POR cf:RES cf:RHOB cf:SON', '-loadKey true') as (id,
>>> DGR, HD, POR, RES, RHOB, SON);
> grunt> b = foreach a { c = TOP(5,3,a);
>>> generate flatten(c);
>>> }
>
> Could anyone tell me how to achieve that? Many thanks.
>
> Regards,
>     Mohammad Tariq
>
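
Note on the parse error above: TOP takes a bag as its third argument, but inside `foreach a { ... }` the name `a` refers to the relation itself, which Pig tries to treat as a scalar. One common way to get a bag is to group first. The following is a minimal sketch, assuming `a` is loaded exactly as in the question; the aliases `g` and `top5` are hypothetical names. It shows how TOP can be called, not the sliding 1-5, 2-6 windows the question asks for.

```pig
-- Sketch only: collect all records of 'a' into a single bag, then apply TOP.
-- 'g' and 'top5' are illustrative aliases, not from the original thread.
g = GROUP a ALL;

-- TOP(n, column, bag): the column index is zero-based, so with the row key
-- loaded as field 0, index 3 is POR.
top5 = FOREACH g GENERATE FLATTEN(TOP(5, 3, a));
DUMP top5;
```

Because the fields in the question are loaded without declared types, the comparison may not be numeric; declaring types in the AS clause (for example `POR:double`) is an assumption that may be needed for the expected ordering.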