Pig, mail # user - How to use TOP?


Re: How to use TOP?
Mohammad Tariq 2012-05-21, 17:33
Hi Ruslan,

    Thanks for the response. I think I have made a mistake. Actually, I
just want the top 5 records each time. I don't have any sorting
requirements.

Regards,
    Mohammad Tariq
On Mon, May 21, 2012 at 9:31 PM, Ruslan Al-fakikh
<[EMAIL PROTECTED]> wrote:
> Hey Mohammad,
>
> Here
> c = TOP(5,3,a);
> you say: take 5 records out of a that have the biggest values in the third
> column. Do you really need that sorting by the third column?
>
> -----Original Message-----
> From: Mohammad Tariq [mailto:[EMAIL PROTECTED]]
> Sent: Monday, May 21, 2012 3:54 PM
> To: [EMAIL PROTECTED]
> Subject: How to use TOP?
>
> Hello list,
>
>  I have an HDFS file that has 6 columns that contain some data stored in an
> HBase table. The data looks like this -
>
> 18.98    2000     1.21    193.46   2.64   58.17
> 52.49    2000.5   4.32    947.11   2.74   64.45
> 115.24   2001     16.8    878.58   2.66   94.49
> 55.55    2001.5   33.03   656.56   2.82   60.76
> 156.14   2002     35.52   83.75    2.6    59.57
> 138.77   2002.5   21.51   105.76   2.62   85.89
> 71.89    2003     27.79   709.01   2.63   85.44
> 59.84    2003.5   32.1    444.82   2.72   70.8
> 103.18   2004     4.09    413.15   2.8    54.37
>
> Now I have to take each record along with its next 4 records and do some
> processing (for example, in the first pass I have to take records 1-5, in the
> next pass records 2-6, and so on). I am trying to use TOP for this,
> but I am getting the following error -
>
> 2012-05-21 17:04:30,328 [main] ERROR org.apache.pig.tools.grunt.Grunt
> - ERROR 1200: Pig script failed to parse:
> <line 6, column 37> Invalid scalar projection: parameters : A column needs
> to be projected from a relation for it to be used as a scalar Details at
> logfile: /home/mohammad/pig-0.9.2/logs/pig_1337599211281.log
>
> I am using the following commands -
>
> grunt> a = load 'hbase://logdata'
> >> using org.apache.pig.backend.hadoop.hbase.HBaseStorage(
> >> 'cf:DGR cf:HD cf:POR cf:RES cf:RHOB cf:SON', '-loadKey true') as (id,
> >> DGR, HD, POR, RES, RHOB, SON);
> grunt> b = foreach a { c = TOP(5,3,a);
> >> generate flatten(c);
> >> }
>
> Could anyone tell me how to achieve that? Many thanks.
>
> Regards,
>     Mohammad Tariq
>
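
The requirement described in the original question (records 1-5, then
records 2-6, and so on) is a sliding window over the rows in their stored
order, whereas TOP, as Ruslan explains, picks the rows with the largest
values in one column. To make the difference concrete, here is a minimal
sketch of the windowing itself in plain Python, outside Pig. It is only an
illustration under assumptions: the names sliding_windows and ROWS are
hypothetical, the rows are the sample data from the message above, and
nothing here is Pig or HBase API.

```python
from typing import Iterator, List, Sequence, Tuple

Record = Tuple[float, ...]


def sliding_windows(records: Sequence[Record], size: int = 5) -> Iterator[List[Record]]:
    """Yield each record together with the (size - 1) records that follow it.

    Only full windows are produced, so the last window starts at
    index len(records) - size. Row order is preserved; nothing is sorted.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(len(records) - size + 1):
        yield list(records[start:start + size])


# The nine sample rows from the original message (six columns each).
ROWS: List[Record] = [
    (18.98, 2000, 1.21, 193.46, 2.64, 58.17),
    (52.49, 2000.5, 4.32, 947.11, 2.74, 64.45),
    (115.24, 2001, 16.8, 878.58, 2.66, 94.49),
    (55.55, 2001.5, 33.03, 656.56, 2.82, 60.76),
    (156.14, 2002, 35.52, 83.75, 2.6, 59.57),
    (138.77, 2002.5, 21.51, 105.76, 2.62, 85.89),
    (71.89, 2003, 27.79, 709.01, 2.63, 85.44),
    (59.84, 2003.5, 32.1, 444.82, 2.72, 70.8),
    (103.18, 2004, 4.09, 413.15, 2.8, 54.37),
]

# Materialize the windows: the first holds records 1-5, the second 2-6, ...
windows = list(sliding_windows(ROWS, size=5))
```

With nine rows and a window of five, this yields five windows; the first
holds records 1-5 and the last holds records 5-9. Whether the same logic is
best expressed inside Pig (for example through a user-defined function) or
in a separate step depends on the data volume and is not settled in this
thread.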