Pig >> mail # user >> How to use TOP?


How to use TOP?
Hello list,

  I have an HDFS file with 6 columns that contain some data stored
in an HBase table. The data looks like this:

 18.98  2000     1.21  193.46  2.64  58.17
 52.49  2000.5   4.32  947.11  2.74  64.45
115.24  2001    16.8   878.58  2.66  94.49
 55.55  2001.5  33.03  656.56  2.82  60.76
156.14  2002    35.52   83.75  2.6   59.57
138.77  2002.5  21.51  105.76  2.62  85.89
 71.89  2003    27.79  709.01  2.63  85.44
 59.84  2003.5  32.1   444.82  2.72  70.8
103.18  2004     4.09  413.15  2.8   54.37

Now I have to take each record along with its next 4 records and do
some processing (for example, in the first pass I have to take records
1-5, in the next pass records 2-6, and so on). I am trying to
use TOP for this, but I am getting the following error -

2012-05-21 17:04:30,328 [main] ERROR org.apache.pig.tools.grunt.Grunt
- ERROR 1200: Pig script failed to parse:
<line 6, column 37> Invalid scalar projection: parameters : A column
needs to be projected from a relation for it to be used as a scalar
Details at logfile: /home/mohammad/pig-0.9.2/logs/pig_1337599211281.log

I am using the following commands -

grunt> a = load 'hbase://logdata'
>> using org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>> 'cf:DGR cf:HD cf:POR cf:RES cf:RHOB cf:SON', '-loadKey true')
>> as (id, DGR, HD, POR, RES, RHOB, SON);
grunt> b = foreach a { c = TOP(5,3,a);
>> generate flatten(c);
>> }
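
To make the grouping concrete, here is a rough sketch in plain Python (not Pig) of the windows described above: each record together with the 4 records that follow it, in file order. The function name `sliding_windows` and the `sample` list are made up for illustration, and only the second column of the sample rows is used to keep it short. It shows the intended output only and says nothing about how TOP behaves.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sliding_windows(rows: Sequence[T], size: int = 5) -> List[List[T]]:
    """Return every run of `size` consecutive rows, in file order."""
    if size <= 0:
        raise ValueError("size must be positive")
    # The last full window starts at index len(rows) - size;
    # fewer than `size` rows yields no windows at all.
    return [list(rows[i:i + size]) for i in range(len(rows) - size + 1)]


# Second column (2000, 2000.5, ...) of the nine sample rows shown earlier.
sample = [2000, 2000.5, 2001, 2001.5, 2002, 2002.5, 2003, 2003.5, 2004]

windows = sliding_windows(sample)
for w in windows:
    print(w)
```

With the nine sample rows this gives five windows: records 1-5, 2-6, 3-7, 4-8 and 5-9.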

Could anyone tell me how to achieve that? Many thanks.

Regards,
    Mohammad Tariq