

kiran chitturi 2013-03-13, 14:48
Bill Graham 2013-03-13, 15:37
Re: Usage of 'limit' with Pig for Hbase
Thank you. This cleared my doubt.
On Wed, Mar 13, 2013 at 11:37 AM, Bill Graham <[EMAIL PROTECTED]> wrote:

> The -limit passed to HBaseStorage is the limit per mapper reading from
> HBase. If you want to limit overall records, also use LIMIT:
>
> fields = LIMIT fields 5;
>
>
> On Wed, Mar 13, 2013 at 7:48 AM, kiran chitturi <[EMAIL PROTECTED]> wrote:
>
> > Hi!
> >
> > I am using Pig 0.10.0 with HBase in distributed mode to read records,
> > and I ran the command below.
> >
> > fields = load 'hbase://documents' using
> > org.apache.pig.backend.hadoop.hbase.HBaseStorage('field:fields_j', '-loadKey true -limit 5')
> > as (rowkey, fields:map[]);
> >
> > I want Pig to limit the output to only 5 records, but the result is quite
> > different. Please see the logs below.
> >
> > Input(s):
> > Successfully read 250 records (16520 bytes) from: "hbase://documents"
> >
> > Output(s):
> > Successfully stored 250 records (19051 bytes) in:
> > "hdfs://LucidN1:50001/tmp/temp1510040776/tmp1443083789"
> >
> > Counters:
> > > Total records written : 250
> > > Total bytes written : 19051
> > > Spillable Memory Manager spill count : 0
> > > Total bags proactively spilled: 0
> > > Total records proactively spilled: 0
> > > Job DAG:
> > > job_201303121846_0056
> > >
> > > 2013-03-13 14:43:10,186 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 250 time(s).
> > > 2013-03-13 14:43:10,186 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> > > 2013-03-13 14:43:10,210 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 51
> > > 2013-03-13 14:43:10,211 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 51
> >
> >
> > Am I using the 'limit' option the wrong way?
> >
> > Please let me know your suggestions.
> >
> > Thanks,
> > --
> > Kiran Chitturi
> >
> > <http://www.linkedin.com/in/kiranchitturi>
> >
>
>
>
> --
> *Note that I'm no longer using my Yahoo! email address. Please email me at
> [EMAIL PROTECTED] going forward.*
>

--
Kiran Chitturi

<http://www.linkedin.com/in/kiranchitturi>
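
For readers landing on this thread, here is a minimal sketch that combines the original load statement with the LIMIT operator suggested in the reply above. The table name, column, and options are taken from the question; the relation name first_five and the final DUMP are only illustrative and do not appear in the thread.

```pig
-- '-limit 5' caps the rows returned by EACH mapper that scans HBase,
-- so on its own it does not bound the total number of records loaded.
fields = LOAD 'hbase://documents' USING
    org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'field:fields_j', '-loadKey true -limit 5')
    AS (rowkey, fields:map[]);

-- LIMIT caps the relation as a whole, giving at most 5 records overall.
-- (first_five is a hypothetical relation name chosen for this sketch.)
first_five = LIMIT fields 5;

DUMP first_five;
```

Keeping -limit alongside LIMIT follows the "also use LIMIT" advice in the reply. The 250 records in the log would be consistent with, for example, 50 mappers each returning 5 rows, although the thread does not state the mapper count.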
Dmitriy Ryaboy 2013-03-15, 01:50
kiran chitturi 2013-03-15, 03:16