Pig, mail # user - Usage of 'limit' with Pig for Hbase


Usage of 'limit' with Pig for Hbase
kiran chitturi 2013-03-13, 14:48
Hi!

I am using Pig 0.10.0 with HBase in distributed mode to read records,
using the command below.

fields = load 'hbase://documents' using
    org.apache.pig.backend.hadoop.hbase.HBaseStorage('field:fields_j', '-loadKey true -limit 5')
    as (rowkey, fields:map[]);

I want Pig to limit the result to only 5 records, but it reads far more
than that. Please see the logs below.

Input(s):
Successfully read 250 records (16520 bytes) from: "hbase://documents"

Output(s):
Successfully stored 250 records (19051 bytes) in:
"hdfs://LucidN1:50001/tmp/temp1510040776/tmp1443083789"

Counters:
Total records written : 250
Total bytes written : 19051
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_201303121846_0056

2013-03-13 14:43:10,186 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 250 time(s).
2013-03-13 14:43:10,186 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2013-03-13 14:43:10,210 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 51
2013-03-13 14:43:10,211 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 51

Am I using the '-limit' option the wrong way?

Please let me know your suggestions.

Thanks,
--
Kiran Chitturi

<http://www.linkedin.com/in/kiranchitturi>
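
One possible reading of the logs above, offered as an assumption rather than a confirmed diagnosis: HBaseStorage documents its -limit option as the maximum number of rows to retrieve per region, not as a cap on the whole relation. If the 'documents' table had 50 regions (the region count is not shown in the logs, so this is a guess), 5 rows per region would account for the 250 records reported. A minimal sketch of capping the overall result with Pig's LIMIT operator after the load; the alias first_five is made up for illustration:

```pig
-- Load from HBase; '-limit 5' is assumed here to cap rows per region, not in total.
fields = LOAD 'hbase://documents' USING
    org.apache.pig.backend.hadoop.hbase.HBaseStorage('field:fields_j', '-loadKey true -limit 5')
    AS (rowkey, fields:map[]);

-- Apply a global cap on the relation: at most 5 records overall.
first_five = LIMIT fields 5;

DUMP first_five;
```

Keeping '-limit 5' in the loader should still reduce how many rows each region scan returns, while the LIMIT statement trims the final relation to 5 records.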
Replies in this thread:
Bill Graham 2013-03-13, 15:37
kiran chitturi 2013-03-13, 16:17
Dmitriy Ryaboy 2013-03-15, 01:50
kiran chitturi 2013-03-15, 03:16