Re: Writing to HBase table from Pig script
Store is an operator, not an expression, so it can't be assigned to a
relation. Instead of this:

copy = store results into 'hbase://results' using
org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:res1, cf:res2');

try this:

store results into 'hbase://results' using
org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:res1, cf:res2');
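
For context, a minimal end-to-end sketch of the same pattern (a hedged
illustration, not the poster's exact script; the (double) casts assume the
cell values were written as UTF-8 strings, whereas the poster used a UDF):

raw = load 'hbase://temp' using
    org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:c1, cf:c2', '-loadKey true')
    as (key:chararray, c1:bytearray, c2:bytearray);
-- bytearray fields can be cast directly; this assumes UTF-8 string cells
results = foreach raw generate key, (double)c1 as res1, (double)c2 as res2;
-- store stands alone: nothing on the left-hand side
store results into 'hbase://results' using
    org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:res1, cf:res2');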

On Mon, Mar 11, 2013 at 5:09 AM, Byte Array <[EMAIL PROTECTED]> wrote:

> I use HBase 0.94.4
>
>
>
> On 03/11/2013 01:04 PM, yonghu wrote:
>
>> What HBase version do you use?
>>
>> On Mon, Mar 11, 2013 at 12:29 PM, Byte Array <[EMAIL PROTECTED]>
>> wrote:
>>
>>> Hello!
>>>
>>> I successfully read from an HBase table using:
>>>
>>> table = load 'hbase://temp' using
>>> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:c1, cf:c2',
>>> '-loadKey true') as (key:chararray, c1:bytearray, c2:bytearray);
>>>
>>> I used a UDF to parse the column data and convert it from bytearrays
>>> into doubles.
>>>
>>> I do some processing and manage to dump the results:
>>> dump results;
>>>
>>> which prints:
>>> ((product1-20131231-20100101,1.5,1.5))
>>> ((product2-20131231-20100101,2.5,2.5))
>>>
>>> However, I cannot write these results into a newly created empty HBase
>>> table:
>>> copy = store results into 'hbase://results' using
>>> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:res1, cf:res2');
>>>
>>> I have also tried .. store results into 'results' using .., but it
>>> doesn't help.
>>> I am using Pig 0.11.0.
>>>
>>> I suspect I should do some sort of casting into bytearrays using a UDF,
>>> like I did when reading the table (a sketch addressing this follows the
>>> stack trace below).
>>>
>>> This is the exception I get:
>>> java.io.IOException: java.lang.IllegalArgumentException: No columns to insert
>>>      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:470)
>>>      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:433)
>>>      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:413)
>>>      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:257)
>>>      at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
>>>      at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:650)
>>>      at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
>>>      at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
>>> Caused by: java.lang.IllegalArgumentException: No columns to insert
>>>      at org.apache.hadoop.hbase.client.HTable.validatePut(HTable.java:970)
>>>      at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:763)
>>>      at org.apache.hadoop.hbase.client.HTable.put(HTable.java:749)
>>>      at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:123)
>>>      at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:84)
>>>      at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:885)
>>>      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
>>>      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
>>>      at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:588)
>>>      at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>>>      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.
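
A note on the trace, as a hedged diagnosis: "No columns to insert" is thrown
by HTable.validatePut when a Put carries a row key but no cells. On store,
HBaseStorage takes the first field of each tuple as the row key and maps the
remaining fields to the listed columns, so the error suggests each tuple
reaching the store had only one field. The doubled parentheses in the dump
output, ((product1-20131231-20100101,1.5,1.5)), point the same way: each
tuple holds a single nested tuple rather than three top-level fields. If so,
flattening before the store may be enough; a sketch (relation name assumed):

results_flat = foreach results generate flatten($0);
-- first field becomes the row key; the rest map to cf:res1, cf:res2
store results_flat into 'hbase://results' using
    org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:res1, cf:res2');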