Pig >> mail # user >> Changing the schema before Storing

yaboulna@... 2012-12-11, 03:31
Bill Graham 2012-12-11, 07:27
Re: Changing the schema before Storing
Hi Bill,

Thanks for your reply. Since that is the case, the JavaDocs of the
class need to be fixed (see

Also, I hit a bug that I worked around with explicit casting. For some
reason, all the objects passed to putNext are of type DataByteArray,
even though the schema reports their correct types (tuple(string, int, int),
long). This causes a lot of ClassCastExceptions, because a DataByteArray
cannot be cast to any other type. I worked around this by passing
everything to the STORE as a DataByteArray.


Quoting Bill Graham <[EMAIL PROTECTED]>:

> The STORE command doesn't take the AS clause, that's to define the schema
> at LOAD time. When storing, just prepare your relation with the desired
> schema and then STORE it without the AS.
> You can do all the transformations you need to before the STORE and Pig
> will combine them all into as few logical processing steps as possible, so
> no need to worry about specifying many transformation statements.
> On Mon, Dec 10, 2012 at 7:31 PM, <[EMAIL PROTECTED]> wrote:
>> Hello,
>> I'm using HBaseStorage and I want to change the layout of the schema
>> before storage. Specifically I want to group some values into a tuple (thus
>> reducing the number of repetitions of the row and column keys).
>> Even though the JavaDoc gives an example that uses AS schema Grunt
>> complains that it is not parsable. Here's what I am trying:
>> STORE dataToStore INTO 'hbase://tableName' USING HBaseStorage('cf:tuple,
>> cf:date') AS TOTUPLE(val1, val2, val3), date;
>> Is this possible? Or do I have to do the transformation in a separate step:
>> dataTransformed = FOREACH dataToStore GENERATE TOTUPLE(val1, val2, val3),
>> date;
>> In case of the latter, can Pig be told to merge this step with the next
>> one? I tried a nested FOREACH where I can have an assignment operation, but
>> I quickly found out that STORE is not supported within the FOREACH.. what
>> was I thinking :).
>> Thanks!
>> -- Younos
> --
> *Note that I'm no longer using my Yahoo! email address. Please email me at
> [EMAIL PROTECTED] going forward.*
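
For reference, the approach described in the quoted reply (shape the relation first, then STORE it with no AS clause) might look roughly like the sketch below. It reuses the names from the original question; the field rowKey and the alias vals are hypothetical, and the sketch assumes that HBaseStorage takes the first field of the relation as the row key and maps the remaining fields to the listed columns in order.

```pig
-- Sketch only: rowKey and the alias vals are hypothetical names.
-- Assumes dataToStore has the fields rowKey, val1, val2, val3 and date,
-- and that HBaseStorage uses the first field as the HBase row key.
dataTransformed = FOREACH dataToStore GENERATE
    rowKey,
    TOTUPLE(val1, val2, val3) AS vals,
    date;

-- No AS clause on STORE: the schema comes from the relation itself.
STORE dataTransformed INTO 'hbase://tableName'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:tuple, cf:date');
```

Keeping the FOREACH as its own statement should not add a processing step, since, as the reply notes, Pig combines such transformations where it can.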

Best regards,
Younos Aboulnaga

Masters candidate
David Cheriton school of computer science
University of Waterloo

Mobile: +1 (519) 497-5669
Bill Graham 2012-12-12, 06:37
yaboulna@... 2012-12-12, 16:04
Bill Graham 2012-12-13, 07:07