Pig, mail # user - Load Pig metadata from file?


Re: Load Pig metadata from file?
Thejas Nair 2012-05-16, 20:25
You can also use 'pig -dryrun ..' to see what the Pig query looks like
after parameter substitution.
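
The truncation discussed in the quoted messages below comes from the shell's word splitting, which can be reproduced without Pig. The following is only a rough sketch: the count_args helper and the temporary file standing in for metadata.dat are made up for illustration, and only the quoting pattern is taken from the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Stand-in for metadata.dat: a shortened schema containing spaces and a newline.
meta=$(mktemp)
printf '(exchange:chararray, symbol:chararray,\n    date:chararray)\n' > "$meta"

# Unquoted: the shell splits the substitution on whitespace, so the schema
# arrives as several arguments and md= only carries the first piece.
unquoted=$(count_args md=$(cat "$meta"))

# Quoted: the whole schema stays inside a single argument.
quoted=$(count_args md="$(cat "$meta")")

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
rm -f "$meta"
```

Applied to the command in this thread, the quoted form would be pig -f script.pig -param md="$(cat metadata.dat)", and adding -dryrun to that invocation shows the substituted script without executing it.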

Thanks,
Thejas
On 5/15/12 4:56 PM, Saurabh S wrote:
>
> Aniket: You were spot on. This method doesn't allow any spaces in the file, because the parameter gets truncated at the first whitespace character. I found that out using the 'bash -x' method that you suggested. Thanks a lot for that!
>
> Shan: I'm just beginning to use Pig and don't know a lot about macros. I'll look into them, however.
>
> Regards,
> Saurabh
>
>> Date: Tue, 15 May 2012 15:58:53 -0700
>> Subject: Re: Load Pig metadata from file?
>> From: [EMAIL PROTECTED]
>> To: [EMAIL PROTECTED]
>>
>> I think you need to play with some quotes; it's more likely a bash problem.
>>
>> One way to debug is to run bash -x pig -f script.pig -param md=$(cat
>> metadata.dat) and check what hadoop jar gets in the end.
>>
>> Try -param md="$(cat metadata.dat)"
>> or -param md="'$(cat metadata.dat)'" (single quotes inside double quotes),
>> and so on.
>>
>> Thanks,
>> Aniket
>>
>> On Tue, May 15, 2012 at 3:34 PM, Saurabh S<[EMAIL PROTECTED]>  wrote:
>>
>>>
>>> Here is a sample LOAD statement from the Programming Pig book:
>>>
>>> daily = load 'NYSE_daily' as (exchange:chararray, symbol:chararray,
>>>             date:chararray, open:float, high:float, low:float, close:float,
>>>             volume:int, adj_close:float);
>>>
>>> In my case, there are around 250 columns to load. So, I created a file,
>>> say, metadata.dat with its contents as follows:
>>>
>>>   (exchange:chararray, symbol:chararray,
>>>
>>>             date:chararray, open:float, high:float, low:float, close:float,
>>>
>>>             volume:int, adj_close:float)
>>>
>>> My load statement now looks like
>>>
>>> daily = load 'NYSE_daily' as $md;
>>>
>>> and the invocation looks like this:
>>>
>>> pig -f script.pig -param md=$(cat metadata.dat)
>>>
>>> However, I get the following error with this method:
>>>
>>> ERROR 1000: Error during parsing. Lexical error at line 9, column 0.
>>>   Encountered:<EOF>  after : ""
>>>
>>> Copying the contents of the file into the appropriate place works fine, but
>>> then the Pig script is cluttered with the metadata, and I would like to
>>> separate it from the script. Any ideas?
>>>
>>> HCatLoader() does not seem to be available on my system.
>>>
>> --
>> "...:::Aniket:::... Quetzalco@tl"
>