Re: Pig/Avro Question
Check the code in PigAvroInputFormat; it overrides 'listStatus' from
FileInputFormat so that files not ending in .avro are filtered out.
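
To make that concrete, here is a minimal plain-Java sketch of that kind of suffix filter. It is not the actual PigAvroInputFormat code, which works on Hadoop file listings; the class AvroSuffixFilter, the method keepAvro, and the sample paths are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not the piggybank source: it only mimics the
// "drop anything that does not end in .avro" step on plain path strings.
class AvroSuffixFilter {
    static final String AVRO_SUFFIX = ".avro";

    // Returns the input paths that end in ".avro", preserving order.
    static List<String> keepAvro(List<String> paths) {
        List<String> kept = new ArrayList<>();
        for (String p : paths) {
            if (p.endsWith(AVRO_SUFFIX)) {
                kept.add(p);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample paths are invented for this example.
        List<String> listed = List.of(
            "/user/xyz/file1.avro",
            "/user/xyz/part-00000",
            "/user/xyz/file2.avro");
        // Prints [/user/xyz/file1.avro, /user/xyz/file2.avro]
        System.out.println(keepAvro(listed));
    }
}
```

With a filter like this in the listing step, a file without the .avro suffix never reaches the reader, which would match what Russell is seeing.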

stan

On Fri, Feb 3, 2012 at 1:58 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
> btw - the weird thing is... I've read the code.  There isn't a filter for
> .avro in there.  Does Hadoop, or Avro itself (not that I can see it is
> involved) do so?
>
> On Fri, Feb 3, 2012 at 10:55 AM, Russell Jurney <[EMAIL PROTECTED]> wrote:
>
>> Hmmm I applied it, but I still can't open files that don't end in .avro
>>
>> On Fri, Feb 3, 2012 at 2:23 AM, Philipp <[EMAIL PROTECTED]> wrote:
>>
>>> This patch fixes this issue:
>>>
>>> https://issues.apache.org/jira/browse/PIG-2492
>>>
>>>
>>>
>>> On 02/03/2012 07:22 AM, Russell Jurney wrote:
>>>
>>>> I have the same bug. I read the code... there is no obvious fix.  Arg.
>>>>
>>>> On Feb 2, 2012, at 10:07 PM, Something Something <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> In my Pig script I have something like this...
>>>>>
>>>>> %default MY_SCHEMA '/user/xyz/my-schema.json';
>>>>>
>>>>> %default MY_AVRO 'org.apache.pig.piggybank.storage.avro.AvroStorage(\'$MY_SCHEMA\')';
>>>>>
>>>>> my_files = LOAD '$MY_FILES' USING $MY_AVRO;
>>>>>
>>>>>
>>>>>
>>>>> What I have noticed is that when MY_FILES contains only one file, it
>>>>> works fine.
>>>>>
>>>>> %default MY_FILES '/user/xyz/file1.avro'
>>>>>
>>>>>
>>>>> But when I use a comma separated list it doesn't work. e.g.
>>>>>
>>>>> %default MY_FILES '/user/xyz/file1.avro, /user/xyz/file2.avro'
>>>>>
>>>>> Basically, I get a message saying something like 'Schema cannot be
>>>>> found'.
>>>>>
>>>>> Is there a way to make it work with multiple files?  Please let me
>>>>> know.  Thanks.
>>>>>
>>>>>
>>>
>>
>>
>> --
>> Russell Jurney
>> twitter.com/rjurney
>> [EMAIL PROTECTED]
>> datasyndrome.com
>>