Pig, mail # user - Multiple files with AvroStorage and comma separated lists


Re: Multiple files with AvroStorage and comma separated lists
Stan Rosenberg 2012-01-24, 16:31
Great, I'll submit a patch later in the day.

Best,

stan

On Tue, Jan 24, 2012 at 11:29 AM, Bill Graham <[EMAIL PROTECTED]> wrote:
> Oops, I meant to address Stan in my last email. :)
>
> ---------- Forwarded message ----------
> From: Bill Graham <[EMAIL PROTECTED]>
> Date: Tue, Jan 24, 2012 at 8:28 AM
> Subject: Re: Multiple files with AvroStorage and comma separated lists
> To: [EMAIL PROTECTED]
>
>
> Hi Philipp,
>
> This is in fact a bug, so if you wouldn't mind submitting the patch, that
> would be great.
>
> thanks,
> Bill
>
>
> On Tue, Jan 24, 2012 at 8:22 AM, Stan Rosenberg <[EMAIL PROTECTED]> wrote:
>
>> Philipp,
>>
>> I would say that it is a bug.  I ran into the same problem some time
>> ago.  Essentially, AvroStorage recognizes neither globs nor commas in
>> input paths, both of which are supported by Hadoop's FileInputFormat.
>> I ended up patching AvroStorage to make it compatible with Hadoop's
>> input-path semantics, but I haven't submitted the patch.
>> If there is some interest, I'd be more than glad to submit it.
>>
>> Best,
>>
>> stan
>>
>>
>> On Tue, Jan 24, 2012 at 4:26 AM, Philipp <[EMAIL PROTECTED]> wrote:
>> > Dear Pig users,
>> >
>> > I tried to load several files with AvroStorage by using a comma separated
>> > list. The statement I used is:
>> >
>> > test_data = LOAD 'repo_1/part-r-00000.avro,repo_2/part-r-00000.avro' USING
>> > org.apache.pig.piggybank.storage.avro.AvroStorage();
>> >
>> > Pig states that no input paths were specified in job. Please see the
>> > stacktrace below.
>> > I tried Pig versions 0.8.1-cdh3u2 and 0.9.1.
>> >
>> > Does anyone observe the same behavior? Is it a bug or a feature?
>> >
>> > Thanks, Philipp
>> >
>> >
>> >
>> >
>> >
>> > /Stacktrace:/
>> >
>> > org.apache.pig.backend.executionengine.ExecException: ERROR 2118: No input paths specified in job
>> >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:282)
>> >    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
>> >    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
>> >    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
>> >    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
>> >    at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>> >    at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
>> >    at java.lang.Thread.run(Thread.java:679)
>> > Caused by: java.io.IOException: No input paths specified in job
>> >    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:186)
>> >    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
>> >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:270)
>> >    ... 7 more
>> >
>>
>
>
>
> --
> *Note that I'm no longer using my Yahoo! email address. Please email me at
> [EMAIL PROTECTED] going forward.*
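
For readers following the thread: the input-path semantics Stan describes (a comma-separated list of paths, where each entry may itself be a glob) can be illustrated with a small standalone sketch. This is not the patch discussed above and it is not Pig or Hadoop code. The class name `InputPathSplitter` and its `split` method are hypothetical, and trimming whitespace and dropping empty entries are assumptions made for the illustration. The sketch only shows the splitting step: commas separate paths unless they sit inside a `{...}` glob group, which should stay attached to its path.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Pig or Hadoop. */
final class InputPathSplitter {

    private InputPathSplitter() {}

    /**
     * Splits a location string on commas, except for commas nested inside
     * a {...} glob group, so "a,b{1,2}" yields ["a", "b{1,2}"].
     * Entries are trimmed and empty entries are dropped (an assumption
     * of this sketch).
     */
    static List<String> split(String location) {
        List<String> paths = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int braceDepth = 0;
        for (int i = 0; i < location.length(); i++) {
            char c = location.charAt(i);
            if (c == '{') {
                braceDepth++;
            } else if (c == '}' && braceDepth > 0) {
                braceDepth--;
            }
            if (c == ',' && braceDepth == 0) {
                // Top-level comma: the current path is complete.
                addIfNotEmpty(paths, current);
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        addIfNotEmpty(paths, current);
        return paths;
    }

    private static void addIfNotEmpty(List<String> paths, StringBuilder sb) {
        String path = sb.toString().trim();
        if (!path.isEmpty()) {
            paths.add(path);
        }
    }

    public static void main(String[] args) {
        // The location string from Philipp's LOAD statement.
        System.out.println(
            split("repo_1/part-r-00000.avro,repo_2/part-r-00000.avro"));
        // prints [repo_1/part-r-00000.avro, repo_2/part-r-00000.avro]
    }
}
```

Each resulting entry would then still need glob expansion against the file system before being registered as a job input; that step is left out here because it depends on the Hadoop version in use.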