

Lei Liu 2013-04-13, 03:02
Prasanth J 2013-04-13, 08:54
Lei Liu 2013-04-13, 09:56
Re: ORDER failed
Hi Lei,

It seems something went wrong while creating the sampler. ORDER is not a
trivial command: Pig first samples the data and then uses that sample to
partition the sort. I guess that step failed:
Input path does not exist:
file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
I suppose pigsample is not a name that you used in your script, so maybe
Pig failed to create the sample file. I see that you are using the local
filesystem (file:/...). Try running the job on HDFS and we'll see what
happens.
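
As a rough illustration of why ORDER needs a sample at all, here is a small
Python sketch of sample-based range partitioning. It is not Pig's code: the
function names (pick_boundaries, range_partition, sampled_sort) are made up
for this example, it sorts ascending only, and it ignores the weighting that
WeightedRangePartitioner applies. The point is only that the partitioning
step cannot start until the sample exists, which matches the stack trace,
where WeightedRangePartitioner.setConf fails on the missing pigsample file.

```python
import bisect
import random


def pick_boundaries(keys, num_partitions, sample_size=100, seed=0):
    """Sample the keys and return up to num_partitions - 1 split points."""
    rng = random.Random(seed)
    sample = sorted(rng.sample(keys, min(sample_size, len(keys))))
    if not sample:
        return []  # nothing to sample: everything goes to one partition
    # Evenly spaced quantiles of the sorted sample become the boundaries.
    return [sample[(i * len(sample)) // num_partitions]
            for i in range(1, num_partitions)]


def range_partition(keys, boundaries):
    """Route each key to the partition whose key range contains it."""
    partitions = [[] for _ in range(len(boundaries) + 1)]
    for key in keys:
        partitions[bisect.bisect_right(boundaries, key)].append(key)
    return partitions


def sampled_sort(keys, num_partitions=4):
    """Sort by sampling, range-partitioning, then sorting each partition."""
    boundaries = pick_boundaries(keys, num_partitions)
    partitions = range_partition(keys, boundaries)
    result = []
    # Partitions cover increasing key ranges, so concatenating the
    # individually sorted partitions yields a globally sorted list.
    for part in partitions:
        result.extend(sorted(part))
    return result
```

In this sketch, pick_boundaries plays the role of the sampling job and
range_partition the role of the partitioner. If the first step produces
nothing usable, the second has no boundaries to read.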

Best Regards
On Sat, Apr 13, 2013 at 1:56 PM, Lei Liu <[EMAIL PROTECTED]> wrote:

> I am sure it's not that. The ORDER command fails the whole thing. If I
> remove the ORDER command, the same script runs just fine except the result
> is not in order.
>
>
> On Sat, Apr 13, 2013 at 4:54 PM, Prasanth J <[EMAIL PROTECTED]> wrote:
>
> > From the error logs, it seems the input file doesn't exist or is not
> > accessible.
> >
> > > Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
> > > Input path does not exist:
> > > file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
> >
> > Can you please check whether the input path in $LOGS is correct?
> >
> > Thanks
> > -- Prasanth
> >
> > On Apr 12, 2013, at 11:02 PM, Lei Liu <[EMAIL PROTECTED]> wrote:
> >
> > > Hi, I am using Pig to compute the percentage of each user agent in an
> > > Apache log. The following program fails because of the ORDER command at
> > > the very end (the result relation is correct and can be dumped
> > > correctly). I am relatively new to Pig and could not figure it out, so
> > > I need your help. The program and the error message follow. Thanks!
> > >
> > > logs = LOAD '$LOGS' USING ApacheCombinedLogLoader AS (remoteHost, hyphen,
> > >     user, time, method, uri, protocol, statusCode, responseSize, referer,
> > >     userAgent);
> > >
> > > uarows = FOREACH logs GENERATE userAgent;
> > > total = FOREACH (GROUP uarows ALL) GENERATE COUNT(uarows) as count;
> > > dump total;
> > >
> > > gpuarows = GROUP uarows BY userAgent;
> > > result = FOREACH gpuarows {
> > >       subtotal = COUNT(uarows);
> > >       GENERATE flatten(group) as ua, subtotal AS SUB_TOTAL,
> > >           100*(double)subtotal/(double)total.count AS percentage;
> > >       };
> > > orderresult = ORDER result BY SUB_TOTAL DESC;
> > > dump orderresult;
> > >
> > > -- What's weird is that 'dump result' works just fine, so it is the
> > > -- ORDER line that causes the trouble.
> > >
> > > Errors:
> > > 2013-04-13 10:36:32,409 [Thread-48] INFO org.apache.hadoop.mapred.MapTask - record buffer = 262144/327680
> > > 2013-04-13 10:36:32,437 [Thread-48] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0005
> > > java.lang.RuntimeException: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
> > >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:157)
> > >    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
> > >    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> > >    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:677)
> > >    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
> > >    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> > >    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> > > Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
> > >    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
> > >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigFileInputFormat.listStatus(PigFileInputFormat.java:37)
Lei Liu 2013-04-14, 01:11