Pig >> mail # user >> ORDER failed


Re: ORDER failed
Hi Lei,

It seems something went wrong while creating the sampler. ORDER is not a
trivial command: Pig first samples the data and then uses that sample to
partition the records for the sort. I guess the sampling step is what failed:

Input path does not exist:
file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017

I suppose pigsample is not a name that you used in your script, so maybe
Pig failed to create the sample file. I also see that you are using the local
filesystem (file:/...). Try running the job on HDFS and we'll see what
happens.
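
To make the sampler idea concrete, here is a rough Python sketch of a sample-based sort. It is only an illustration under my own assumptions, not Pig's actual code: the function names, the sample size and the partition count are all made up. The point is that the partitioning pass depends on the output of the sampling pass, which appears to be the role of the missing pigsample file.

```python
import bisect
import random


def sample_boundaries(keys, num_partitions, sample_size=100, seed=0):
    """Pass 1 (hypothetical): sample the keys and pick the partition cut points."""
    rng = random.Random(seed)
    sample = sorted(rng.sample(keys, min(sample_size, len(keys))))
    if not sample:
        return []
    # Evenly spaced quantiles of the sorted sample become the boundaries.
    return [sample[len(sample) * i // num_partitions]
            for i in range(1, num_partitions)]


def range_partition(keys, boundaries):
    """Pass 2 (hypothetical): send each key to the range that contains it."""
    partitions = [[] for _ in range(len(boundaries) + 1)]
    for key in keys:
        partitions[bisect.bisect_right(boundaries, key)].append(key)
    return partitions


def sample_sort(keys, num_partitions=4):
    """Sort by sampling, range partitioning, then sorting each partition."""
    boundaries = sample_boundaries(keys, num_partitions)
    partitions = range_partition(keys, boundaries)
    # Partitions cover increasing key ranges, so sorting each one and
    # concatenating them in order yields a total (ascending) order.
    return [key for part in partitions for key in sorted(part)]
```

If the boundaries from the first pass are not available, the second pass has nothing to partition with, which is consistent with an error raised while the partitioner is being set up.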

Best Regards
On Sat, Apr 13, 2013 at 1:56 PM, Lei Liu <[EMAIL PROTECTED]> wrote:

> I am sure it's not that. The ORDER command fails the whole thing. If I
> remove the ORDER command, the same script runs just fine except the result
> is not in order.
>
>
> On Sat, Apr 13, 2013 at 4:54 PM, Prasanth J <[EMAIL PROTECTED]> wrote:
>
> > From the error logs, it seems the input file doesn't exist or is not
> > accessible.
> >
> > > Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
> > > Input path does not exist:
> > > file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
> >
> > Can you please check that the input path in $LOGS is correct?
> >
> > Thanks
> > -- Prasanth
> >
> > On Apr 12, 2013, at 11:02 PM, Lei Liu <[EMAIL PROTECTED]> wrote:
> >
> > > Hi, I am using Pig to analyze the percentage of each UserAgent in an
> > > Apache log. The following program fails because of the ORDER command at
> > > the very end (the result variable is correct and can be dumped
> > > correctly). I am relatively new to Pig and could not figure it out, so I
> > > need your help. The program and error message follow. Thanks!
> > >
> > > logs = LOAD '$LOGS' USING ApacheCombinedLogLoader AS (remoteHost, hyphen,
> > > user, time, method, uri, protocol, statusCode, responseSize, referer,
> > > userAgent);
> > >
> > > uarows = FOREACH logs GENERATE userAgent;
> > > total = FOREACH (GROUP uarows ALL) GENERATE COUNT(uarows) as count;
> > > dump total;
> > >
> > > gpuarows = GROUP uarows BY userAgent;
> > > result = FOREACH gpuarows {
> > >       subtotal = COUNT(uarows);
> > >       GENERATE flatten(group) as ua, subtotal AS SUB_TOTAL,
> > > 100*(double)subtotal/(double)total.count AS percentage;
> > >       };
> > > orderresult = ORDER result BY SUB_TOTAL DESC;
> > > dump orderresult;
> > >
> > > -- what's weird is that 'dump result' works just fine, so it's the ORDER line that causes the trouble
> > >
> > > Errors:
> > > 2013-04-13 10:36:32,409 [Thread-48] INFO org.apache.hadoop.mapred.MapTask - record buffer = 262144/327680
> > > 2013-04-13 10:36:32,437 [Thread-48] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0005
> > > java.lang.RuntimeException: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
> > >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:157)
> > >    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
> > >    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> > >    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:677)
> > >    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
> > >    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> > >    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> > > Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
> > >    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
> > >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigFileInputFormat.listStatus(PigFileInputFormat.java:37)