Pig >> mail # user >> ORDER failed


Re: ORDER failed
From the error logs, it seems the input file doesn't exist or is not accessible.

> Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
> Input path does not exist:
> file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017

Can you please check whether the input path in $LOGS is correct?
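
If it helps, here is a minimal sketch of that check in Python. It assumes the path is on the local filesystem, which is what the file: prefix in the error suggests. The helper name check_input_path is hypothetical and not part of Pig or Hadoop; it only reports whether a path exists and is readable by the current user.

```python
import os
from urllib.parse import urlparse


def check_input_path(path):
    """Return (exists, readable) for a local path or a file: URI.

    Hypothetical helper for illustration only; not a Pig or Hadoop API.
    """
    parsed = urlparse(path)
    # Strip the "file:" scheme if present, as in the path from the error message.
    local = parsed.path if parsed.scheme == "file" else path
    exists = os.path.exists(local)
    readable = exists and os.access(local, os.R_OK)
    return exists, readable
```

Calling check_input_path with the value you pass as $LOGS, or with the path from the exception, should tell you which of the two cases (missing or unreadable) applies.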

Thanks
-- Prasanth

On Apr 12, 2013, at 11:02 PM, Lei Liu <[EMAIL PROTECTED]> wrote:

> Hi, I am using Pig to compute the percentage of each user agent in an
> Apache log. The following program fails at the ORDER command at the
> very end (the result variable is correct and can be dumped correctly).
> I am relatively new to Pig and could not figure it out, so I need your
> help. The program and error message follow. Thanks!
>
> logs = LOAD '$LOGS' USING ApacheCombinedLogLoader AS (remoteHost, hyphen,
> user, time, method, uri, protocol, statusCode, responseSize, referer,
> userAgent);
>
> uarows = FOREACH logs GENERATE userAgent;
> total = FOREACH (GROUP uarows ALL) GENERATE COUNT(uarows) as count;
> dump total;
>
> gpuarows = GROUP uarows BY userAgent;
> result = FOREACH gpuarows {
>       subtotal = COUNT(uarows);
>       GENERATE flatten(group) as ua, subtotal AS SUB_TOTAL,
> 100*(double)subtotal/(double)total.count AS percentage;
>       };
> orderresult = ORDER result BY SUB_TOTAL DESC;
> dump orderresult;
>
> -- what's weird is that 'dump result' works just fine, so it's the ORDER
> line that causes the trouble
>
> Errors:
> 2013-04-13 10:36:32,409 [Thread-48] INFO  org.apache.hadoop.mapred.MapTask
> - record buffer = 262144/327680
> 2013-04-13 10:36:32,437 [Thread-48] WARN
> org.apache.hadoop.mapred.LocalJobRunner - job_local_0005
> java.lang.RuntimeException:
> org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path
> does not exist:
> file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:157)
>    at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
>    at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>    at
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:677)
>    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
>    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>    at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
> Input path does not exist:
> file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
>    at
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigFileInputFormat.listStatus(PigFileInputFormat.java:37)
>    at
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
>    at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:177)
>    at
> org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:124)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:131)
>    ... 6 more
> 2013-04-13 10:36:32,525 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - HadoopJobId: job_local_0005
> 2013-04-13 10:36:32,526 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - Processing aliases orderresult
> 2013-04-13 10:36:32,526 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - detailed locations: M: orderresult[19,14] C:  R:
> 2013-04-13 10:36:37,536 [main] WARN
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to