Re: Problems loading a datafile..

I might still be missing something useful (we're running elephant-bird
from the gpl-packing distribution, and I've registered most of the
jarfiles from it), but the stack trace has changed a little, so now
it's producing:

Backend error message during job submission
-------------------------------------------
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hdfs://master.hadoop:9000/hadooptest/lzofile
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:269)
        at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
        at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
        at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
        at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
        at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.pig.PigException: ERROR 0: no files found a path hdfs://master.hadoop:9000/hadooptest/lzofile
        at com.twitter.elephantbird.pig.load.LzoBaseLoadFunc.slice(Unknown Source)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:260)
        ... 7 more

Pig Stack Trace
---------------
ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hdfs://master.hadoop:9000/hadooptest/lzofile

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias test4
        at org.apache.pig.PigServer.openIterator(PigServer.java:482)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:539)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
        at org.apache.pig.Main.main(Main.java:352)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hdfs://master.hadoop:9000/hadooptest/lzofile
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:176)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:253)
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:249)
        at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:781)
        at org.apache.pig.PigServer.store(PigServer.java:529)
        at org.apache.pig.PigServer.openIterator(PigServer.java:465)
        ... 6 more
===============================================================================
The "ERROR 0: no files found a path hdfs://master.hadoop:9000/hadooptest/lzofile"
message has me really puzzled, because in grunt I can see the files, I
can copy them to the local filesystem, rename them with .lzo on the end,
uncompress them, and see the data that I expect. I can even load
them with PigLoader (though obviously the data's all wrong when I do
that).

Any more tips?

Thanks,
Kris

On Wed, Mar 02, 2011 at 09:32:47AM -0800, Dmitriy Ryaboy wrote:
> Off the top of my head, I can't think of anything, but you can just grab
> everything in Elephant-Bird's lib/ directory and make sure it's on the
> classpath on all the task trackers and your client machine (you can
> propagate it to the TTs via the register keyword if you don't want to bug

Kris Coward http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3