Accumulo, mail # user - AccumuloFileOutputFormat class cannot be found by child jvm


RE: AccumuloFileOutputFormat class cannot be found by child jvm
Bob.Thorman@... 2012-05-22, 16:41
Yep.  Here's the script I'm using... everything is happy until the job
executes under the configuration that uses the AccumuloFileOutputFormat
class...

HADOOP_BIN=/cloudbase/hadoop-0.20.2/bin
ACCUMULO_BIN=/cloudbase/accumulo-1.4.0/bin

INGESTER_JAR=/mnt/hgfs/CSI.Cloudbase/Java/CloudbaseServices/out/artifacts/CloudbaseIngesters/CloudbaseIngesters.jar
PLACEMARK_CLASS=com.comcept.cloudbase.ingesters.placemarks.PlacemarkIngester
CONFIG=/mnt/hgfs/CSI.Cloudbase/Java/CloudbaseServices/out/artifacts/CloudbaseIngesters/placemark-config.xml

KXML_JAR=/usr/lib/ncct/kxml2-2.3.0.jar
XMLPULL_JAR=/usr/lib/ncct/xmlpull-1.1.3.1.jar
XSTREAM_JAR=/usr/lib/ncct/xstream-1.4.1.jar

INGESTER_LIBS=$KXML_JAR,$XMLPULL_JAR,$XSTREAM_JAR

$HADOOP_BIN/hadoop dfs -ls /
$HADOOP_BIN/hadoop dfs -rmr /output
$HADOOP_BIN/hadoop dfs -rmr /input
$HADOOP_BIN/hadoop dfs -mkdir /input
$HADOOP_BIN/hadoop dfs -mkdir /output
$HADOOP_BIN/hadoop dfs -mkdir /output/pfailures
$HADOOP_BIN/hadoop dfs -mkdir /output/gfailures
$HADOOP_BIN/hadoop dfs -mkdir /output/efailures
$HADOOP_BIN/hadoop dfs -mkdir /output/tfailures
$HADOOP_BIN/hadoop dfs -put ./*.kml /input

$ACCUMULO_BIN/tool.sh $INGESTER_JAR $PLACEMARK_CLASS -libjars $INGESTER_LIBS -c $CONFIG

Here is the code that initializes the first job in the chain...

// Pass the visibility string to the tasks through the job configuration.
conf.set(_sVisTag, ic.getVisibility());

Job job = new Job(conf, "NCCT Placemark Ingester");
job.setJarByClass(this.getClass());
job.setInputFormatClass(TextInputFormat.class);
job.setMapperClass(PlacemarkMapClass.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
job.setReducerClass(PlacemarkReduceClass.class);

// Write the reducer output as Accumulo files (RFiles) for bulk import.
job.setOutputFormatClass(AccumuloFileOutputFormat.class);
AccumuloFileOutputFormat.setZooKeeperInstance(conf, ic.getInstance(), ic.getZooKeeper());

Instance instance = new ZooKeeperInstance(ic.getInstance(), ic.getZooKeeper());
Connector connector = instance.getConnector(ic.getUserName(), password);

TextInputFormat.setInputPaths(job, new Path(ic.getHdfsInput()));
AccumuloFileOutputFormat.setOutputPath(job, new Path(ic.getHdfsOutput() + "/pfiles"));

job.waitForCompletion(true);

// Bulk-import the generated files into the target table; files that fail go to /pfailures.
connector.tableOperations().importDirectory(ic.getMetaTable(),
        ic.getHdfsOutput() + "/pfiles", ic.getHdfsOutput() + "/pfailures", false);
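
One thing worth double-checking with this setup: the -libjars option (whether supplied by tool.sh or by the script above) only reaches the child JVMs if Hadoop's GenericOptionsParser actually processes it, which normally means running the driver through ToolRunner. A minimal sketch of such an entry point, under the assumption that the driver can be structured as a Tool; the real PlacemarkIngester code is not shown in this thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Sketch only: a ToolRunner-based driver so that generic options such as
// -libjars are parsed and applied to the Configuration before the job is built.
public class PlacemarkIngester extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = getConf();  // already carries the -libjars setting
        // ... build and submit the job exactly as in the snippet above ...
        return 0;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new PlacemarkIngester(), args));
    }
}

With that structure, the remaining arguments (here, -c $CONFIG) arrive in run() after the generic options have been stripped.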

 

From: John Vines [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, May 22, 2012 09:57
To: [EMAIL PROTECTED]
Subject: Re: AccumuloFileOutputFormat class cannot be found by child jvm

 

Does your script utilize $ACCUMULO_HOME/bin/tool.sh to kick off the
MapReduce job? That script is similar to hadoop jar, but it adds the
Accumulo libraries to -libjars for you.
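
If the jars ever do have to be shipped by hand (e.g. without tool.sh), the usual options are -libjars or adding them to the distributed cache directly. A rough sketch of the manual route; the HDFS paths and jar versions below are placeholders, and the jars must already have been copied into HDFS:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;

// Sketch only: putting the Accumulo jars (and their dependencies) on the
// task classpath via the distributed cache, using the Hadoop 0.20 API.
// The paths are hypothetical examples, not from the thread.
public class AccumuloJarShipping {
    public static void addAccumuloJars(Configuration conf) throws IOException {
        DistributedCache.addFileToClassPath(new Path("/libjars/accumulo-core-1.4.0.jar"), conf);
        DistributedCache.addFileToClassPath(new Path("/libjars/zookeeper-3.3.1.jar"), conf);
        DistributedCache.addFileToClassPath(new Path("/libjars/libthrift-0.6.1.jar"), conf);
        DistributedCache.addFileToClassPath(new Path("/libjars/cloudtrace-1.4.0.jar"), conf);
    }
}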

John

On Tue, May 22, 2012 at 10:55 AM, <[EMAIL PROTECTED]> wrote:

Right now I'm using stand-alone mode, but is there another place I need
to put the jar file?
-----Original Message-----
From: John Armstrong [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, May 22, 2012 09:49
To: [EMAIL PROTECTED]
Subject: Re: AccumuloFileOutputFormat class cannot be found by child jvm

On 05/22/2012 10:40 AM, [EMAIL PROTECTED] wrote:
> I upgraded to accumulo-1.4.0 and updated my map/reduce jobs and now
> they don't run.  The parent classpath has the accumulo-core-1.4.0.jar
> file included.  Do the accumulo jar files have to be manually put on
> the distributed cache?  Any help is appreciated.

Just to check: did you replace the Accumulo JAR files on all the cluster
nodes?