Accumulo >> mail # user >> importdirectory in accumulo


Re: importdirectory in accumulo
I agree that HADOOP_CLASSPATH should be changed the way you suggested. I
can't do that just yet (people have jobs running and don't want to risk
it).

However, I found a workaround. My theory is that my HADOOP_CLASSPATH is
bad, so Hadoop can't pick up all the libraries I am passing to it. I
therefore decided to package all the libraries I need into a single jar,
as described here:
http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/
I downloaded the source code and built a shaded (uber) jar that includes
all the libraries I need, then submitted the Hadoop job with my uber jar
like any other MapReduce job. My mappers and reducers finish, but I get an
exception for waitForTableOperation. I think this supports my theory of a
bad classpath, but clearly I have more issues to deal with. If you have
any suggestions on how to even debug this, that would be awesome!

My console output (with a lot of server-specific details removed for
security) is below. I modified BulkIngestExample.java to add some print
statements; the modified lines are shown below as well.
[user@nodebulk]$ /opt/hadoop/bin/hadoop jar uber-BulkIngestExample.jar
instance zookeepers user password table inputdir tmp/bulk

13/04/05 11:20:52 INFO input.FileInputFormat: Total input paths to process : 1
13/04/05 11:20:53 INFO mapred.JobClient: Running job: job_201304021611_0045
13/04/05 11:20:54 INFO mapred.JobClient:  map 0% reduce 0%
13/04/05 11:21:10 INFO mapred.JobClient:  map 100% reduce 0%
13/04/05 11:21:25 INFO mapred.JobClient:  map 100% reduce 50%
13/04/05 11:21:26 INFO mapred.JobClient:  map 100% reduce 100%
13/04/05 11:21:31 INFO mapred.JobClient: Job complete: job_201304021611_0045
13/04/05 11:21:31 INFO mapred.JobClient: Counters: 25
13/04/05 11:21:31 INFO mapred.JobClient:   Job Counters
13/04/05 11:21:31 INFO mapred.JobClient:     Launched reduce tasks=2
13/04/05 11:21:31 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=15842
13/04/05 11:21:31 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/04/05 11:21:31 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/04/05 11:21:31 INFO mapred.JobClient:     Rack-local map tasks=1
13/04/05 11:21:31 INFO mapred.JobClient:     Launched map tasks=1
13/04/05 11:21:31 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=25891
13/04/05 11:21:31 INFO mapred.JobClient:   File Output Format Counters
13/04/05 11:21:31 INFO mapred.JobClient:     Bytes Written=496
13/04/05 11:21:31 INFO mapred.JobClient:   FileSystemCounters
13/04/05 11:21:31 INFO mapred.JobClient:     FILE_BYTES_READ=312
13/04/05 11:21:31 INFO mapred.JobClient:     HDFS_BYTES_READ=421
13/04/05 11:21:31 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=68990
13/04/05 11:21:31 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=496
13/04/05 11:21:31 INFO mapred.JobClient:   File Input Format Counters
13/04/05 11:21:31 INFO mapred.JobClient:     Bytes Read=280
13/04/05 11:21:31 INFO mapred.JobClient:   Map-Reduce Framework
13/04/05 11:21:31 INFO mapred.JobClient:     Reduce input groups=10
13/04/05 11:21:31 INFO mapred.JobClient:     Map output materialized bytes=312
13/04/05 11:21:31 INFO mapred.JobClient:     Combine output records=0
13/04/05 11:21:31 INFO mapred.JobClient:     Map input records=10
13/04/05 11:21:31 INFO mapred.JobClient:     Reduce shuffle bytes=186
13/04/05 11:21:31 INFO mapred.JobClient:     Reduce output records=10
13/04/05 11:21:31 INFO mapred.JobClient:     Spilled Records=20
13/04/05 11:21:31 INFO mapred.JobClient:     Map output bytes=280
13/04/05 11:21:31 INFO mapred.JobClient:     Combine input records=0
13/04/05 11:21:31 INFO mapred.JobClient:     Map output records=10
13/04/05 11:21:31 INFO mapred.JobClient:     SPLIT_RAW_BYTES=141
13/04/05 11:21:31 INFO mapred.JobClient:     Reduce input records=10

Here is the exception caught:
org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForTableOperation

E.getMessage returns:
Internal error processing waitForTableOperation
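
The getMessage() output above only shows the outermost message. As a debugging aid, here is a minimal sketch of a helper that lists every exception in the cause chain. It is not code from BulkIngestExample: the class name CauseChain and the method describe are hypothetical, and the exceptions built in main are stand-ins for whatever importDirectory actually throws. My assumption is that a Thrift TApplicationException reporting "Internal error" at the bottom of the chain usually means the handler on the server side threw, so the master log is probably the next place to look.

```java
import java.util.ArrayList;
import java.util.List;

public class CauseChain {
    /** Returns one line per throwable in the cause chain, outermost first. */
    public static List<String> describe(Throwable t) {
        List<String> lines = new ArrayList<>();
        int depth = 0;
        // Cap the depth so a self-referential cause chain cannot loop forever.
        while (t != null && depth < 32) {
            lines.add(depth + ": " + t.getClass().getName() + ": " + t.getMessage());
            t = t.getCause();
            depth++;
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in exceptions (assumption): real code would pass the exception
        // caught around the importDirectory call instead.
        Exception root = new IllegalStateException(
                "Internal error processing waitForTableOperation");
        Exception wrapped = new RuntimeException(new Exception("wrapper", root));
        for (String line : describe(wrapped)) {
            System.out.println(line);
        }
    }
}
```

In the catch block, something like `for (String line : CauseChain.describe(e)) System.err.println(line);` would print the whole chain in place of the single getMessage() line.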
Exception in thread "main" java.lang.RuntimeException: org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForTableOperation
        at org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample.run(BulkIngestExample.java:151)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample.main(BulkIngestExample.java:166)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForTableOperation
        at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:290)
        at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:258)
        at org.apache.accumulo.core.client.admin.TableOperationsImpl.importDirectory(TableOperationsImpl.java:945)
        at org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample.run(BulkIngestExample.java:146)
        ... 7 more
Caused by: org.apache.thrift.TApplicationException: Internal error processing waitForTableOperation
        at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
        at org.apache.accumulo.core.master.thrift.MasterClientService$Client.recv_waitForTableOperation(MasterClientService.java:684)
        at org.apache.accumulo.core.master.thrift.MasterClientService$Client.waitForTableOperation(MasterClientService.java:665)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        a