Re: importdirectory in accumulo
Keith Turner 2013-04-08, 18:14
On Fri, Apr 5, 2013 at 6:01 PM, David Medinets <[EMAIL PROTECTED]> wrote:
> I ran into this issue. Look in your log files for a directory-not-found
> exception that is not bubbled up to the bash shell.

Could the following issue be the problem?

https://issues.apache.org/jira/browse/ACCUMULO-1171

David, regarding the issue you ran into: if you know of a situation where
bulk import errors are not propagated back to the client, can you open a
ticket?
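
One common source of the directory-not-found failure described above is the
failures directory passed to the bulk import not existing (or not being empty)
when the import starts. As a rough illustration only, a pre-flight check using
Hadoop's FileSystem API might look like the sketch below; the paths and class
name are hypothetical placeholders, not code from BulkIngestExample.

    // Sketch only: pre-create the bulk import failures directory so the import
    // does not fail with a directory-not-found error. Paths are placeholders.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class PrepareBulkImportDirs {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path files = new Path("/tmp/bulk/files");       // RFiles produced by the MapReduce job
        Path failures = new Path("/tmp/bulk/failures"); // should exist and be empty before import

        if (!fs.exists(files)) {
          throw new IllegalStateException("bulk files directory does not exist: " + files);
        }
        fs.delete(failures, true); // clear any leftovers from a previous run
        fs.mkdirs(failures);
      }
    }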

>
> On Apr 5, 2013 11:37 AM, "Aji Janis" <[EMAIL PROTECTED]> wrote:
>>
>> I agree that changing HADOOP_CLASSPATH as you suggested should be done. I
>> couldn't quite do that just yet (people have jobs running and I don't want
>> to risk it).
>>
>> However, I found a workaround. (I am going off the theory that my
>> HADOOP_CLASSPATH is bad, so it can't pick up all the libraries I am passing
>> to it, so I decided to package all the libraries I needed into a single jar:
>> http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/)
>> I downloaded the source code and built a shaded (uber) jar that includes all
>> the libraries I needed, then submitted the Hadoop job with that uber jar like
>> any other MapReduce job. My mappers and reducers finish, but I got an
>> exception from waitForTableOperation. I think this supports my theory of a
>> bad classpath, but clearly I have more issues to deal with. If you have any
>> suggestions on how to even debug this, that would be awesome!
>>
>> My console output (with a lot of server-specific details removed for
>> security) is below. I modified BulkIngestExample.java to add some print
>> statements; the modified lines are also shown below.
>>
>>
>> [user@nodebulk]$ /opt/hadoop/bin/hadoop jar uber-BulkIngestExample.jar
>> instance zookeepers user password table inputdir tmp/bulk
>>
>> 3/04/05 11:20:52 INFO input.FileInputFormat: Total input paths to process
>> : 1
>> 13/04/05 11:20:53 INFO mapred.JobClient: Running job:
>> job_201304021611_0045
>> 13/04/05 11:20:54 INFO mapred.JobClient:  map 0% reduce 0%
>> 13/04/05 11:21:10 INFO mapred.JobClient:  map 100% reduce 0%
>> 13/04/05 11:21:25 INFO mapred.JobClient:  map 100% reduce 50%
>> 13/04/05 11:21:26 INFO mapred.JobClient:  map 100% reduce 100%
>> 13/04/05 11:21:31 INFO mapred.JobClient: Job complete:
>> job_201304021611_0045
>> 13/04/05 11:21:31 INFO mapred.JobClient: Counters: 25
>> 13/04/05 11:21:31 INFO mapred.JobClient:   Job Counters
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Launched reduce tasks=2
>> 13/04/05 11:21:31 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=15842
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Total time spent by all
>> reduces waiting after reserving slots (ms)=0
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Total time spent by all maps
>> waiting after reserving slots (ms)=0
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Rack-local map tasks=1
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Launched map tasks=1
>> 13/04/05 11:21:31 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=25891
>> 13/04/05 11:21:31 INFO mapred.JobClient:   File Output Format Counters
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Bytes Written=496
>> 13/04/05 11:21:31 INFO mapred.JobClient:   FileSystemCounters
>> 13/04/05 11:21:31 INFO mapred.JobClient:     FILE_BYTES_READ=312
>> 13/04/05 11:21:31 INFO mapred.JobClient:     HDFS_BYTES_READ=421
>> 13/04/05 11:21:31 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=68990
>> 13/04/05 11:21:31 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=496
>> 13/04/05 11:21:31 INFO mapred.JobClient:   File Input Format Counters
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Bytes Read=280
>> 13/04/05 11:21:31 INFO mapred.JobClient:   Map-Reduce Framework
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Reduce input groups=10
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Map output materialized
>> bytes=312
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Combine output records=0
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Map input records=10
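
For context on the waitForTableOperation exception mentioned above: that call
happens inside the Accumulo client while it waits on the bulk import table
operation, so the error typically surfaces from the client-side importDirectory
call that examples like BulkIngestExample make after the MapReduce job finishes.
The following is a minimal sketch against the Accumulo 1.5-era client API; the
instance name, ZooKeeper hosts, credentials, table name, and paths are
placeholders, not values from the thread.

    // Sketch only: client-side bulk import call; connection details and paths are placeholders.
    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.accumulo.core.client.security.tokens.PasswordToken;

    public class BulkImportClientSketch {
      public static void main(String[] args) throws Exception {
        ZooKeeperInstance instance = new ZooKeeperInstance("instance", "zk1:2181,zk2:2181");
        // On Accumulo 1.4 the password is passed directly instead of a PasswordToken.
        Connector conn = instance.getConnector("user", new PasswordToken("password"));

        if (!conn.tableOperations().exists("table")) {
          conn.tableOperations().create("table");
        }

        // An exception reported from waitForTableOperation would likely surface here,
        // since importDirectory is the table operation the client waits on.
        conn.tableOperations().importDirectory("table", "tmp/bulk/files", "tmp/bulk/failures", false);
      }
    }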