MapReduce, mail # user - Re: TaskStatus Exception using HFileOutputFormat
Re: TaskStatus Exception using HFileOutputFormat
Sean McNamara 2013-02-06, 21:46

> Can you check whether the HDFS-related config was passed to the Job correctly?

Ahhh, that was it! It wasn't picking up the .xml files. Fixed that and it seems to be working now.

Thank you for your help!!!

Sean
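For anyone hitting the same symptom: the usual cause is that the cluster's core-site.xml / hdfs-site.xml are not on the submitting JVM's classpath, so fs.default.name silently falls back to the local filesystem. An illustrative core-site.xml entry is sketched below; the namenode host and port are assumptions, not values from this thread:

```xml
<!-- core-site.xml: must be visible on the job client's classpath so that
     Path.getFileSystem(conf) resolves to HDFS rather than file:/// -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>
```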
From: Ted Yu <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Date: Wednesday, February 6, 2013 2:25 PM
To: [EMAIL PROTECTED]
Subject: Re: TaskStatus Exception using HFileOutputFormat

Thanks for this information. Here is the related code:

  public static void configureIncrementalLoad(Job job, HTable table)
      throws IOException {
    Configuration conf = job.getConfiguration();
    ...
    Path partitionsPath = new Path(job.getWorkingDirectory(),
                                   "partitions_" + UUID.randomUUID());
    LOG.info("Writing partition information to " + partitionsPath);
    FileSystem fs = partitionsPath.getFileSystem(conf);
    writePartitions(conf, partitionsPath, startKeys);
    partitionsPath.makeQualified(fs);

Can you check whether the HDFS-related config was passed to the Job correctly?

Thanks

On Wed, Feb 6, 2013 at 1:15 PM, Sean McNamara <[EMAIL PROTECTED]> wrote:
Ok, a bit more info: from what I can tell, the partitions file is being placed into the working dir on the node I launch from, and the task trackers are trying to look for that file, which doesn't exist where they run (since they are on other nodes).
Here is the exception on the TT in case it is helpful:
2013-02-06 17:05:13,002 WARN org.apache.hadoop.mapred.TaskTracker: Exception while localization java.io.FileNotFoundException: File /opt/jobs/MyMapreduceJob/partitions_1360170306728 does not exist.
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
        at org.apache.hadoop.filecache.TaskDistributedCacheManager.setupCache(TaskDistributedCacheManager.java:179)
        at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1212)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1203)
        at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
        at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
        at java.lang.Thread.run(Thread.java:662)
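That diagnosis fits the code path: configureIncrementalLoad writes partitions_<uuid> relative to job.getWorkingDirectory(), so the file only lands on HDFS if the default filesystem actually is HDFS. A rough pure-JDK sketch of how the path qualification goes wrong is below; it loosely mimics the behavior with java.net.URI and is not Hadoop's actual implementation, and the hostnames and directories are illustrative assumptions:

```java
import java.net.URI;

public class PartitionsPathDemo {
    // Loosely mimics how a working-directory-relative path is qualified
    // against the configured default filesystem. Pure-JDK sketch only.
    static String qualify(String defaultFs, String workingDir, String name) {
        return URI.create(defaultFs).resolve(workingDir + "/" + name).toString();
    }

    public static void main(String[] args) {
        // With the cluster config on the classpath, the partitions file is
        // qualified against HDFS and every TaskTracker can fetch it:
        System.out.println(qualify("hdfs://namenode:8020", "/user/sean", "partitions_1360170306728"));
        // With defaults, fs.default.name is file:/// and the file exists only
        // on the submitting node -- matching the FileNotFoundException above:
        System.out.println(qualify("file:///", "/opt/jobs/MyMapreduceJob", "partitions_1360170306728"));
    }
}
```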

From: Sean McNamara <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Date: Wednesday, February 6, 2013 9:35 AM
To: [EMAIL PROTECTED]
Subject: Re: TaskStatus Exception using HFileOutputFormat

> Using the below construct, do you still get the exception?

Correct, I am still getting this exception.

Sean

From: Ted Yu <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Date: Tuesday, February 5, 2013 7:50 PM
To: [EMAIL PROTECTED]
Subject: Re: TaskStatus Exception using HFileOutputFormat

Using the below construct, do you still get the exception?

Please consider upgrading to Hadoop 1.0.4.

Thanks

On Tue, Feb 5, 2013 at 4:55 PM, Sean McNamara <[EMAIL PROTECTED]> wrote:
> Can you tell us the HBase and Hadoop versions you were using?

Ahh yes, sorry I left that out:

Hadoop: 1.0.3
HBase: 0.92.0

Our code is as follows:

    HTable table = new HTable(conf, configHBaseTable);
    FileOutputFormat.setOutputPath(job, outputDir);
    HFileOutputFormat.configureIncrementalLoad(job, table);

Thanks!

From: Ted Yu <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Date: Tuesday, February 5, 2013 5:46 PM
To: [EMAIL PROTECTED]
Subject: Re: TaskStatus Exception using HFileOutputFormat

Can you tell us the HBase and Hadoop versions you were using?

    HFileOutputFormat.configureIncrementalLoad(job, table);

    FileOutputFormat.setOutputPath(job, outDir);

I guess you have used the above construct?

Cheers

On Tue, Feb 5, 2013 at 4:31 PM, Sean McNamara <[EMAIL PROTECTED]> wrote:

We're trying to use HFileOutputFormat for bulk HBase loading. When using HFileOutputFormat's setOutputPath or configureIncrementalLoad, the job is unable to run. The error I see in the jobtracker logs is: Trying to set finish time for task attempt_201301030046_123198_m_000002_0 when no start time is set, stackTrace is : java.lang.Exception

If I remove any references to HFileOutputFormat and use FileOutputFormat.setOutputPath, things seem to run great. Does anyone know what could be causing the TaskStatus error when using HFileOutputFormat?

Thanks,

Sean
What I see on the Job Tracker:

2013-02-06 00:17:33,685 ERROR org.apache.hadoop.mapred.TaskStatus: Trying to set finish time for task attempt_201301030046_123198_m_000002_0 when no start time is set, stackTrace is : java.lang.Exception
        at org.apache.hadoop.mapred.TaskStatus.setFinishTime(TaskStatus.java:145)
        at org.apache.hadoop.mapred.TaskInProgress.incompleteSubTask(TaskInProgress.java:670)
  