Hadoop >> mail # user >> Re: no _SUCCESS file in MR output directory.


Re: no _SUCCESS file in MR output directory.
Good observation: Pig does seem to default this to "false" where possible,
to disable _SUCCESS creation. I don't see Hive doing that, nor any
of the stock Apache Hadoop MR jobs.

Rahul - Do you use a Pig action in your WF? Also, are you definitely
seeing _SUCCESS being created after you add the option manually?
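
For reference, the manual override in question could look roughly like the
fragment below when placed in a map-reduce action. This is only a sketch: the
action name, the ok/error transitions and the ${jobTracker}/${nameNode}
parameters are placeholders, not taken from this thread. Only the property
name comes from the discussion.

```xml
<!-- Hypothetical map-reduce action; names and transitions are placeholders. -->
<action name="mr-node">
    <map-reduce>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <!-- Ask FileOutputCommitter to write the _SUCCESS marker. -->
            <property>
                <name>mapreduce.fileoutputcommitter.marksuccessfuljobs</name>
                <value>true</value>
            </property>
        </configuration>
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

If the marker still does not appear with this set, the other causes listed
further down the thread (a custom output committer, or the file being deleted
afterwards) would be the next things to check.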

On Mon, May 6, 2013 at 7:54 PM, Eduardo Afonso Ferreira
<[EMAIL PROTECTED]> wrote:
> I'm not sure whether Pig disables it, but I remember having issues when the MR jobs tried to create that file, because Oozie or Pig was removing temporary directories or something like that. I remember seeing an exception about a failure to create the _SUCCESS file, so I started using the following property in my workflow's pig action to disable it:
>
>         <pig>
>             ...
>             <configuration>
>
>                 <property>
>                     <name>mapreduce.fileoutputcommitter.marksuccessfuljobs</name>
>                     <value>false</value>
>                 </property>
>                 ...
>             </configuration>
>             ...
>         </pig>
>
>
>
> ________________________________
>  From: Rahul Bhattacharjee <[EMAIL PROTECTED]>
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; [EMAIL PROTECTED]
> Sent: Monday, May 6, 2013 3:48 AM
> Subject: Re: no _SUCCESS file in MR output directory.
>
>
> I wanted to confirm whether Oozie disables _SUCCESS file creation when
> it triggers an MR job.
>
> I am triggering an MR job (actually a bunch of 'em) from Oozie and the
> workflow completes successfully; however, I do not see any _SUCCESS
> file in the output directory.
> When I set the file output committer's configuration property
> (mapreduce.fileoutputcommitter.marksuccessfuljobs) to true, it generates
> the _SUCCESS file. I wanted to confirm whether Oozie is what disables
> _SUCCESS file creation.
>
> Thanks,
> Rahul
>
>
> On Mon, May 6, 2013 at 12:34 PM, Rahul Bhattacharjee <
> [EMAIL PROTECTED]> wrote:
>
>> Oozie is being used to trigger the MR job. It looks like Oozie disables
>> _SUCCESS file creation using the FileOutputCommitter configuration that
>> you mentioned.
>>
>> I have enabled it by setting this property in conf.
>>
>> Rahul
>>
>>
>> On Mon, May 6, 2013 at 9:38 AM, Rahul Bhattacharjee <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Thanks Harsh for the pointers. I will find out more on this.
>>>
>>>
>>> On Sun, May 5, 2013 at 11:26 PM, Harsh J <[EMAIL PROTECTED]> wrote:
>>>
>>>> I can think of a few, most obvious ones:
>>>>
>>>> 1. Job didn't succeed and/or the file was deleted (*shields self*)
>>>> 2. Job overrode the default FileOutputCommitter with something that
>>>> doesn't do success marking.
>>>> 3. Job specifically asked to not create such files, via config
>>>> mapreduce.fileoutputcommitter.marksuccessfuljobs or so, set to false.
>>>>
>>>> On Sun, May 5, 2013 at 9:54 PM, Rahul Bhattacharjee
>>>> <[EMAIL PROTECTED]> wrote:
>>>> > Hi,
>>>> >
>>>> >
>>>> > A few days back, I was going through an MR job's output, but there
>>>> > wasn't any _SUCCESS file in the output directory.
>>>> > I was wondering what the possible reasons for this (no _SUCCESS file) are.
>>>> >
>>>> > Thanks,
>>>> > Rahul
>>>>
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>>
>>>
>>>
>>

--
Harsh J