MapReduce >> mail # dev >> In the constructor of JobInProgress, why is it safe to call FileSystem.closeAllForUGI()?

Xiao Yu 2013-03-21, 22:42
Re: In the constructor of JobInProgress, why is it safe to call FileSystem.closeAllForUGI()?
The current user (UserGroupInformation.getCurrentUser()) is the user
active in the RPC call that is invoking these functions, not
necessarily the JT user.

However, given that the JIP construction happens outside of a synchronized
step and can potentially run in parallel with another JIP request,
you may have identified a bug here.

I've not seen this happen though, even at high loads of submits from a
single user (where I think this could happen). Can you detail your
changes, because the error could be related to them as well? The UGI
comparison inside closeAllForUGI is probably protective enough, but
it'd still be worth looking into.
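
To make the "UGI compare" remark concrete, here is a minimal sketch of a
cache whose close-all step only touches entries registered under the exact
caller identity it is given. It assumes the key is compared by object
identity rather than by user name. HandleCache, CallerIdentity and Handle are
hypothetical names invented for this illustration; none of this is the actual
Hadoop FileSystem cache code, and whether the real closeAllForUGI behaves this
way in hadoop-1.1.0 would need to be checked against the source.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UgiCacheSketch {
    /** Hypothetical stand-in for a per-RPC caller identity (assumption: identity equality). */
    static final class CallerIdentity {
        final String userName;
        CallerIdentity(String userName) { this.userName = userName; }
        // equals/hashCode are deliberately not overridden, so two instances
        // carrying the same user name are still different cache keys.
    }

    /** Hypothetical stand-in for a cached file system handle. */
    static final class Handle {
        final String purpose;
        boolean open = true;
        Handle(String purpose) { this.purpose = purpose; }
    }

    /** Simplified cache: handles are grouped by the identity that opened them. */
    static final class HandleCache {
        private final Map<CallerIdentity, List<Handle>> cache = new HashMap<>();

        synchronized Handle open(CallerIdentity who, String purpose) {
            Handle h = new Handle(purpose);
            cache.computeIfAbsent(who, k -> new ArrayList<>()).add(h);
            return h;
        }

        /** Closes every handle registered under exactly this identity; returns the count. */
        synchronized int closeAllFor(CallerIdentity who) {
            List<Handle> handles = cache.remove(who);
            if (handles == null) {
                return 0;
            }
            for (Handle h : handles) {
                h.open = false;
            }
            return handles.size();
        }
    }

    public static void main(String[] args) {
        HandleCache cache = new HandleCache();

        // Two submissions by the same user name, each with its own identity object.
        CallerIdentity rpcA = new CallerIdentity("ec2-user");
        CallerIdentity rpcB = new CallerIdentity("ec2-user");
        Handle tokenWriterA = cache.open(rpcA, "jobToken of job A");
        cache.open(rpcB, "job.xml of job B");

        // Job B's cleanup runs while job A is still using its handle.
        cache.closeAllFor(rpcB);
        System.out.println("A still open (distinct identities): " + tokenWriterA.open);

        // If both jobs shared one identity object, the same cleanup would hit A too.
        Handle secondUnderA = cache.open(rpcA, "history file of another job");
        cache.closeAllFor(rpcA);
        System.out.println("A still open (shared identity): " + tokenWriterA.open);
        System.out.println("Other handle open (shared identity): " + secondUnderA.open);
    }
}
```

In this model the first cleanup leaves job A's handle open, because rpcA and
rpcB are different keys even though both carry the name ec2-user. The second
cleanup closes everything registered under rpcA. That second case is the
hazard described in the question: it can only occur if two concurrent
constructions end up sharing one identity.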

On Fri, Mar 22, 2013 at 4:12 AM, Xiao Yu <[EMAIL PROTECTED]> wrote:
> Hi,
> This might be a naive question, but I am having a difficult time
> understanding it. At the end of the constructor of JobInProgress, in the
> finally clause, the code calls
> FileSystem.closeAllForUGI(UserGroupInformation.getCurrentUser()), but why
> is it safe?
> My concern is that the current user is the owner of jobtracker, so it will
> close all the files the jobtracker is writing, such as a jobtoken file
> another jip is currently writing.
> I modified some of the code of hadoop-1.1.0 for my research project and saw
> the following error. It could be some bug in my code, but I suspect it is a
> combined effect of this closeAllForUGI function and perhaps a race
> condition in the DFSClient$LeaseChecker.close().
> Could you help me understand why it is safe to call this
> FileSystem.closeAllForUGI function at the end of the JobInProgress
> constructor?
> Thank you very much indeed.
> ==>
> 2013-03-21 21:04:56,677 INFO org.apache.hadoop.mapred.JobTracker:
> Initializing job_201303212104_0005
> 2013-03-21 21:04:56,677 INFO org.apache.hadoop.mapred.JobInProgress:
> Initializing job_201303212104_0005
> 2013-03-21 21:04:56,841 INFO org.apache.hadoop.hdfs.DFSClient:
> read.type=dfs block=blk_700146088908855679_2191
> src=/home/ec2-user/Hadoop/tmp/mapred/staging/ec2-user/.staging/job_201303212104_0006/job.xml
> 2013-03-21 21:04:56,897 INFO org.apache.hadoop.hdfs.DFSClient: ===streamer
> closed here===:org.apache.hadoop.hdfs.DFSClient$DFSOutputStream@239cd5f5dfsclient
> DFSClient[clientName=DFSClient_-536745704, ugi=ec2-user]
> leasechecker remove source:
> /test/out/intrecord/4/_logs/history/job_201303212104_0005_conf.xml
> 2013-03-21 21:04:56,898 INFO org.apache.hadoop.mapred.JobInProgress:
> ===generating job token==
> 2013-03-21 21:04:56,931 INFO org.apache.hadoop.mapred.JobInProgress:
> job_201303212104_0006: nMaps=1 nReduces=0 max=-1
> 2013-03-21 21:04:56,932 INFO org.apache.hadoop.mapred.JobInProgress:
> ===closeAllForUGI here===:ec2-user jobid= job_201303212104_0006
> *2013-03-21 21:04:56,973 INFO org.apache.hadoop.hdfs.DFSClient: ===streamer
> closed here===:org.apache.hadoop.hdfs.DFSClient$DFSOutputStream@6d56d7c8dfsclient
> DFSClient[clientName=DFSClient_1735333485, ugi=ec2-user]
> leasechecker remove source:
> /home/ec2-user/Hadoop/tmp/mapred/system/job_201303212104_0005/jobToken*
> 2013-03-21 21:04:56,973 INFO org.apache.hadoop.mapred.JobInProgress:
> jobToken generated and stored with users keys in
> /home/ec2-user/Hadoop/tmp/mapred/system/job_201303212104_0005/jobToken
> *2013-03-21 21:04:56,973 INFO org.apache.hadoop.hdfs.DFSClient: ===streamer
> null to close===:org.apache.hadoop.hdfs.DFSClient$DFSOutputStream@6d56d7c8dfsclient
> DFSClient[clientName=DFSClient_1735333485, ugi=ec2-user]
> leasechecker should remove source:
> /home/ec2-user/Hadoop/tmp/mapred/system/job_201303212104_0005/jobToken*
> 2013-03-21 21:04:56,974 ERROR
> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
> as:ec2-user cause:java.io.IOException: java.lang.NullPointerException
> *2013-03-21 21:04:56,976 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 4 on 9101, call submitJob(job_201303212104_0006, hdfs://

Harsh J
Xiao Yu 2013-03-22, 15:04
Devaraj Das 2013-03-29, 15:08