-
libhdfs process fork problem
Tareq Aljabban 2012-03-23, 04:38
Hi, I'm using the libhdfs C interface to access HDFS. The application has a master process that forks many child processes, and these children try to connect to HDFS. One of the children exits as soon as it reaches hdfsConnect(). I tried to reproduce this in a separate application where the master forks two processes that connect to HDFS and send requests, and that worked without any problem, so now I'm confused. What is really causing the connection problem? Any insights are much appreciated.
-
Re: libhdfs process fork problem
Brian Bockelman 2012-03-23, 11:25
Hi Tareq,
This is because libhdfs will keep a bit of state data (especially if the master was connected to HDFS).
Three suggestions:
1) [Likely to work] Fork the children first, then do any HDFS actions in the master. Alternately, don't have the master do any HDFS actions; have it fork a child which does them for it.
2) [May work] Try disconnecting the master from HDFS, then forking the children.
3) [If 2 doesn't work] Have the master disconnect, then fork the children, then have the children exec.
Brian
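Suggestion 3 (fork, then exec) can be sketched roughly as below. This is only an illustration under stated assumptions: `spawn_exec_worker` is a hypothetical helper name, not part of libhdfs, and the worker program it launches is assumed to be a separate executable that calls hdfsConnect() itself. Because exec replaces the child's process image, nothing libhdfs (or a JVM) may have set up in the master is carried over into the worker.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a libhdfs function): fork a child, exec the
 * program named by argv[0] (looked up on PATH), and wait for it.
 * The exec'd program is assumed to be the one that calls hdfsConnect().
 *
 * Returns the child's exit code, 127 if the exec itself failed,
 * or -1 if fork/waitpid failed or the child was killed by a signal.
 */
int spawn_exec_worker(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;                 /* fork failed */
    }

    if (pid == 0) {
        /* Child: replace the inherited process image right away. */
        execvp(argv[0], argv);
        _exit(127);                /* only reached if exec failed */
    }

    /* Parent: wait for this child, retrying if interrupted by a signal. */
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return -1;
        }
    }

    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);
    }
    return -1;                     /* terminated by a signal */
}
```

A master would call this once per worker, passing an argv array such as `{"my_hdfs_worker", NULL}` (a made-up program name). A real master with many children would likely fork them all first and collect them afterwards rather than waiting on each one in turn.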
-
Re: libhdfs process fork problem
Tareq Aljabban 2012-03-27, 15:07
Hi Brian, thanks for your response. Actually, the master process doesn't even try to access HDFS. What happens is that the master forks two processes, A and B, and A and B are the ones that try to connect to HDFS. I tried doing this in a separate application and it worked, but for some reason it doesn't work in the application I'm actually working on. Since I'm not connecting from the master, I can't really apply any of your suggestions. Do you have any other ideas? Thanks
-
Re: libhdfs process fork problem
Brian Bockelman 2012-03-27, 18:49
Hi Tareq,
Sorry, I'm out of ideas, short of doing something like running strace on the process to verify that the master never tries to initialize a JVM.
If exec'ing doesn't work either, then it is certainly something in the environment.
Brian
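As a lighter-weight complement to strace, the master could check from inside whether a JVM library has been mapped into its address space before it forks. The sketch below is an assumption-laden illustration, not something from libhdfs: it is Linux-specific (it reads /proc/self/maps), the function name `jvm_mapped_in_this_process` is made up, and it assumes the JVM shared library has "libjvm" in its file name.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical diagnostic (Linux only): scan /proc/self/maps for a mapping
 * whose path contains "libjvm", on the assumption that this is how the JVM
 * shared library is named.
 *
 * Returns 1 if such a mapping is found, 0 if not,
 * and -1 if /proc/self/maps cannot be opened.
 */
int jvm_mapped_in_this_process(void)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL) {
        return -1;
    }

    char line[8192];
    int found = 0;
    while (fgets(line, sizeof line, maps) != NULL) {
        if (strstr(line, "libjvm") != NULL) {
            found = 1;
            break;
        }
    }

    fclose(maps);
    return found;
}
```

Calling this in the master just before fork() and logging the result would show whether some other library or code path has already loaded a JVM there. A result of 0 does not prove that nothing else was inherited, so it narrows the search rather than settling it.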