HDFS, mail # dev - Exception in mid of reading files.


Re: Exception in mid of reading files.
Chris Nauroth 2013-11-11, 18:16
Divya, thank you for reporting back on this.  Nicholas and I had an offline
conversation and came to the conclusion that this is likely to be a
different problem from HDFS-3373.  Although the symptoms look similar, the
socket caching code mentioned in HDFS-3373 is not present in branch-1.
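
The symptom reported in the quoted message below (roughly 52000 netstat entries for port 50010) can also be tracked per TCP state, which helps separate connections still held open from those lingering in TIME_WAIT or CLOSE_WAIT. The following is only a minimal sketch: it assumes Linux-style `netstat -ant` column order, and the class name, method name, and sample addresses are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

public class DataNodeConnectionCount {

    /** Counts netstat lines whose foreign address ends in ":port", grouped by TCP state. */
    public static Map<String, Integer> countByState(String netstatOutput, int port) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        String suffix = ":" + port;
        for (String line : netstatOutput.split("\\r?\\n")) {
            String[] cols = line.trim().split("\\s+");
            // Assumed layout: Proto Recv-Q Send-Q Local-Address Foreign-Address State
            if (cols.length < 6 || !cols[0].startsWith("tcp")) {
                continue; // skip headers and non-TCP rows
            }
            if (cols[4].endsWith(suffix)) {
                Integer old = counts.get(cols[5]);
                counts.put(cols[5], old == null ? 1 : old + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample; in practice this would be the captured netstat output.
        String sample =
              "Proto Recv-Q Send-Q Local Address   Foreign Address  State\n"
            + "tcp   0      0      10.0.0.5:41000  10.0.0.7:50010   ESTABLISHED\n"
            + "tcp   0      0      10.0.0.5:41001  10.0.0.7:50010   CLOSE_WAIT\n"
            + "tcp   0      0      10.0.0.5:41002  10.0.0.8:50010   CLOSE_WAIT\n"
            + "tcp   0      0      10.0.0.5:41003  10.0.0.9:54310   ESTABLISHED\n";
        System.out.println(countByState(sample, 50010));
    }
}
```

A count that keeps growing in a single state between runs is the kind of evidence that is useful to attach to the JIRA issue.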

I filed a new issue for your bug report: HDFS-5493.  Nicholas pointed out a
spot in the DFSClient code where we may have a socket leak.

https://issues.apache.org/jira/browse/HDFS-5493
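
Separately from whatever fix HDFS-5493 ends up with, the caller can make sure every stream it opens is closed. The snippet quoted further down does not show where the input stream is closed; assuming it is not closed elsewhere, a pattern along these lines may help. This is a plain-Java sketch, not the DFSClient fix: TrackedStream is a hypothetical stand-in for the HDFS input stream, used only to show that closing the Scanner also closes its source.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Scanner;

public class CloseAfterRead {

    /** Hypothetical stand-in for an HDFS stream; records whether close() was called. */
    public static class TrackedStream extends FilterInputStream {
        public boolean closed = false;

        public TrackedStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Consumes every token, then closes the scanner and, through it, the stream. */
    public static int countTokens(InputStream file) {
        Scanner scanner = new Scanner(file, "UTF-8");
        try {
            int tokens = 0;
            while (scanner.hasNext()) { // hasNext() returns a boolean
                scanner.next();         // the indexing work would go here
                tokens++;
            }
            return tokens;
        } finally {
            scanner.close(); // Scanner.close() also closes a Closeable source
        }
    }

    public static void main(String[] args) {
        TrackedStream in = new TrackedStream(new ByteArrayInputStream("a b c".getBytes()));
        System.out.println(countTokens(in) + " tokens, closed=" + in.closed);
    }
}
```

The try/finally form is used instead of try-with-resources so that the sketch also compiles on the older Java versions commonly paired with branch-1.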

Chris Nauroth
Hortonworks
http://hortonworks.com/

On Thu, Nov 7, 2013 at 12:34 AM, Divya R <[EMAIL PROTECTED]> wrote:

> Hi Chris,
>
>   Thanks a lot for the help. After a lot of investigation, I found that
> the issue was with the cached socket connections, which Nicholas had raised
> as a bug. The bug details are as follows:
>
> HDFS-3373 <https://issues.apache.org/jira/browse/HDFS-3373> FileContext
> HDFS implementation can leak socket caches
>
> When I executed the command netstat -a | grep 50010, the count was
> approximately 52000. This issue is fixed in
> 0.20.3 <https://issues.apache.org/jira/browse/HDFS/fixforversion/12314814> and
> 0.20.205.0 <https://issues.apache.org/jira/browse/HDFS/fixforversion/12316392>,
> but the fix is not present in hadoop-1.2.X. Could you please guide me as to
> what I could do?
>
> -Divya
>
>
> On Sat, Oct 26, 2013 at 12:38 AM, Chris Nauroth <[EMAIL PROTECTED]> wrote:
>
> > Hi Divya,
> >
> > The exceptions indicate that the HDFS client failed to establish a network
> > connection to a datanode hosting a block that the client is trying to read.
> > After too many of these failures (default 3, but configurable), the HDFS
> > client aborts the read and this bubbles up to the caller with the "could
> > not obtain block" error.
> >
> > I recommend troubleshooting this as a network connectivity issue.  This
> > wiki page includes a few tips as a starting point:
> >
> > http://wiki.apache.org/hadoop/TroubleShooting
> >
> > Hope this helps,
> >
> > Chris Nauroth
> > Hortonworks
> > http://hortonworks.com/
> >
> >
> >
> > On Fri, Oct 25, 2013 at 4:53 AM, Divya R <[EMAIL PROTECTED]> wrote:
> >
> > > Hi Guys,
> > >
> > >    I'm indexing data (~50-100 GB per day) from Hadoop. Hadoop is running in
> > > cluster mode (currently with 2 DataNodes). Every two or three hours I get
> > > this exception, even though both DataNodes are up and running. Can anyone
> > > please guide me as to what I should do, or whether I'm doing something
> > > wrong?
> > >
> > > Code Snippet:
> > > import java.io.IOException;
> > > import java.util.Scanner;
> > > import org.apache.hadoop.conf.Configuration;
> > > import org.apache.hadoop.fs.FSDataInputStream;
> > > import org.apache.hadoop.fs.FileSystem;
> > >
> > > public InitHadoop() {
> > >     configuration = new Configuration();
> > >     // Is this right to specify the namenode IP?
> > >     configuration.set("fs.default.name", "hdfs://<<namenode IP>>:54310");
> > >     configuration.set("mapred.job.tracker", "hdfs://<<namenode IP>>:54311");
> > >
> > >     try {
> > >         fileSystem = FileSystem.get(configuration);
> > >     } catch (IOException e) {
> > >         e.printStackTrace();
> > >     }
> > > }
> > >
> > > private void indexDocument(FSDataInputStream file) {
> > >     Scanner scanner = new Scanner(file);
> > >     // hasNext() returns a boolean, so it cannot be compared with null
> > >     while (scanner.hasNext()) {
> > >         // Indexing code
> > >     }
> > > }
> > > } // end of the enclosing class (declaration not shown)
> > >
> > > Logs:
> > >
> > > > 2013-10-25 09:37:57 WARN  DFSClient:2266 - Failed to connect to /<<IP>>:50010, add to deadNodes and continuejava.net.BindException: Cannot assign requested address
> > > > 2013-10-25 09:37:57 WARN  DFSClient:2266 - Failed to connect to /<<IP>>:50010, add to deadNodes and continuejava.net.BindException: Cannot assign requested address
> > > > 2013-10-25 09:37:57 INFO  DFSClient:2432 - Could not obtain block blk_-8795538519317154213_432897 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...
> > > 2013-10-25 09:37:58 WARN  DFSClient:2266 - Failed to connect to
