RE: scanner lease expired/region server shutdown
This was on a 2.6.20 kernel.

Nothing in /var/log/messages around that time either, so we may not have hit any system-wide limits. Perhaps the DataNode process hit its per-process limit (64k) under some corner-case/leak scenario.

I tried to reproduce it, but this time the lsof count for the DataNode and RegionServer processes stayed within 300 each under a similar workload, so I haven't been able to reproduce it.
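For anyone wanting to track the same thing, a rough sketch of what I mean by the lsof count (the pgrep patterns are only illustrative; adjust them to however the daemons show up on your boxes):

    # locate the DataNode and RegionServer pids (patterns are illustrative)
    DN_PID=$(pgrep -f org.apache.hadoop.hdfs.server.datanode.DataNode)
    RS_PID=$(pgrep -f org.apache.hadoop.hbase.regionserver.HRegionServer)

    # count open descriptors per process
    ls /proc/$DN_PID/fd | wc -l
    ls /proc/$RS_PID/fd | wc -l

    # or via lsof, which also shows what each descriptor points at
    lsof -p $DN_PID | wc -l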

regards,
Kannan
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Jean-Daniel Cryans
Sent: Tuesday, January 26, 2010 1:54 PM
To: [EMAIL PROTECTED]
Subject: Re: scanner lease expired/region server shutdown

One other source of "Too many open files" is if you have an old 2.6.27 kernel:
http://pero.blogs.aprilmayjune.org/2009/01/22/hadoop-and-linux-kernel-2627-epoll-limits/
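If that's what's biting you, the fix is to raise the per-user epoll instance limit, which IIRC defaults to only 128 on those kernels; roughly (4096 is just an example value):

    # check the current limit (the knob only exists on 2.6.27+ kernels)
    cat /proc/sys/fs/epoll/max_user_instances

    # raise it persistently, then reload
    echo "fs.epoll.max_user_instances = 4096" >> /etc/sysctl.conf
    sysctl -p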

J-D

On Tue, Jan 26, 2010 at 12:58 PM, Kannan Muthukkaruppan
<[EMAIL PROTECTED]> wrote:
> Dhruba: yes, the "Too many open files" exception is getting reported by the DN process. The same node is also running an HBase region server.
>
> And yes, I confirmed that the xcievers setting is 2048.
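(For context, that's the datanode property in hdfs-site.xml; a typical entry, with the 2048 in question here, looks roughly like the following. Note the property name really is spelled "xcievers", and changing it needs a datanode restart.)

    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>2048</value>
    </property>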
>
> Regards,
> Kannan
> -----Original Message-----
> From: Dhruba Borthakur [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, January 26, 2010 10:10 AM
> To: [EMAIL PROTECTED]
> Subject: Re: scanner lease expired/region server shutdown
>
> This exception is from the DataNode, right? That would mean the datanode
> process has 32K files open simultaneously; how can that be? For each block
> read/write, the datanode has two open files: one for the data and one for the
> .meta file where the CRC gets stored.
>
> On the other hand, the datanode is configured via dfs.datanode.max.xcievers
> to support 2048 read/write requests simultaneously, right?
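Back-of-envelope, if that cap is being respected, the xceiver path alone should only account for something like:

    2048 xceivers x 2 files (block + .meta) + ~1 client socket each  ~=  4K-6K fds

which is nowhere near 32K, let alone 64K. So if the fd limit really is being hit, descriptors are presumably being held open by something outside the normal xceiver path.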
>
> thanks,
> dhruba
>
>
> On Tue, Jan 26, 2010 at 7:10 AM, Kannan Muthukkaruppan
> <[EMAIL PROTECTED]>wrote:
>
>>
>> Looking further up in the logs (about 20 minutes before the errors first
>> started happening), I noticed the following.
>>
>> btw, ulimit -a shows that I have "open files" set to 64k. Is that not
>> sufficient?
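(One caveat, as a sketch: the ulimit reported by an interactive shell doesn't necessarily match what the running daemon got, e.g. if it was started from init or as a different user. On 2.6.24+ kernels you can read the limit straight from the process; on 2.6.20 you'd have to check the ulimit in whatever environment actually starts the daemon.)

    # limit as seen by the running DataNode itself (needs 2.6.24+ for /proc/<pid>/limits)
    cat /proc/$(pgrep -f datanode.DataNode)/limits | grep "open files"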
>>
>> 2010-01-25 11:10:21,774 WARN
>> org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
>> 10.129.68.212:50010,
>> storageID=DS-1418567969-10.129.68.212-50010-1263610251776, infoPort=50075,
>> ipcPort=50020):Data\
>> XceiveServer: java.io.IOException: Too many open files
>>        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>>        at
>> sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:145)
>>        at
>> sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:84)
>>        at
>> org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:130)
>>         at java.lang.Thread.run(Thread.java:619)
>>
>> 2010-01-25 11:10:21,566 WARN
>> org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
>> 10.129.68.212:50010,
>> storageID=DS-1418567969-10.129.68.212-50010-1263610251776, infoPort=50075,
>> ipcPort=50020):Got \
>> exception while serving blk_3332344970774019423_10249 to /10.129.68.212:
>> java.io.FileNotFoundException:
>> /mnt/d1/HDFS-kannan1/current/subdir23/blk_3332344970774019423_10249.meta
>> (Too many open files)
>>        at java.io.FileInputStream.open(Native Method)
>>        at java.io.FileInputStream.<init>(FileInputStream.java:106)
>>        at
>> org.apache.hadoop.hdfs.server.datanode.FSDataset.getMetaDataInputStream(FSDataset.java:682)
>>        at
>> org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:97)
>>        at
>> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:172)
>>         at
>> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>>        at java.lang.Thread.run(Thread.java:619)
>>
>>
>>
>> ________________________________________
>> From: Kannan Muthukkaruppan [[EMAIL PROTECTED]]
>> Sent: Tuesday, January 26, 2010 7:01 AM
>> To: [EMAIL PROTECTED]
>> Subject: RE: scanner lease expired/region server shutdown