Hive, mail # user - Hive Server Leaking File Descriptors?


Re: Hive Server Leaking File Descriptors?
Bennie Schut 2010-02-18, 14:27
I've looked into it a bit more, and it seems to happen on "load data
inpath" statements such as:
LOAD DATA INPATH '/user/hive/warehouse/chatsessions_2010-02-01.csv' INTO
TABLE chatsessions_load;

It does not happen on queries like:
select count(1) from chatsessions_load;

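One way to confirm which statement leaks is to count the server's open descriptors before and after each query. The following is only a minimal sketch, assuming a Linux host where /proc is available; `count_open_fds` and `fd_delta` are hypothetical helper names, not part of Hive or Hadoop.

```python
import os


def count_open_fds(pid):
    """Return the number of open file descriptors for pid (Linux /proc only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_delta(pid, action):
    """Run action() and return how many descriptors pid gained (or lost)."""
    before = count_open_fds(pid)
    action()
    return count_open_fds(pid) - before
```

Here `action` would be whatever submits a single statement to the Hive server (for example one "load data inpath"), and `pid` the server's process id. A delta that stays positive across repeated runs would point at that statement.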
Bennie Schut wrote:
> That's 133k of data so perhaps this is enough?:
>
>  627:             1            112  org.apache.hadoop.fs.FileSystem$ClientFinalizer
>  723:             1             80  org.apache.hadoop.hdfs.DFSClient
>  882:             1             48  org.apache.hadoop.fs.LocalFileSystem
>  883:             2             48  org.apache.hadoop.fs.FileSystem$Cache$Key
>  934:             2             48  org.apache.hadoop.fs.FileSystem$Statistics
>  944:             1             48  org.apache.hadoop.hdfs.DistributedFileSystem
> 1082:             1             32  org.apache.hadoop.fs.RawLocalFileSystem
> 1643:             1             16  org.apache.hadoop.fs.FileSystem$Cache
>
> After a couple more queries:
>
>  671:             1            112  org.apache.hadoop.fs.FileSystem$ClientFinalizer
>  772:             1             80  org.apache.hadoop.hdfs.DFSClient
>  930:             1             48  org.apache.hadoop.fs.LocalFileSystem
>  931:             2             48  org.apache.hadoop.fs.FileSystem$Cache$Key
>  981:             2             48  org.apache.hadoop.fs.FileSystem$Statistics
>  991:             1             48  org.apache.hadoop.hdfs.DistributedFileSystem
> 1132:             1             32  org.apache.hadoop.fs.RawLocalFileSystem
> 1743:             1             16  org.apache.hadoop.fs.FileSystem$Cache
>
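Comparing two histograms by eye is error-prone, so a small script can help. This is a rough sketch, assuming the raw (unwrapped) text output of `jmap -histo` with rows of the form `rank: instances bytes class`; `parse_histo` and `grown` are hypothetical names.

```python
import re

# One histogram row: "  723:   1   80  org.apache.hadoop.hdfs.DFSClient"
ROW = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S+)")


def parse_histo(text):
    """Map class name -> instance count from `jmap -histo` text output."""
    counts = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            instances, _bytes, cls = m.groups()
            counts[cls] = int(instances)
    return counts


def grown(before, after):
    """Return {class: increase} for classes whose instance count went up."""
    old, new = parse_histo(before), parse_histo(after)
    return {c: n - old.get(c, 0) for c, n in new.items() if n > old.get(c, 0)}
```

For the two snapshots quoted above, the instance counts of the listed Hadoop classes are identical (still a single DFSClient), so `grown` would report nothing for them.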
> some more stuff
> cat jmap.txt | grep hadoop.fs
>  535:             8            192  org.apache.hadoop.fs.RawLocalFileSystem$TrackingFileInputStream
>  539:             8            192  org.apache.hadoop.fs.permission.FsAction
>  611:             8            128  org.apache.hadoop.fs.Path
>  671:             1            112  org.apache.hadoop.fs.FileSystem$ClientFinalizer
>  721:             4             96  org.apache.hadoop.fs.permission.FsPermission$2
>  760:             2             96  [Lorg.apache.hadoop.fs.permission.FsAction;
>  930:             1             48  org.apache.hadoop.fs.LocalFileSystem
>  931:             2             48  org.apache.hadoop.fs.FileSystem$Cache$Key
>  981:             2             48  org.apache.hadoop.fs.FileSystem$Statistics
> 1132:             1             32  org.apache.hadoop.fs.RawLocalFileSystem
> 1213:             1             24  org.apache.hadoop.fs.permission.FsPermission
> 1347:             1             16  org.apache.hadoop.fs.FileSystem$1
> 1413:             1             16  org.apache.hadoop.fs.ChecksumFileSystem$1
> 1523:             1             16  org.apache.hadoop.fs.permission.FsPermission$1
> 1702:             1             16  org.apache.hadoop.fs.BlockLocation$1
> 1743:             1             16  org.apache.hadoop.fs.FileSystem$Cache
>
>
>
> Currently:
> lsof | grep "50010 (ESTABLISHED)" | wc -l
> 453
>
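The `lsof | grep | wc -l` count above gives only a total. To see whether the connections pile up on particular hosts, the same output can be grouped by remote end. This is a sketch only, assuming lsof's usual `local->remote (STATE)` form in the last column and that 50010 is the DataNode data-transfer port on this cluster; `established_by_remote` is a hypothetical name.

```python
from collections import Counter


def established_by_remote(lsof_text, port=50010):
    """Count ESTABLISHED TCP connections to `port`, grouped by remote host."""
    suffix = f":{port} (ESTABLISHED)"
    hosts = Counter()
    for line in lsof_text.splitlines():
        line = line.rstrip()
        if line.endswith(suffix) and "->" in line:
            # Keep what sits between "->" and ":<port> (ESTABLISHED)".
            remote = line.rsplit("->", 1)[1][: -len(suffix)]
            hosts[remote] += 1
    return hosts
```

`sum(established_by_remote(text).values())` should match the `wc -l` figure, and sampling it periodically would show how fast the count grows.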
> Zheng Shao wrote:
>  
>> Thanks for the quick reply. It seems the number of threads is normal.
>>
>> Can you run "jmap -histo:live" as well to find out the number of
>> DFSClient, FileSystem, etc. instances?
>>
>>
>> On 2/16/10, Bennie Schut <[EMAIL PROTECTED]> wrote:
>>
>>    
>>> jstack on the Hive process:
>>>
>>> 2010-02-16 09:30:47
>>> Full thread dump Java HotSpot(TM) 64-Bit Server VM (14.2-b01 mixed mode):
>>>
>>> "Attach Listener" daemon prio=10 tid=0x00007fbe9527f800 nid=0x4a48 waiting on condition [0x0000000000000000]
>>>    java.lang.Thread.State: RUNNABLE
>>>
>>> "pool-1-thread-5" prio=10 tid=0x00007fbe95efd000 nid=0x6bab waiting on condition [0x000000004238d000]
>>>    java.lang.Thread.State: WAITING (parking)
>>>         at sun.misc.Unsafe.$$YJP$$park(Native Method)
>>>         - parking to wait for  <0x00007fbea529a0d8> (a java.util.concurrent.SynchronousQueue$TransferStack)
>>>         at