Jobtracker memory issues due to FileSystem$Cache
Marcin Mejran 2013-04-16, 17:47
We've recently run into JobTracker memory issues on our new Hadoop cluster. A heap dump shows thousands of DistributedFileSystem instances held in FileSystem$Cache, slightly more than one for each job run on the cluster, and their JobConf objects support this view. I believe these instances are created when the .staging directories get cleaned up, but I may be wrong about that.
From what I can tell in the dump, the username (probably not the UGI itself, hard to tell), scheme and authority parts of the Cache$Key are the same across multiple entries in FileSystem$Cache. I can only assume that the UserGroupInformation piece differs somehow each time it is created.
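To make that guess concrete, here is a minimal sketch of the mechanism I suspect. It assumes the key's equality depends on the identity of the user-info object rather than on the username string. The classes UserInfo and Key are hypothetical stand-ins written for illustration, not Hadoop's actual code, and the "mapred" user and "namenode:8020" authority are made-up values.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class CacheKeySketch {

    /** Hypothetical stand-in for a per-user identity object. */
    static final class UserInfo {
        final String userName;

        UserInfo(String userName) {
            this.userName = userName;
        }
        // Deliberately no equals/hashCode override: two instances with the
        // same userName are still different objects and compare unequal.
    }

    /** Hypothetical stand-in for a cache key of (scheme, authority, user). */
    static final class Key {
        final String scheme;
        final String authority;
        final UserInfo user;

        Key(String scheme, String authority, UserInfo user) {
            this.scheme = scheme;
            this.authority = authority;
            this.user = user;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            // The user comparison falls back to object identity (assumption).
            return scheme.equals(k.scheme)
                    && authority.equals(k.authority)
                    && user.equals(k.user);
        }

        @Override
        public int hashCode() {
            return Objects.hash(scheme, authority, user);
        }
    }

    /** Simulates one cache lookup per job and returns the final cache size. */
    static int cacheSizeAfterJobs(int jobs, boolean reuseUserInfo) {
        Map<Key, Object> cache = new HashMap<>();
        UserInfo shared = new UserInfo("mapred");
        for (int i = 0; i < jobs; i++) {
            // Either reuse one user object or build a fresh one for each job.
            UserInfo user = reuseUserInfo ? shared : new UserInfo("mapred");
            Key key = new Key("hdfs", "namenode:8020", user);
            if (!cache.containsKey(key)) {
                cache.put(key, new Object()); // placeholder for a file system instance
            }
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("fresh user info per job: " + cacheSizeAfterJobs(1000, false));
        System.out.println("shared user info: " + cacheSizeAfterJobs(1000, true));
    }
}
```

Under that assumption, a fresh user object per job yields one cache entry per job even though the username, scheme and authority are identical, while a shared user object yields a single entry. That would match the growth pattern in the heap dump.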
We're using CDH4.2, MR1, CentOS 6.3 and Java 1.6.0_31. Kerberos, LDAP and so on are not enabled.
Is there any known reason for this type of behavior?