Benoit Perroud 2012-09-05, 07:28
Re: DFSOutputStream.Packet retention even if close() called when IOE encountered
Sorry for the slow response on this.
The attachment does not seem to have come through. Would you mind filing
an HDFS JIRA and attaching the reproducer case there?
On Wed, Sep 5, 2012 at 12:28 AM, Benoit Perroud <[EMAIL PROTECTED]> wrote:
> Hi All,
> I am seeing memory being retained while copying data into HDFS when an
> IOException is thrown.
> My use case is the following: I have multiple threads sharing a
> FileSystem object, all uploading files. At some point quota is
> exceeded in one thread and I get a DSQuotaExceededException (subclass
> of IOException). Both in the regular case and when such an exception is
> thrown, I close the DFSOutputStream.
> But for a DFSOutputStream that encountered an IOException, the last
> Packet is kept in memory until the FileSystem is closed, which I
> rarely do.
> So my questions:
> - Is this the expected behavior, and do I need to deal with it myself?
> - Is there a way to properly close a DFSOutputStream (and free all
> the retained memory) without closing the FileSystem?
> - Is sharing one FileSystem across several threads recommended?
> Attached is a simple test reproducing the behavior: a MiniDFSCluster is
> launched, and a very small quota is set so that an IOException is thrown.
> Random content is generated and uploaded to HDFS. The FileSystem is not
> closed, so memory grows until an OOM is thrown (don't blame me
> for the @Test(expected = OutOfMemoryError.class) :)). Tested on Hadoop
> Thanks in advance for your answers, pointers and advice.
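
For readers following the thread, the usage pattern described in the quoted message (several threads sharing one store object, each closing its stream both on success and when an IOException is thrown) can be sketched roughly as below. This is only an illustration built on java.io stand-ins; it is not the attached reproducer and does not use the Hadoop API. The names QuotaStore, upload and run are hypothetical, and the quota check is a simplified assumption about how a quota failure might surface during a write.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

class SharedUploadSketch {

    /** Hypothetical stand-in for a shared FileSystem with a byte quota. */
    static class QuotaStore {
        final AtomicLong remaining;                           // bytes left before the quota is hit
        final AtomicInteger openStreams = new AtomicInteger(); // streams created but not yet closed

        QuotaStore(long quotaBytes) {
            remaining = new AtomicLong(quotaBytes);
        }

        OutputStream create() {
            openStreams.incrementAndGet();
            return new OutputStream() {
                @Override
                public void write(int b) throws IOException {
                    // Assumption: exceeding the quota surfaces as an IOException on write.
                    if (remaining.decrementAndGet() < 0) {
                        throw new IOException("quota exceeded");
                    }
                }

                @Override
                public void close() {
                    openStreams.decrementAndGet();
                }
            };
        }
    }

    /** Writes one payload; the stream is closed on both the success and the failure path. */
    static boolean upload(QuotaStore store, byte[] payload) {
        try (OutputStream out = store.create()) {
            out.write(payload);
            return true;
        } catch (IOException e) {
            return false; // close() has already run by the time we get here
        }
    }

    /** Uploads `files` payloads from `threads` threads sharing one store; returns the failure count. */
    static int run(long quotaBytes, int threads, int files, int fileSize) {
        QuotaStore store = new QuotaStore(quotaBytes);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < files; i++) {
                Callable<Boolean> task = () -> upload(store, new byte[fileSize]);
                results.add(pool.submit(task));
            }
            int failed = 0;
            for (Future<Boolean> result : results) {
                if (!result.get()) {
                    failed++;
                }
            }
            if (store.openStreams.get() != 0) {
                throw new IllegalStateException("a stream was left open");
            }
            return failed;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling SharedUploadSketch.run(0L, 4, 8, 100) makes every upload fail, while a generous quota lets all of them succeed; in both cases every stream is closed. The question in the quoted message is whether closing the stream is also enough to release its buffered data in the real DFSOutputStream, which this stand-in cannot answer.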
Software Engineer, Cloudera