HDFS >> mail # user >> Re: Is there any way to partially process HDFS edits?


Tom Brown 2013-09-25, 21:30
Re: Is there any way to partially process HDFS edits?
Just try doing a manual checkpoint.

Thanks & Regards,
Shashwat Shriparv
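A manual checkpoint can be forced from the admin CLI; a minimal sketch, assuming admin access to a running NameNode and a release where these `hdfs dfsadmin` options exist (on Hadoop 1.x the command is `hadoop dfsadmin`):

```shell
# Force a checkpoint: save the in-memory namespace to a fresh fsimage.
# Safe mode blocks namespace changes while the image is written.
hdfs dfsadmin -safemode enter
hdfs dfsadmin -saveNamespace    # writes a new fsimage, resets the edits log
hdfs dfsadmin -safemode leave
```

Note this only helps once the NameNode has finished loading the existing edits; it does not shorten the initial replay.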

On Thu, Sep 26, 2013 at 5:35 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Hi Tom,
>
> The edits are processed sequentially, and aren't all held in memory.
> Right now there's no mid-way checkpoint during loading, so it cannot
> resume with only the remaining work if interrupted. Normally this
> is not a problem in deployments given that SNN or SBN runs for
> checkpointing the images and keeping the edits collection small
> periodically.
>
> If your NameNode is running out of memory _applying_ the edits, then
> the cause is not the edits but a growing namespace. You most likely
> have more files now than before, and that's going to take up permanent
> memory from the NameNode heap size.
>
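Harsh's point about the namespace consuming permanent heap can be rough-sized; a back-of-envelope sketch, where the ~150 bytes per namespace object is a commonly cited rule of thumb (an assumption, not a figure from this thread) and the object count is purely illustrative:

```shell
# Estimate NameNode heap consumed by namespace objects (files, dirs, blocks).
# ~150 bytes/object is an assumed rule of thumb; it varies by Hadoop version.
objects=20000000             # illustrative: 20M files + blocks
bytes_per_object=150
echo "$(( objects * bytes_per_object / 1024 / 1024 )) MB"   # prints "2861 MB"
```

On this rough basis, tens of millions of objects will not fit comfortably in a 2GB heap regardless of how the edits are replayed.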
> On Thu, Sep 26, 2013 at 3:00 AM, Tom Brown <[EMAIL PROTECTED]> wrote:
> > Unfortunately, I cannot give it that much RAM. The machine has 4GB total
> > (though could be expanded somewhat-- it's a VM).
> >
> > Though if each edit is processed sequentially (in a streaming form), the
> > entire edits file will never be in RAM at once.
> >
> > Is the edits file format well defined (could I break off 100MB chunks and
> > process them individually to achieve the same result as processing the
> > whole thing at once)?
> >
> > --Tom
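For what it's worth, the edits format can be inspected offline with the Offline Edits Viewer (`hdfs oev`) in Hadoop 2.x releases; a sketch, with a hypothetical input path:

```shell
# Dump the binary edits log to XML for inspection (this does not apply it).
# The -i path is illustrative; point it at a file under dfs.name.dir/current.
hdfs oev -i /data/dfs/name/current/edits -o /tmp/edits.xml -p xml
```

This lets you see what the edits contain, though it does not by itself provide a safe way to apply them in chunks.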
> >
> >
> > On Wed, Sep 25, 2013 at 1:53 PM, Ravi Prakash <[EMAIL PROTECTED]> wrote:
> >>
> >> Tom! I would guess that just giving the NN JVM lots of memory (64GB /
> >> 96GB) should be the easiest way.
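Raising the heap along Ravi's lines is a configuration change; a sketch for hadoop-env.sh, where the 64g figure follows his suggestion and must fit within the host's physical RAM:

```shell
# In hadoop-env.sh: give the NameNode JVM a larger fixed heap.
# -Xms == -Xmx avoids heap resizing during the long edits replay.
export HADOOP_NAMENODE_OPTS="-Xms64g -Xmx64g ${HADOOP_NAMENODE_OPTS}"
```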
> >>
> >>
> >> ________________________________
> >> From: Tom Brown <[EMAIL PROTECTED]>
> >> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> >> Sent: Wednesday, September 25, 2013 11:29 AM
> >> Subject: Is there any way to partially process HDFS edits?
> >>
> >> I have an edits file on my namenode that is 35GB. This is quite a bit
> >> larger than it should be (the secondary namenode wasn't running for some
> >> time, and HBASE-9648 caused a huge number of additional edits).
> >>
> >> The first time I tried to start the namenode, it chewed on the edits for
> >> about 4 hours and then ran out of memory. I have increased the memory
> >> available to the namenode (was 512MB, now 2GB), and started the process
> >> again.
> >>
> >> Is there any way that the edits file can be partially processed to avoid
> >> having to re-process the same edits over and over until I can allocate
> >> enough memory for it to be done in one shot?
> >>
> >> How long should it take (hours? days?) to process an edits file of that
> >> size?
> >>
> >> Any help is appreciated!
> >>
> >> --Tom
> >>
> >>
> >
>
>
>
> --
> Harsh J
>
Tom Brown 2013-09-26, 15:56