HDFS, mail # user - Re: Is there any way to partially process HDFS edits?


Re: Is there any way to partially process HDFS edits?
Tom Brown 2013-09-25, 21:30
Unfortunately, I cannot give it that much RAM. The machine has 4GB total
(though it could be expanded somewhat -- it's a VM).

Though if each edit is processed sequentially (in a streaming form), the
entire edits file will never be in RAM at once.

Is the edits file format well defined (could I break off 100MB chunks and
process them individually to achieve the same result as processing the
whole thing at once)?
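[The edits file format can in fact be examined offline with Hadoop's Offline Edits Viewer (`hdfs oev`), which streams the binary log into a readable form without starting a NameNode. A minimal sketch, assuming a Hadoop release that ships `hdfs oev`; the input path is only an example and should be adjusted to where this NameNode keeps its edits:]

```shell
# Dump the binary edits log to XML with the Offline Edits Viewer.
# -i: input edits file, -o: output file, -p: processor (xml, binary, stats).
# The input path below is illustrative -- use your NameNode's actual
# dfs.name.dir/current location.
hdfs oev -i /var/hadoop/dfs/name/current/edits -o /tmp/edits.xml -p xml

# The "stats" processor just counts opcodes, which is a cheap way to see
# what a 35GB edits file actually contains before committing to a replay:
hdfs oev -i /var/hadoop/dfs/name/current/edits -o /tmp/edits.stats -p stats
```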

--Tom
On Wed, Sep 25, 2013 at 1:53 PM, Ravi Prakash <[EMAIL PROTECTED]> wrote:

> Tom! I would guess that just giving the NN JVM lots of memory (64GB /
> 96GB) would be the easiest way.
>
>
>   ------------------------------
>  *From:* Tom Brown <[EMAIL PROTECTED]>
> *To:* "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> *Sent:* Wednesday, September 25, 2013 11:29 AM
> *Subject:* Is there any way to partially process HDFS edits?
>
> I have an edits file on my namenode that is 35GB. This is quite a bit
> larger than it should be (the secondary namenode wasn't running for some
> time, and HBASE-9648 caused a huge number of additional edits).
>
> The first time I tried to start the namenode, it chewed on the edits for
> about 4 hours and then ran out of memory. I have increased the memory
> available to the namenode (was 512MB, now 2GB), and started the process
> again.
>
> Is there any way that the edits file can be partially processed to avoid
> having to re-process the same edits over and over until I can allocate
> enough memory for it to be done in one shot?
>
> How long should it take (hours? days?) to process an edits file of that
> size?
>
> Any help is appreciated!
>
> --Tom
>
>
>
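[For readers landing on this thread: the NameNode heap suggested above is normally raised via HADOOP_NAMENODE_OPTS in hadoop-env.sh, and once the NameNode has finished replaying the log, the merged state can be persisted so the same edits are never reprocessed on a later restart. A sketch assuming standard Hadoop conventions; the 16g heap value is an example only:]

```shell
# In $HADOOP_CONF_DIR/hadoop-env.sh: give the NameNode JVM a larger heap.
# (Example value only -- size it to the RAM actually available.)
export HADOOP_NAMENODE_OPTS="-Xmx16g ${HADOOP_NAMENODE_OPTS}"

# After the NameNode is up, force a checkpoint so the merged fsimage
# replaces the oversized edits log and it does not have to be replayed
# again on the next restart:
hdfs dfsadmin -safemode enter
hdfs dfsadmin -saveNamespace
hdfs dfsadmin -safemode leave
```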