HDFS >> mail # user >> Speeding up Data Deletion From Datanodes


Re: Speeding up Data Deletion From Datanodes
I am curious now...

If you have a 10-node cluster, what should the heartbeat interval be set
to? What about 100 nodes, or 1000?
I am also interested in tuning documentation. For example, how much
memory should we allocate to the JVM? How much memory for the namenode? etc.

On Thu, Jan 13, 2011 at 1:22 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
> Hi Sravan,
> You may want to consider backporting HDFS-611 (or using CDH3b3, which
> includes this backport, if you aren't in the mood to patch it yourself).
> -Todd
>
> On Thu, Jan 13, 2011 at 9:32 AM, sravankumar <[EMAIL PROTECTED]> wrote:
>>
>> Hi,
>>
>> I have gone through the file deletion flow and learned that the
>> Replication Monitor is responsible for file deletions, and that these
>> settings affect block deletion:
>>
>> INVALIDATE_WORK_PCT_PER_ITERATION
>> BLOCK_INVALIDATE_CHUNK
>>
>> Can anyone suggest how to tune these settings to speed up block
>> deletion, and explain the significance of the
>> INVALIDATE_WORK_PCT_PER_ITERATION constant, which is 32 by default?
>>
>> Also, can we tune the heartbeat interval based on the cluster size?
>> Suppose it is a 10-node cluster: can someone suggest how to tune the
>> configuration? Is there any documentation on tuning the configuration
>> based on cluster usage?
>>
>> Thanks & Regards,
>> Sravan kumar.
>>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
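
For readers wondering what the two constants in the original question might mean in practice, here is a rough back-of-the-envelope sketch. It assumes (this is an interpretation, not something the thread confirms) that INVALIDATE_WORK_PCT_PER_ITERATION is the percentage of live datanodes that receive deletion work in one Replication Monitor pass, and that BLOCK_INVALIDATE_CHUNK caps the number of blocks sent to each of those datanodes per pass. The value 32 comes from the question above; the value 100 for BLOCK_INVALIDATE_CHUNK and all function names are illustrative assumptions, so check FSNamesystem in your Hadoop version before relying on any of this.

```python
# Simplified model of how the two constants might bound deletion work.
# The interpretation is an assumption; verify it against your Hadoop source.

INVALIDATE_WORK_PCT_PER_ITERATION = 32  # default quoted in the thread
BLOCK_INVALIDATE_CHUNK = 100            # assumed value, not given in the thread


def nodes_per_iteration(live_datanodes, pct=INVALIDATE_WORK_PCT_PER_ITERATION):
    """Datanodes given deletion work in one pass (pct percent, rounded up)."""
    return -(-live_datanodes * pct // 100)  # integer ceiling division


def max_blocks_per_iteration(live_datanodes,
                             pct=INVALIDATE_WORK_PCT_PER_ITERATION,
                             chunk=BLOCK_INVALIDATE_CHUNK):
    """Upper bound on blocks scheduled for deletion in one pass."""
    return nodes_per_iteration(live_datanodes, pct) * chunk


def estimated_passes(total_blocks, live_datanodes,
                     pct=INVALIDATE_WORK_PCT_PER_ITERATION,
                     chunk=BLOCK_INVALIDATE_CHUNK):
    """Best-case number of passes needed to schedule total_blocks deletions.

    Ignores how blocks are spread across datanodes, so real clusters
    will usually need more passes than this.
    """
    per_pass = max_blocks_per_iteration(live_datanodes, pct, chunk)
    return -(-total_blocks // per_pass)


if __name__ == "__main__":
    for cluster_size in (10, 100, 1000):
        print(cluster_size,
              nodes_per_iteration(cluster_size),
              max_blocks_per_iteration(cluster_size))
```

Under these assumed values, a 10-node cluster would have 4 datanodes scheduled per pass and at most 400 blocks queued for deletion per pass. Raising either constant raises that ceiling, but the model says nothing about how quickly datanodes can actually delete the blocks, which is the part HDFS-611 addresses.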