HDFS >> mail # user >> Local block placement policy, request


Re: Local block placement policy, request
Todd, thanks!

> In general, though, keep in mind that, whenever you write data, you'll
> get a local copy first, if the writer is in the cluster. That's how
> HBase gets locality for most of its accesses

Right.  However, in the failover scenario where a node goes down
(hardware failure, or the failure of one of its processes, such as the
DataNode or RegionServer), I think the new RS will not have local data.
We could first request that all necessary HDFS files be made local
before the new RS becomes available.  At least for search to work, this
is a requirement.
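
To make the locality concern concrete, here is a minimal sketch (plain Java, no Hadoop dependency) of how one might measure what fraction of a file's blocks have a replica on a given host. `LocalityCheck` and `localFraction` are hypothetical names, not HDFS or HBase APIs, and the block-to-host map is hard-coded. In a real cluster that map could presumably be built from `FileSystem.getFileBlockLocations` and `BlockLocation.getHosts()`, but that wiring is an assumption and is not shown.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of HDFS or HBase. */
final class LocalityCheck {
    private LocalityCheck() {}

    /**
     * Returns the fraction of blocks that have at least one replica on
     * the given host. blockHosts maps a block id to the hosts that hold
     * a replica of that block.
     */
    static double localFraction(Map<String, List<String>> blockHosts, String host) {
        if (blockHosts.isEmpty()) {
            return 1.0; // no blocks to read, so nothing is remote
        }
        long local = blockHosts.values().stream()
                .filter(hosts -> hosts.contains(host))
                .count();
        return (double) local / blockHosts.size();
    }

    public static void main(String[] args) {
        // Made-up block ids and host names, purely for the example.
        Map<String, List<String>> blocks = Map.of(
                "blk_1", List.of("dn1", "dn2", "dn3"),
                "blk_2", List.of("dn2", "dn3", "dn4"));
        System.out.println(localFraction(blocks, "dn2")); // both blocks local
        System.out.println(localFraction(blocks, "dn1")); // one of two blocks local
    }
}
```

A RegionServer that has just taken over regions from a failed node would be expected to score low on a check like this until the data is rewritten or moved to it.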

> There are some non-public APIs to do this -- have a look at how the
> Balancer works - the dispatch() function is the guts you're looking
> for. It might be nice to expose this functionality as a "limited
> private evolving" API

Perhaps simply mark them as 'expert' or make them package private?
I'll work on a patch.

On Thu, May 26, 2011 at 11:40 AM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
> Hey Jason,
>
> There are some non-public APIs to do this -- have a look at how the
> Balancer works - the dispatch() function is the guts you're looking
> for. It might be nice to expose this functionality as a "limited
> private evolving" API.
>
> In general, though, keep in mind that, whenever you write data, you'll
> get a local copy first, if the writer is in the cluster. That's how
> HBase gets locality for most of its accesses.
>
> -Todd
>
> On Thu, May 26, 2011 at 11:36 AM, Jason Rutherglen
> <[EMAIL PROTECTED]> wrote:
>> Is there a way to send a request to the name node to replicate
>> block(s) to a specific DataNode?  If not, what would be a way to do
>> this?  -Thanks
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>