HDFS >> mail # user >> Local block placement policy, request


Re: Local block placement policy, request
Todd, thanks!

> In general, though, keep in mind that, whenever you write data, you'll
> get a local copy first, if the writer is in the cluster. That's how
> HBase gets locality for most of its accesses

Right.  However, in a failover scenario where a node goes down (a
hardware failure, or a crash of one of its processes, such as the
DataNode or RegionServer), I think the new RS will not have local
data?  We could first request that all of the necessary HDFS blocks be
replicated locally before the new RS is brought online.  For search,
at least, this is a requirement.
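To make the bookkeeping concrete, here is a minimal, Hadoop-free sketch
(all names hypothetical; in real code the replica locations would come
from the public FileSystem.getFileBlockLocations API, and the copy itself
from the non-public replication calls discussed below) of finding which
blocks still need a local replica before the new RS comes up:

```java
import java.util.*;

// Hypothetical sketch: given each block's current replica hosts, find the
// blocks that have no replica on the node about to host the new RegionServer.
public class LocalityCheck {
    static List<String> blocksMissingOn(Map<String, Set<String>> blockHosts,
                                        String targetHost) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : blockHosts.entrySet()) {
            if (!e.getValue().contains(targetHost)) {
                missing.add(e.getKey()); // would ask the NN to copy this block
            }
        }
        Collections.sort(missing);
        return missing;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> blockHosts = new HashMap<>();
        blockHosts.put("blk_1", new HashSet<>(Arrays.asList("dn1", "dn2")));
        blockHosts.put("blk_2", new HashSet<>(Arrays.asList("dn2", "dn3")));
        blockHosts.put("blk_3", new HashSet<>(Arrays.asList("dn1", "dn3")));
        // dn3 is where the replacement RegionServer will start
        System.out.println(blocksMissingOn(blockHosts, "dn3")); // [blk_1]
    }
}
```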

> There are some non-public APIs to do this -- have a look at how the
> Balancer works - the dispatch() function is the guts you're looking
> for. It might be nice to expose this functionality as a "limited
> private evolving" API

Perhaps simply mark them as 'expert' or make them package private?
I'll work on a patch.
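For discussion's sake, the exposed call might look roughly like the
following (an entirely hypothetical shape, not the actual Balancer
internals, with a toy in-memory stand-in for the NameNode side just to
show the contract):

```java
import java.util.*;

// Hypothetical "limited private evolving" API shape for requesting that a
// block gain a replica on a specific DataNode; all names are illustrative.
interface BlockMover {
    /** Request a replica of blockId on targetDatanode; returns false if
     *  the target already holds (or is already scheduled to hold) it. */
    boolean replicateTo(String blockId, String targetDatanode);
}

// Toy in-memory stand-in for the NameNode side of the contract.
public class InMemoryBlockMover implements BlockMover {
    private final Map<String, Set<String>> replicas = new HashMap<>();

    public void addReplica(String blockId, String datanode) {
        replicas.computeIfAbsent(blockId, k -> new HashSet<>()).add(datanode);
    }

    @Override
    public boolean replicateTo(String blockId, String targetDatanode) {
        Set<String> hosts =
            replicas.computeIfAbsent(blockId, k -> new HashSet<>());
        return hosts.add(targetDatanode); // real code would schedule a transfer
    }

    public Set<String> hostsOf(String blockId) {
        return replicas.getOrDefault(blockId, Collections.emptySet());
    }

    public static void main(String[] args) {
        InMemoryBlockMover nn = new InMemoryBlockMover();
        nn.addReplica("blk_1", "dn1");
        System.out.println(nn.replicateTo("blk_1", "dn3")); // true: scheduled
        System.out.println(nn.replicateTo("blk_1", "dn3")); // false: already there
    }
}
```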

On Thu, May 26, 2011 at 11:40 AM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
> Hey Jason,
>
> There are some non-public APIs to do this -- have a look at how the
> Balancer works - the dispatch() function is the guts you're looking
> for. It might be nice to expose this functionality as a "limited
> private evolving" API.
>
> In general, though, keep in mind that, whenever you write data, you'll
> get a local copy first, if the writer is in the cluster. That's how
> HBase gets locality for most of its accesses.
>
> -Todd
>
> On Thu, May 26, 2011 at 11:36 AM, Jason Rutherglen
> <[EMAIL PROTECTED]> wrote:
>> Is there a way to send a request to the name node to replicate
>> block(s) to a specific DataNode?  If not, what would be a way to do
>> this?  -Thanks
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>