Re: APIs to move data blocks within HDFS
Karthiek C 2013-02-22, 21:44
Thank you Harsh and Chris. This really helps!
On Fri, Feb 22, 2013 at 2:46 PM, Chris Nauroth <[EMAIL PROTECTED]> wrote:
> Regarding your question about a pluggable module to control placement of
> data, try taking a look at the abstract class BlockPlacementPolicy and
> BlockPlacementPolicyDefault, which is its default implementation.
> On branch-1, you can find these classes
> at src/hdfs/org/apache/hadoop/hdfs/server/namenode. On trunk, the package
> structure is different, and these classes are
> Best of luck with your research!
> On Fri, Feb 22, 2013 at 11:17 AM, Harsh J <[EMAIL PROTECTED]> wrote:
> > There are no filesystem-level (i.e. client) APIs to do this, but the
> > Balancer tool of HDFS does exactly this. Reading its sources should
> > let you understand what kind of calls you need to make to reuse the
> > balancer protocol and achieve what you need.
> > In trunk, the balancer is at
> > HTH, and feel free to ask any relevant follow up questions.
> > On Fri, Feb 22, 2013 at 11:43 PM, Karthiek C <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > > Are there any APIs to move data blocks in HDFS from one node to another
> > > *after* they have been added to HDFS? Also, can we write some sort of
> > > pluggable module (like a scheduler) that controls how data gets placed in
> > > a Hadoop cluster? I am working with hadoop-1.0.3 and I couldn't find
> > > any filesystem APIs available to do that.
> > >
> > > PS: I am working on a research project where we want to investigate how to
> > > optimally place data in Hadoop.
> > >
> > > Thanks,
> > > Karthiek
> > --
> > Harsh J
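
For anyone reading this thread later, here is a rough, self-contained sketch of the plug-in pattern Chris describes: an abstract policy class, a default implementation, and an alternative selected by class name at startup. This is NOT Hadoop's actual API. PlacementPolicySketch, FirstNodesPolicy, ReverseOrderPolicy and chooseTargets are hypothetical names made up for illustration, and the real BlockPlacementPolicy method signatures differ between branches, so check the sources for your version. As far as I know the NameNode picks its policy class from the dfs.block.replicator.classname setting, but please verify that against your release too.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an abstract placement policy.
// It only mirrors the general shape of a pluggable policy, not Hadoop's API.
abstract class PlacementPolicySketch {
    // Pick up to `replicas` target nodes from the live node list.
    abstract List<String> chooseTargets(int replicas, List<String> liveNodes);

    // Instantiate a policy from its class name, the way a config value would select it.
    static PlacementPolicySketch load(String className) {
        try {
            return Class.forName(className)
                    .asSubclass(PlacementPolicySketch.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load policy " + className, e);
        }
    }
}

// Default behaviour in this sketch: take the first N nodes in list order.
class FirstNodesPolicy extends PlacementPolicySketch {
    @Override
    List<String> chooseTargets(int replicas, List<String> liveNodes) {
        int n = Math.min(replicas, liveNodes.size());
        return new ArrayList<>(liveNodes.subList(0, n));
    }
}

// A "research" policy: take the last N nodes, most recent first.
class ReverseOrderPolicy extends PlacementPolicySketch {
    @Override
    List<String> chooseTargets(int replicas, List<String> liveNodes) {
        List<String> reversed = new ArrayList<>(liveNodes);
        Collections.reverse(reversed);
        int n = Math.min(replicas, reversed.size());
        return new ArrayList<>(reversed.subList(0, n));
    }
}

class PlacementDemo {
    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("dn1", "dn2", "dn3");
        // The class name would normally come from configuration.
        String configured = args.length > 0 ? args[0] : FirstNodesPolicy.class.getName();
        PlacementPolicySketch policy = PlacementPolicySketch.load(configured);
        System.out.println(policy.chooseTargets(2, nodes));
    }
}
```

The point of loading by class name is that experimenting with a different placement strategy only means writing another subclass and changing one configuration value, without touching the code that calls the policy.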