Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
HDFS, mail # user - Re: HDFS interfaces


Copy link to this message
-
Re: HDFS interfaces
Jay Vyas 2013-06-04, 06:38
Looking in the source, it appears that In HDFS, the Namenode supports
getting this info directly via the client, and ultimately communicates
block locations to the DFSClient , which is used by the
DistributedFileSystem.

  /**
   * @see ClientProtocol#getBlockLocations(String, long, long)
   */
  static LocatedBlocks callGetBlockLocations(ClientProtocol namenode,
      String src, long start, long length)
      throws IOException {
    try {
      return namenode.getBlockLocations(src, start, length);
    } catch(RemoteException re) {
      throw re.unwrapRemoteException(AccessControlException.class,
                                     FileNotFoundException.class,
                                     UnresolvedPathException.class);
    }
  }
On Tue, Jun 4, 2013 at 2:00 AM, Mahmood Naderan <[EMAIL PROTECTED]>wrote:

> There are many instances of getFileBlockLocations in hadoop/fs. Can you
> explain which one is the main?
> >It must be combined with a method of logically splitting the input data
> along block boundaries, and of launching tasks on worker nodes that >are
> close to the data splits
> Is this a user level task of system level task?
>
>
> Regards,
> Mahmood*
> *
>
>   ------------------------------
>  *From:* John Lilley <[EMAIL PROTECTED]>
> *To:* "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; Mahmood Naderan <
> [EMAIL PROTECTED]>
> *Sent:* Tuesday, June 4, 2013 3:28 AM
> *Subject:* RE: HDFS interfaces
>
>  Mahmood,
>
> It is the in the FileSystem interface.
> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FileSystem.html#getFileBlockLocations(org.apache.hadoop.fs.Path,
> long, long)<http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FileSystem.html#getFileBlockLocations%28org.apache.hadoop.fs.Path,%20long,%20long%29>
>
> This by itself is not sufficient for application programmers to make good
> use of data locality.  It must be combined with a method of logically
> splitting the input data along block boundaries, and of launching tasks on
> worker nodes that are close to the data splits.  MapReduce does both of
> these things internally along with the file-format input classes.  For an
> application to do so directly, see the new YARN-based interfaces
> ApplicationMaster and ResourceManager.  These are however very new and
> there is little documentation or examples.
>
> john
>
>  *From:* Mahmood Naderan [mailto:[EMAIL PROTECTED]]
> *Sent:* Monday, June 03, 2013 12:09 PM
> *To:* [EMAIL PROTECTED]
> *Subject:* HDFS interfaces
>
>  Hello,
>  It is stated in the "HDFS architecture guide" (
> https://hadoop.apache.org/docs/r1.0.4/hdfs_design.html) that
>
>  *HDFS provides interfaces for applications to move themselves closer to
> where the data is located. *
>
>  What are these interfaces and where they are in the source code? Is
> there any manual for the interfaces?
>
>   Regards,
> Mahmood
>
>
>
--
Jay Vyas
http://jayunit100.blogspot.com