HDFS >> mail # user >> Re: HDFS interfaces


Re: HDFS interfaces
Looking in the source, it appears that in HDFS the NameNode supports
getting this info directly via the client protocol, ultimately returning
block locations to the DFSClient, which is used by the
DistributedFileSystem.

  /**
   * @see ClientProtocol#getBlockLocations(String, long, long)
   */
  static LocatedBlocks callGetBlockLocations(ClientProtocol namenode,
      String src, long start, long length)
      throws IOException {
    try {
      return namenode.getBlockLocations(src, start, length);
    } catch(RemoteException re) {
      throw re.unwrapRemoteException(AccessControlException.class,
                                     FileNotFoundException.class,
                                     UnresolvedPathException.class);
    }
  }
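To sketch what that call's (src, start, length) contract means: with a fixed block size, a byte range maps onto the set of blocks that overlap it. The following is a minimal, self-contained illustration of that arithmetic, not Hadoop's actual implementation; the block size and offsets are made-up values:

```java
import java.util.ArrayList;
import java.util.List;

public class BlockLocator {
    // Return the indices of the fixed-size blocks that overlap the byte
    // range [start, start + length), mirroring the shape of
    // ClientProtocol#getBlockLocations(String, long, long).
    static List<Long> blocksOverlapping(long start, long length, long blockSize) {
        List<Long> blocks = new ArrayList<>();
        if (length <= 0) return blocks;
        long first = start / blockSize;               // block holding the first byte
        long last = (start + length - 1) / blockSize; // block holding the last byte
        for (long b = first; b <= last; b++) {
            blocks.add(b);
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Reading 300 bytes from offset 300 with 128-byte blocks
        // touches blocks 2, 3, and 4 (bytes 256..639 cover 300..599).
        System.out.println(blocksOverlapping(300, 300, 128));
    }
}
```

The real call additionally returns, per block, the DataNode locations that a locality-aware client can exploit.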
On Tue, Jun 4, 2013 at 2:00 AM, Mahmood Naderan <[EMAIL PROTECTED]> wrote:

> There are many instances of getFileBlockLocations in hadoop/fs. Can you
> explain which one is the main one?
> >It must be combined with a method of logically splitting the input data
> >along block boundaries, and of launching tasks on worker nodes that are
> >close to the data splits
> Is this a user-level task or a system-level task?
>
>
> Regards,
> Mahmood
>
>   ------------------------------
>  *From:* John Lilley <[EMAIL PROTECTED]>
> *To:* "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; Mahmood Naderan <
> [EMAIL PROTECTED]>
> *Sent:* Tuesday, June 4, 2013 3:28 AM
> *Subject:* RE: HDFS interfaces
>
>  Mahmood,
>
> It is in the FileSystem interface:
> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FileSystem.html#getFileBlockLocations(org.apache.hadoop.fs.Path,%20long,%20long)
>
> This by itself is not sufficient for application programmers to make good
> use of data locality.  It must be combined with a method of logically
> splitting the input data along block boundaries, and of launching tasks on
> worker nodes that are close to the data splits.  MapReduce does both of
> these things internally along with the file-format input classes.  For an
> application to do so directly, see the new YARN-based interfaces
> ApplicationMaster and ResourceManager.  These are, however, very new, and
> there is little documentation and there are few examples.
>
> john
>
>  *From:* Mahmood Naderan [mailto:[EMAIL PROTECTED]]
> *Sent:* Monday, June 03, 2013 12:09 PM
> *To:* [EMAIL PROTECTED]
> *Subject:* HDFS interfaces
>
>  Hello,
>  It is stated in the "HDFS architecture guide" (
> https://hadoop.apache.org/docs/r1.0.4/hdfs_design.html) that
>
>  *HDFS provides interfaces for applications to move themselves closer to
> where the data is located. *
>
>  What are these interfaces, and where are they in the source code? Is
> there any manual for the interfaces?
>
>   Regards,
> Mahmood
>
>
>
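The block-aligned splitting John describes (which MapReduce's input-format classes perform internally) can be sketched in plain Java. This is an illustrative simplification, not the Hadoop API: the Split record, host names, and sizes are invented here, and real input formats also handle records that straddle block boundaries:

```java
import java.util.ArrayList;
import java.util.List;

public class SplitPlanner {
    // One logical input split: a byte range plus the hosts storing the
    // block it falls in. These names are illustrative, not Hadoop's
    // InputSplit API.
    record Split(long offset, long length, List<String> hosts) {}

    // Divide a file of fileLen bytes into splits aligned to block
    // boundaries; blockHosts.get(i) lists the hosts storing block i.
    static List<Split> planSplits(long fileLen, long blockSize, List<List<String>> blockHosts) {
        List<Split> splits = new ArrayList<>();
        for (long off = 0; off < fileLen; off += blockSize) {
            long len = Math.min(blockSize, fileLen - off); // last split may be short
            int block = (int) (off / blockSize);
            splits.add(new Split(off, len, blockHosts.get(block)));
        }
        return splits;
    }

    public static void main(String[] args) {
        // A 300-byte file with 128-byte blocks yields splits of 128, 128, 44 bytes.
        List<List<String>> hosts = List.of(
                List.of("node1", "node2"),
                List.of("node2", "node3"),
                List.of("node1", "node3"));
        for (Split s : planSplits(300, 128, hosts)) {
            System.out.println(s.offset() + "+" + s.length() + " on " + s.hosts());
        }
    }
}
```

A scheduler would then prefer to launch the task for each split on one of that split's hosts, which is the locality the quoted architecture guide is referring to.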
--
Jay Vyas
http://jayunit100.blogspot.com