HBase >> mail # dev >> read short circuit
Re: read short circuit
On Thu, Sep 13, 2012 at 10:28 AM, Stack <[EMAIL PROTECTED]> wrote:
> Write a short paragraph and I'll make an HDFS configuration sections
> like this HBase configurations section on manual and stick it in
> there: http://hbase.apache.org/book.html#perf.configurations

Here's a first stab:

Leveraging local data

Since Hadoop 1.0.0 (also 0.22.1, 0.23.1, CDH3u3 and HDP 1.0), thanks to
HDFS-2246[1], the DFSClient can take a shortcut and read directly from
disk instead of going through the DataNode when the data is local. For
HBase, this means the RegionServers can read directly off their
machine's disks instead of opening a socket to talk to the DataNode,
and the direct read is generally much faster[2]. To enable it, first
amend hdfs-site.xml with:

dfs.block.local-path-access.user = the _only_ user that can use the
shortcut. This has to be the user that started HBase.
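
As a sketch, the hdfs-site.xml entry could look like the following. The
value "hbase" is only an assumption for illustration; use whatever
username actually starts the HBase processes on your cluster.

```xml
<!-- hdfs-site.xml on each DataNode -->
<property>
  <name>dfs.block.local-path-access.user</name>
  <!-- assumed username; must be the user that started HBase -->
  <value>hbase</value>
</property>
```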

And in hbase-site.xml:

dfs.client.read.shortcircuit = true
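
In the usual property syntax, that hbase-site.xml setting would be
written roughly like this (a minimal fragment, to be placed inside the
existing configuration element):

```xml
<!-- hbase-site.xml on each RegionServer -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
```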

The DataNodes need to be restarted in order to pick up the new
configuration. Be aware that if a process started under a username
other than the one configured here also has short-circuit reads
enabled, it will get an Exception about unauthorized access, but
the data will still be read.

1. https://issues.apache.org/jira/browse/HDFS-2246
2. http://files.meetup.com/1350427/hug_ebay_jdcryans.pdf