Harsh J 2012-02-05, 16:56
You should get yourself a copy of "Hadoop: The Definitive Guide" by
Tom White (O'Reilly); a lot of what you ask is very well covered in it.
On Sun, Feb 5, 2012 at 4:51 PM, Alieh Saeedi <[EMAIL PROTECTED]> wrote:
> 1- Is there a way to check disk status (free disk space, used disk space,
> total disk space) of a node?
The easiest way is to visit
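
For a quick look from the command line, `hadoop dfsadmin -report` prints the configured capacity, DFS used and DFS remaining for each DataNode. If you want the raw numbers for one local directory on a node, the following is a minimal sketch using only the JDK. The class name `NodeDiskStatus` and the idea of passing the data directory as the first argument are assumptions made for illustration; nothing here is a Hadoop API.

```java
import java.io.File;

public class NodeDiskStatus {
    /** Fraction of the volume that is still usable, in [0, 1]; 0 if the total is unknown. */
    static double freeFraction(long totalBytes, long usableBytes) {
        if (totalBytes <= 0) {
            return 0.0;
        }
        return (double) usableBytes / totalBytes;
    }

    public static void main(String[] args) {
        // Assumption: the directory to inspect (for example a DataNode data
        // directory) is given as the first argument; defaults to the current one.
        File dir = new File(args.length > 0 ? args[0] : ".");

        long total = dir.getTotalSpace();       // size of the partition holding dir
        long used = total - dir.getFreeSpace(); // bytes already allocated
        long usable = dir.getUsableSpace();     // bytes this JVM could actually write

        System.out.printf("total=%d used=%d usable=%d (%.1f%% free)%n",
                total, used, usable, 100 * freeFraction(total, usable));
    }
}
```

This only reports the local volume that holds the given directory, so it would have to be run on the node in question; it says nothing about the cluster as a whole.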
> 2- Is there a way to generally tell Hadoop to write reducer output on a node
> which has enough (more than 25% of the whole node's disk space) free disk
> space (without specifying a node)?
This is a non-worry; HDFS handles writes intelligently. Your reducer
may end up writing properly even if the local node does not have adequate
space.
> If I don't specify a directory for reducer
> output where will Hadoop check the destination node disk space before
> writing on it?
This question doesn't make sense. Remember that you are writing to a
distributed filesystem, not local.
> If there is not enough disk space on the reducer, does it save
> the reducer's output on other nodes?
If the local DN does not have adequate space to store the block,
another DN is chosen and the write goes over the network. This is why it
is essential to have a balanced HDFS cluster by running the balancer
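
The fallback described above can be pictured with a toy model. This is only a sketch and not HDFS's actual block placement code: the `PlacementSketch` and `Node` names are hypothetical, and picking the remote node with the most remaining space is a simplification chosen for the example (the real policy also considers things like rack topology and load).

```java
import java.util.Arrays;
import java.util.List;

public class PlacementSketch {
    /** Hypothetical stand-in for a DataNode: a name and its remaining bytes. */
    static final class Node {
        final String name;
        final long remainingBytes;

        Node(String name, long remainingBytes) {
            this.name = name;
            this.remainingBytes = remainingBytes;
        }
    }

    /**
     * Prefer the local node when the block fits there; otherwise fall back
     * to the remote node with the most remaining space that can hold it.
     * Returns null when no node can store the block.
     */
    static Node chooseTarget(Node local, List<Node> others, long blockSize) {
        if (local != null && local.remainingBytes >= blockSize) {
            return local; // local write, no network hop for this replica
        }
        Node best = null;
        for (Node n : others) {
            boolean fits = n.remainingBytes >= blockSize;
            if (fits && (best == null || n.remainingBytes > best.remainingBytes)) {
                best = n; // remote write, the data goes over the network
            }
        }
        return best;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // illustrative 64 MB block
        Node local = new Node("dn1", 10L * 1024 * 1024); // nearly full
        List<Node> others = Arrays.asList(
                new Node("dn2", 500L * 1024 * 1024),
                new Node("dn3", 900L * 1024 * 1024));

        Node target = chooseTarget(local, others, blockSize);
        System.out.println(target == null ? "no node has space" : "write to " + target.name);
    }
}
```

In this model a nearly full local node simply pushes every write to remote nodes, which is the extra network traffic that a balanced cluster avoids.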
Customer Ops. Engineer
Cloudera | http://tiny.cloudera.com/about