Hadoop, mail # dev - Strategy Of Replica


Re: Strategy Of Replica
Steve Loughran 2011-10-11, 11:32
On 11/10/11 04:49, gschen wrote:

> In HDFS, the only thing we can do to change the replication strategy is
> set the replication factor; we cannot change where a block is stored or
> what type of storage holds the data. Consider this case: to improve
> download speed, I could place my block replicas near my location or near
> someone's location. I mean that users should have more options for
> deciding their block replication strategy.

1. In "Apache Hadoop Goes Realtime at Facebook", Dhruba and others
discuss their use of alternate block placement policies.

2. Russ Perry did some work on rasterization of PDF files in Hadoop
where the final stage (collecting the output and streaming it to the
printer) was done on a machine next to the printer. He modified
DFSClient to provide the location data for all blocks, and had his
app pick blocks off different machines to keep the network busy, avoid
overloading any specific machine with disk IO requests, and ensure
peak bandwidth to the final destination machine.

http://www.hpl.hp.com/techreports/2009/HPL-2009-345.pdf
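
To make the idea in point 2 concrete, here is a minimal sketch of the client-side half: given the replica locations of every block, pick one host per block so that reads are spread across machines. This is not Russ Perry's code and it does not touch DFSClient; the function name, the greedy least-loaded rule, and the host names are all assumptions for illustration. In a real job the location data would come from the cluster (for example via the public FileSystem.getFileBlockLocations call) rather than a hand-written list.

```python
from collections import Counter


def pick_replicas(block_hosts):
    """Choose one source host per block, spreading reads across machines.

    block_hosts: list of lists; block_hosts[i] holds the hostnames that
    store a replica of block i (hypothetical input format).
    Returns a list with the chosen hostname for each block.
    """
    load = Counter()  # number of reads assigned to each host so far
    chosen = []
    for hosts in block_hosts:
        if not hosts:
            raise ValueError("block has no known replica locations")
        # Greedy rule (an assumption, not the paper's method): take the
        # replica holder with the fewest reads assigned so far.
        # min() keeps the first host on ties, so the plan is deterministic.
        host = min(hosts, key=lambda h: load[h])
        load[host] += 1
        chosen.append(host)
    return chosen


# Hypothetical location data: four blocks, three replicas each.
blocks = [
    ["node1", "node2", "node3"],
    ["node1", "node2", "node4"],
    ["node1", "node3", "node4"],
    ["node2", "node3", "node4"],
]
plan = pick_replicas(blocks)
```

With this input each block ends up being read from a different machine, so no single datanode serves more than one of the four reads. A real scheduler would also want to weigh network distance to the destination machine, which this sketch ignores.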