Re: Tell Hadoop to store pairs of files at the same location(s) on HDFS
M. C. Srivas 2012-12-05, 21:51
MapR does this already, and well beyond just 2 files. One can arrange
things so that a boatload of files have all their replicas also placed on
the same set of nodes, i.e., files A ... Z will have replica1 on node1,
replica2 on node2, replica3 on node3, etc. (nodes 1, 2 and 3 are picked by
the system based on utilization and node-fullness).
On Wed, Dec 5, 2012 at 11:26 AM, Sigurd Spieckermann <
[EMAIL PROTECTED]> wrote:
> Awesome! That's exactly what I'm looking for. Hadn't seen the JIRA. I hope
> this is coming soon!
> On 05.12.2012 18:58, Harsh J wrote:
>> You are probably talking of
>> https://issues.apache.org/jira/browse/HDFS-2576 and similar JIRAs.
>> This feature isn't available in HDFS yet, but may arrive soon.
>> On Wed, Dec 5, 2012 at 11:23 PM, Sigurd Spieckermann
>> <[EMAIL PROTECTED]> wrote:
>>> Hi guys,
>>> I have been wondering if there's a way (hack'ish would be okay too) to tell
>>> Hadoop that two files shall be stored together at the same location(s). It
>>> would benefit map-side join performance if it could be done somehow, because
>>> all map tasks would be able to read data from a local copy. Does anyone know
>>> a way?