Re: Tell Hadoop to store pairs of files at the same location(s) on HDFS
M. C. Srivas 2012-12-05, 21:51
MapR does this already, and well beyond just two files. One can arrange
things so that a boatload of files have all their replicas placed on the
same set of nodes, i.e., files A ... Z will have replica1 on node1,
replica2 on node2, replica3 on node3, etc. (Nodes 1, 2 and 3 are picked by
the system based on utilization and node-fullness.)

On Wed, Dec 5, 2012 at 11:26 AM, Sigurd Spieckermann <[EMAIL PROTECTED]> wrote:

> Awesome! That's exactly what I'm looking for. Hadn't seen the JIRA. I hope
> this is coming soon!
>
> On 05.12.2012 18:58, Harsh J wrote:
>
>> You are probably talking of
>> https://issues.apache.org/jira/browse/HDFS-2576 and similar JIRAs.
>> This feature isn't available in HDFS yet, but may arrive soon.
>>
>> On Wed, Dec 5, 2012 at 11:23 PM, Sigurd Spieckermann
>> <[EMAIL PROTECTED]> wrote:
>>
>>> Hi guys,
>>>
>>> I have been wondering if there's a way (hack'ish would be okay too) to
>>> tell Hadoop that two files shall be stored together at the same
>>> location(s). It would benefit map-side join performance if it could be
>>> done somehow, because all map tasks would be able to read data from a
>>> local copy. Does anyone know a way?
>>>
>>> -Sigurd
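
For readers skimming the thread, the placement described in M. C. Srivas's reply can be illustrated with a toy model. This is only a sketch under stated assumptions: it does not use any MapR or HDFS API, the function names (`pick_nodes`, `colocate`, `local_join_nodes`) are hypothetical, and "most free space" is merely a stand-in for the "utilization and node-fullness" criterion, whose real policy the thread does not describe.

```python
from typing import Dict, List


def pick_nodes(free_bytes: Dict[str, int], replication: int = 3) -> List[str]:
    """Pick `replication` nodes, preferring those with the most free space.

    Assumption: free space approximates "utilization and node-fullness".
    Ties are broken by node name so the result is deterministic.
    """
    if replication > len(free_bytes):
        raise ValueError("not enough nodes for the requested replication")
    ranked = sorted(free_bytes, key=lambda node: (-free_bytes[node], node))
    return ranked[:replication]


def colocate(files: List[str], free_bytes: Dict[str, int],
             replication: int = 3) -> Dict[str, List[str]]:
    """Give every file in the group the same ordered replica node list,
    so replica i of each file lives on the same node."""
    nodes = pick_nodes(free_bytes, replication)
    return {name: list(nodes) for name in files}


def local_join_nodes(placement: Dict[str, List[str]],
                     left: str, right: str) -> List[str]:
    """Nodes where a map task could read both join inputs locally."""
    return sorted(set(placement[left]) & set(placement[right]))


if __name__ == "__main__":
    free = {"node1": 900, "node2": 800, "node3": 700, "node4": 100}
    placement = colocate(["A", "B"], free)
    print(placement)
    print(local_join_nodes(placement, "A", "B"))
```

Because both files share one replica set, every node holding a replica of `A` also holds one of `B`, which is the property the original question wanted for map-side joins.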