Re: Extension points available for data locality
Hi Mathew,

You should check out this project

It uses Hadoop and an RDBMS for analytics.

Dino Kečo
skype: dino.keco
phone: +387 61 507 851
On Tue, Aug 21, 2012 at 11:06 AM, Tharindu Mathew <[EMAIL PROTECTED]> wrote:

> Hi,
> I'm doing some research that involves pulling data stored in a MySQL
> cluster directly into a MapReduce job, without storing the data in HDFS.
> I'd like to run Hadoop TaskTracker nodes directly on the MySQL cluster
> nodes. The purpose is to start mappers on the node closest to the data
> where possible (data locality).
> I notice that with HDFS, since the NameNode knows exactly where each data
> block is, it uses this to achieve data locality.
> Is there a way to achieve this, possibly by extending the NameNode or
> otherwise?
> Thanks in advance.
> --
> Regards,
> Tharindu
> blog: http://mackiemathew.com/
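
For what it's worth, the locality hint in Hadoop does not have to come from the NameNode: each InputSplit reports its preferred hosts through getLocations(), and a custom InputFormat decides what those hosts are. Below is a minimal sketch of that idea, not a working Hadoop job. It deliberately does not import Hadoop so it can run on its own; ShardSplit, getSplits(hosts, totalKeys) and pickSplitFor are hypothetical names, and the assumption that each MySQL node owns one contiguous key range is a simplification made only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-alone sketch: ShardSplit only mirrors the two methods that Hadoop's
// InputSplit declares (getLength and getLocations). Nothing here is Hadoop code.
final class LocalitySketch {

    /** Hypothetical split: one key range assumed to live on one MySQL node. */
    static final class ShardSplit {
        final String host;
        final long firstKey;
        final long lastKey;

        ShardSplit(String host, long firstKey, long lastKey) {
            this.host = host;
            this.firstKey = firstKey;
            this.lastKey = lastKey;
        }

        /** Number of keys covered by this split. */
        long getLength() {
            return lastKey - firstKey + 1;
        }

        /** Hosts where this split's data is local; this is the locality hint. */
        String[] getLocations() {
            return new String[] { host };
        }
    }

    /** One split per database node (assumed: equal, contiguous key ranges). */
    static List<ShardSplit> getSplits(List<String> hosts, long totalKeys) {
        List<ShardSplit> splits = new ArrayList<ShardSplit>();
        long chunk = totalKeys / hosts.size();
        for (int i = 0; i < hosts.size(); i++) {
            long first = i * chunk;
            // The last node also takes any remainder of the key space.
            long last = (i == hosts.size() - 1) ? totalKeys - 1 : first + chunk - 1;
            splits.add(new ShardSplit(hosts.get(i), first, last));
        }
        return splits;
    }

    /** Toy scheduler: prefer a split local to the worker, else take any split. */
    static ShardSplit pickSplitFor(String workerHost, List<ShardSplit> pending) {
        for (ShardSplit split : pending) {
            if (Arrays.asList(split.getLocations()).contains(workerHost)) {
                pending.remove(split);
                return split;
            }
        }
        return pending.isEmpty() ? null : pending.remove(0);
    }

    public static void main(String[] args) {
        List<ShardSplit> pending = getSplits(Arrays.asList("db1", "db2", "db3"), 300);
        ShardSplit split = pickSplitFor("db2", pending);
        System.out.println(split.getLocations()[0] + " reads keys "
                + split.firstKey + ".." + split.lastKey);
    }
}
```

In a real job the same hint would be returned from a class extending org.apache.hadoop.mapreduce.InputSplit, with the host names matching those the TaskTrackers report. The scheduler treats the hint as a preference only, so a mapper can still be started on a non-local node when no local slot is free.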