MapReduce, mail # user - Extension points available for data locality


Re: Extension points available for data locality
Dino Kečo 2012-08-21, 09:22
Hi Mathew,

You should check out this project
http://db.cs.yale.edu/hadoopdb/hadoopdb.html

It combines Hadoop with an RDBMS for analytics.
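
On the locality question itself: as far as I know, the scheduler does not ask the name node directly. It reads host hints from each input split (`InputSplit.getLocations()`), and `FileInputFormat` fills those hints from block locations. So a custom input format can supply its own hints. Below is a minimal plain-Java sketch of that shape, with no Hadoop dependency. `LocalitySketch`, `ShardSplit`, the host names and the table name are all hypothetical, and it assumes each MySQL node can be queried for the rows it holds locally. In a real job the split would extend `org.apache.hadoop.mapreduce.InputSplit` and implement `Writable`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LocalitySketch {

    /** Hypothetical split: the query for one MySQL node plus a host hint. */
    static final class ShardSplit {
        private final String host;
        private final String query;

        ShardSplit(String host, String query) {
            this.host = host;
            this.query = query;
        }

        // Same role as InputSplit.getLocations(): the hosts where this
        // split's data lives, used only as a scheduling hint.
        String[] getLocations() {
            return new String[] { host };
        }

        String getQuery() {
            return query;
        }
    }

    // Same role as InputFormat.getSplits(): one split per database node
    // (assumption: each node serves the rows it stores locally).
    static List<ShardSplit> getSplits(List<String> mysqlHosts, String table) {
        List<ShardSplit> splits = new ArrayList<ShardSplit>();
        for (String host : mysqlHosts) {
            splits.add(new ShardSplit(host, "SELECT * FROM " + table));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("db-node-1", "db-node-2");
        for (ShardSplit split : getSplits(hosts, "events")) {
            System.out.println(split.getLocations()[0] + " -> " + split.getQuery());
        }
    }
}
```

The hint is best effort: if no task slot is free on the named host, the map task runs elsewhere and reads the data over the network.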

Regards,
Dino Kečo
msn: [EMAIL PROTECTED]
mail: [EMAIL PROTECTED]
skype: dino.keco
phone: +387 61 507 851
On Tue, Aug 21, 2012 at 11:06 AM, Tharindu Mathew <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I'm doing some research that involves pulling data stored in a MySQL
> cluster directly into a MapReduce job, without storing the data in HDFS.
>
> I'd like to run Hadoop TaskTracker nodes directly on the MySQL cluster
> nodes. The purpose is to start each mapper on the node closest to its
> data where possible (data locality).
>
> I notice that with HDFS, the name node knows exactly where each data
> block is, and Hadoop uses this to achieve data locality.
>
> Is there a way to meet my requirement, possibly by extending the name
> node or otherwise?
>
> Thanks in advance.
>
> --
> Regards,
>
> Tharindu
>
> blog: http://mackiemathew.com/
>
>