Re: Similar frameworks like hadoop and taxonomy of distributed computing


See my comments inline.

On 1/11/2012 4:42 PM, Merto Mertek wrote:
> Hi,
> I was wondering if anyone knows any paper discussing and comparing the
> mentioned topic. I am a little bit confused about the classification of
> Hadoop. Is it a cluster, a computing grid, or a mix of them?
Strictly speaking, I would define it as an implementation of the
MapReduce computing paradigm for use on clusters.

> What is hadoop in
> relation with a cloud - probably just a technology that enables cloud
> services..
It can be used to enable cloud services through a service-oriented
framework, like we are doing in

in which we are trying to create a cloud service that offers MapReduce
clusters and distributed storage (through HDFS) as a service.
But this is not the primary usage. The primary usage is heavy back-end
processing on a cluster, specifically for parallel jobs that follow the
MapReduce logic.

>   Can it be compared to cluster middleware like beowulf, oscar, condor,
> sector/sphere, hpcc, dryad, etc? Why not?
I can see some similarities with Condor, mainly in the job submission
process; however, I am not really sure how Condor deals with parallel jobs.

> From what I have read, Hadoop's main
> field is text processing for problems that are embarrassingly parallel, but
> I cannot define what would be the case for deciding to use other cluster
> technologies. Probably there are a lot of similarities between them,
> however any comparison would be helpful.

Theoretically, you could write the program as an MPI implementation,
which is more flexible and is not limited by the MR paradigm. However, if
you can find a way to express your problem as an MR job, then the
implementation would be much easier (I guess) as a Hadoop job, since you
only have to write the Mapper and the Reducer. In MPI you would
probably need to write all the communication code too. Furthermore, Hadoop
also has HDFS, which provides shared storage between the various Hadoop
components, threads, etc. In other clusters you need to set this up
separately, through NFS or something similar (again, I guess).
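
To make the "only the Mapper and the Reducer" point a bit more concrete, here is a rough word-count sketch in plain Python. It is not the Hadoop API: the names mapper, reducer and run_job are made up for illustration, and run_job is only a toy, single-process stand-in for the grouping (shuffle) step that a framework like Hadoop would do for you across the cluster.

```python
from collections import defaultdict


def mapper(line):
    """User-written part 1: emit a (word, 1) pair for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """User-written part 2: sum all the counts collected for one word."""
    return word, sum(counts)


def run_job(lines, mapper, reducer):
    """Hypothetical stand-in for the framework: map, group by key, reduce.

    In a real cluster this grouping step involves moving data between
    machines; here it is just an in-memory dictionary.
    """
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


result = run_job(
    ["hadoop runs map reduce", "map reduce on hadoop"],
    mapper,
    reducer,
)
print(result)
```

With MPI you would be writing the equivalent of run_job yourself, including the message passing between processes, which is the extra effort I mean above.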

My two cents,

> It would be a big help to clarify in which field to classify all those
> technologies and what are they most suitable for...
> Thank you


George Kousiouris
Electrical and Computer Engineer
Division of Communications,
Electronics and Information Engineering
School of Electrical and Computer Engineering
Tel: +30 210 772 2546
Mobile: +30 6939354121
Fax: +30 210 772 2569
Site: http://users.ntua.gr/gkousiou/

National Technical University of Athens
9 Heroon Polytechniou str., 157 73 Zografou, Athens, Greece