MapReduce >> mail # user >> running a job on single-node setup takes less time than running on a cluster


Mahsa Mofidpoor 2012-08-20, 13:03
Saurabh bhutyani 2012-08-20, 16:15
Mahsa Mofidpoor 2012-08-20, 18:31
nagarjuna kanamarlapudi 2012-08-22, 03:46
Mahsa Mofidpoor 2012-08-23, 16:19

Re: running a job on single-node setup takes less time than running on a cluster
I have no answer to your questions, but I do have some questions of my own.

What tables are you talking about?
Assuming you mean datasets/files when you say tables, why use Hadoop for
such small tables?

On Mon, Aug 20, 2012 at 6:33 PM, Mahsa Mofidpoor <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I ran a simple join (select col_list from table1 join table2 on
> (join_condition)) on both a single-node and a multi-node setup. The table
> sizes are 1.7 MB and 4.2 MB respectively. It takes more time to execute
> the query on the cluster than on the single-node Hadoop setup.
> I checked the map logs and saw that both map tasks ran on the master
> node.
> Do I need to increase the data size in order to benefit from the multi-node
> capacity?
> How can I make sure that my data is distributed across all the nodes?
>
> Thank you in advance for your assistance.
>
> Regards,
> Mahsa
>
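
A rough way to see why inputs this small give the cluster little to parallelize is to estimate how many map tasks the input produces. The sketch below is only an illustration under stated assumptions: the tables are stored as plain files, splits are computed per file, and the split size equals a 64 MB block size (an assumed value; check your own configuration). The function name estimate_map_tasks is hypothetical, not a Hadoop API, and the calculation ignores the finer details of how Hadoop actually computes splits.

```python
import math

MB = 1024 * 1024

def estimate_map_tasks(file_sizes_bytes, split_size_bytes=64 * MB):
    """Rough estimate of map tasks for file-based input.

    Assumption: every non-empty file yields at least one split, and a
    larger file yields about one split per split_size_bytes.
    """
    return sum(
        max(1, math.ceil(size / split_size_bytes))
        for size in file_sizes_bytes
        if size > 0
    )

# Table sizes taken from the question: 1.7 MB and 4.2 MB.
table_sizes = [int(1.7 * MB), int(4.2 * MB)]
print(estimate_map_tasks(table_sizes))  # 2
```

Under these assumptions each table fits in a single split, so the estimate is two map tasks in total, which would match the two map tasks seen in the logs. With so little work to spread out, the extra scheduling and network cost of a cluster can plausibly outweigh any gain.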