MapReduce >> mail # user >> running a job on single-node setup takes less time than running on a cluster


Mahsa Mofidpoor 2012-08-20, 13:03
Saurabh bhutyani 2012-08-20, 16:15
Mahsa Mofidpoor 2012-08-20, 18:31
nagarjuna kanamarlapudi 2012-08-22, 03:46
Mahsa Mofidpoor 2012-08-23, 16:19
Re: running a job on single-node setup takes less time than running on a cluster
I have no answers to your questions, but I do have some questions of my own.

What tables are you talking about?
Assuming you mean datasets/files when you say tables, why use Hadoop
for such small tables?

On Mon, Aug 20, 2012 at 6:33 PM, Mahsa Mofidpoor <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I ran a simple join (select col_list from table1 join table2 on
> (join_condition)) on both a single-node and a multi-node setup. The table
> sizes are 1.7 MB and 4.2 MB respectively. The query takes longer to
> execute on the cluster than on the single-node Hadoop setup.
> I checked the map logs and saw that both map tasks ran on the master
> node.
> Do I need to increase the amount of data in order to benefit from the
> multi-node capacity?
> How can I make sure that my data is distributed across all the nodes?
>
> Thank you in advance for your assistance.
>
> Regards,
> Mahsa
>
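
A rough way to see why input this small cannot spread over a cluster: by default the number of map tasks follows the number of HDFS blocks in the input, not the number of nodes. The sketch below is only a back-of-the-envelope estimate, not Hadoop's actual split logic. It assumes the Hadoop 1.x default block size of 64 MB (dfs.block.size) and one split per block; the real FileInputFormat also applies minimum/maximum split sizes and a small slop factor, and some query engines combine small files into a single split. The function name and the table labels are made up for illustration.

```python
import math

MB = 1024 * 1024

# Assumption: 64 MB block size, the Hadoop 1.x default for dfs.block.size.
BLOCK_SIZE_BYTES = 64 * MB


def estimated_map_tasks(file_size_bytes, block_size_bytes=BLOCK_SIZE_BYTES):
    """Estimate map tasks for one file as one task per (partial) block.

    Simplification: ignores min/max split size settings and the split
    slop factor used by the real input format.
    """
    if file_size_bytes <= 0:
        return 0
    return math.ceil(file_size_bytes / block_size_bytes)


# Table sizes quoted in the question above (labels are illustrative).
table_sizes = {"table1": int(1.7 * MB), "table2": int(4.2 * MB)}

maps_per_table = {
    name: estimated_map_tasks(size) for name, size in table_sizes.items()
}
total_maps = sum(maps_per_table.values())

print(maps_per_table, total_maps)
```

Under these assumptions each table fits in a single block, so the join gets one map task per table. That matches the two map tasks seen in the logs, and two tasks can occupy at most two task slots no matter how many nodes the cluster has. To check where the blocks of a file actually live, `hadoop fsck <path> -files -blocks -locations` lists the DataNodes holding each block (substitute the real HDFS path of the table).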