Re: number of map and reduce task does not change in M/R program
Thanks a lot for the reply..
On Mon, Oct 21, 2013 at 10:39 AM, Dieter De Witte <[EMAIL PROTECTED]> wrote:

> Anseh,
>
> Let's assume your job is fully scalable; then it should take 100,000,000 /
> 600,000 = 1000 / 6 ≈ 167 times as long as the first job. That is the ideal
> case; in practice it will probably be something like 200 times. Also, try
> using units and scientific notation in your questions: 10^8 records or
> 10^8 bytes?
>
> Regards, irW
>
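Dieter's linear extrapolation can be sketched as a small calculation. This is a minimal sketch of the ideal case only; the 10-minute baseline for the small run is a made-up figure for illustration (the thread never states the small job's runtime), and a real job will run slower than this lower bound:

```java
// A minimal sketch of the back-of-the-envelope estimate above.
// The 10-minute baseline is a hypothetical figure, not from the thread.
public class ScalingEstimate {

    // Linear extrapolation: assumes the job scales perfectly with input size.
    static long estimateMinutes(long smallRecords, long smallMinutes, long bigRecords) {
        return smallMinutes * bigRecords / smallRecords;
    }

    public static void main(String[] args) {
        long smallJob = 600_000L;    // records in the small test run
        long bigJob = 100_000_000L;  // records in the full run
        // 10 minutes is a hypothetical small-run time; the ratio is ~167x.
        long minutes = estimateMinutes(smallJob, 10L, bigJob);
        System.out.println("Ideal lower bound: " + minutes + " minutes");
    }
}
```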
>
> 2013/10/20 Anseh Danesh <[EMAIL PROTECTED]>
>
>> OK... thanks a lot for the link... it is so useful... ;)
>>
>>
>> On Sun, Oct 20, 2013 at 6:59 PM, Amr Shahin <[EMAIL PROTECTED]> wrote:
>>
>>> Try profiling the job (
>>> http://hadoop.apache.org/docs/stable/mapred_tutorial.html#Profiling).
>>> And yes, the machine specs could be the reason; that's why Hadoop was
>>> invented in the first place ;)
>>>
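The profiling Amr points to can be switched on through job configuration. A sketch, assuming the Hadoop 1.x property names described in the linked tutorial (built-in HPROF support; the attempt ranges are examples, adjust as needed):

```xml
<!-- A sketch, assuming Hadoop 1.x property names; conf/mapred-site.xml
     or per-job configuration. -->
<property>
  <name>mapred.task.profile</name>
  <value>true</value>
</property>
<property>
  <name>mapred.task.profile.maps</name>
  <value>0-2</value>  <!-- profile only the first three map attempts -->
</property>
<property>
  <name>mapred.task.profile.reduces</name>
  <value>0-2</value>
</property>
```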
>>>
>>> On Sun, Oct 20, 2013 at 8:39 AM, Anseh Danesh <[EMAIL PROTECTED]> wrote:
>>>
>>>> I tried it on a small data set of about 600,000 records, and it did not
>>>> take too long; the execution time was reasonable. But on the set of
>>>> 100,000,000 records it performs very badly. One more thing: I have 2
>>>> processors in my machine, and I think this amount of data is simply too
>>>> large for them, which is why it takes so long to process... what do you
>>>> think about this?
>>>>
>>>>
>>>> On Sun, Oct 20, 2013 at 1:49 AM, Amr Shahin <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Try running the job locally on a small set of the data and see if it
>>>>> takes too long. If so, your map code might have some performance issues.
>>>>>
>>>>>
>>>>> On Sat, Oct 19, 2013 at 9:08 AM, Anseh Danesh <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Hi all.. I have a question. I have a MapReduce program that gets its
>>>>>> input from Cassandra. My input is fairly large, about 100,000,000
>>>>>> records. My problem is that the program takes too long to run, even
>>>>>> though MapReduce is supposed to be fast on large volumes of data, so I
>>>>>> suspect a problem with the number of map and reduce tasks. I set the
>>>>>> number of map and reduce tasks with JobConf, with Job, and also in
>>>>>> conf/mapred-site.xml, but I don't see any changes. In my logs it starts
>>>>>> at map 0% reduce 0%, and after about 2 hours of work it shows map 1%
>>>>>> reduce 0%..!! What should I do? Please help me, I am really confused...
>>>>>>
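One likely reason the settings appear to have no effect: in classic MapReduce the number of map tasks is determined by the number of input splits produced by the InputFormat (for Cassandra input, by the split size of its Hadoop InputFormat), so the map-task setting is only a hint; only the reduce count is honored directly. A sketch of the relevant mapred-site.xml fragment, assuming Hadoop 1.x property names (the value 4 is an arbitrary example):

```xml
<!-- A sketch, assuming Hadoop 1.x property names (conf/mapred-site.xml). -->
<!-- mapred.map.tasks is only a hint: the actual number of map tasks
     equals the number of input splits produced by the InputFormat. -->
<property>
  <name>mapred.reduce.tasks</name>
  <value>4</value>  <!-- honored directly; 4 is an arbitrary example -->
</property>
```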
>>>>>
>>>>>
>>>>
>>>
>>
>