Re: TeraSort question.
Niels Basjes 2011-01-11, 19:07
Have a look at the graph shown here:
It should make clear that the number of tasks varies greatly over the
lifetime of a job.
Depending on the number of nodes available, this may leave some nodes idle.
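As a rough illustration of that point, here is a minimal Python sketch that bounds how many nodes have no task at all at a given moment. It is not how the Hadoop scheduler actually places tasks. The function name `idle_node_range`, the 2 slots per node, and the 200 running tasks are assumptions made up for the example; only the 480-node cluster size comes from this thread.

```python
import math


def idle_node_range(running_tasks, nodes, slots_per_node):
    """Return (fewest, most) nodes that can be idle at one instant.

    Hypothetical helper for illustration only; it ignores data locality
    and every other detail of real task placement.
    """
    # The cluster cannot run more tasks than it has slots.
    running = min(running_tasks, nodes * slots_per_node)
    # Tasks spread out, one per node where possible: most nodes busy.
    busy_at_most = min(nodes, running)
    # Tasks packed onto as few nodes as possible: fewest nodes busy.
    busy_at_least = math.ceil(running / slots_per_node)
    return nodes - busy_at_most, nodes - busy_at_least


# Assumed numbers: 200 tasks running at this moment, 2 slots per node.
print(idle_node_range(200, 480, 2))  # (280, 380)
```

With these assumed numbers, at least 280 of the 480 nodes have nothing to do at that moment, however the tasks are placed. So a random sample of 5 nodes could easily include several idle ones.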
2011/1/11 Raj V <[EMAIL PROTECTED]>:
> Thanks. I have all the graphs I need, including the MapReduce timeline and system activity for all the nodes while the sort was running. I will publish them once I have them in a presentable format.
> For legal reasons, I really don't want to send the complete job history files.
> My question is still this: when running terasort, would the CPU, disk, and network utilization of all the nodes be more or less similar, or completely different?
> Sometime during the day, I will post the system data from 5 nodes, which will probably explain my question better.
> From: Ted Dunning <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; Raj V <[EMAIL PROTECTED]>
> Sent: Tuesday, January 11, 2011 8:22:17 AM
> Subject: Re: TeraSort question.
> Do you have the job history files? That would be very useful. I would be
> happy to create some swimlane and related graphs for you if you can send me
> the history files.
> On Mon, Jan 10, 2011 at 9:06 PM, Raj V <[EMAIL PROTECTED]> wrote:
>> I have been running terasort on a 480-node Hadoop cluster. I have also
>> collected CPU, memory, disk, and network statistics during this run. The system
>> stats are quite interesting. I can post them once I have put them together in
>> a presentable format (if there is interest). However, while looking at
>> the data, I noticed something interesting.
>> I thought, intuitively, that all the systems in the cluster would have
>> more or less similar behaviour (possibly shifted in time), and that the
>> overall graph would look the same.
>> Just to confirm it, I took 5 random nodes and looked at the CPU, disk,
>> network, etc. activity while the sort was running. Strangely enough, it was
>> not so. Two of the 5 systems were seriously busy, with heavy I/O and lots of disk
>> and network activity. On the other three systems, the CPU was more or less 100%
>> idle, with only slight network and I/O activity.
>> Is that normal and/or expected? Shouldn't all the nodes be utilized in more
>> or less the same manner over the length of the run?
>> I generated the data for the sort using teragen (128 MB block size,
>> replication = 3).
>> I would also be interested in other people's sort timings. Is there some
>> place where people can post sort numbers (not just the record)?
>> I will post the actual graphs of the 5 nodes tomorrow, if there is interest.
>> (There are some logistical issues with posting them tonight.)
>> I am using CDH3B3, even though I think this is not specific to CDH3B3.
>> Sorry for the cross post.
With kind regards,