Hadoop, mail # user - io.sort.mb configuration?


Mark Vigeant 2009-12-22, 16:10
Patrick Angeles 2009-12-22, 16:40
Long Van Nguyen Dinh 2009-12-22, 17:17
Mark Vigeant 2009-12-22, 17:30
Re: io.sort.mb configuration?
Jeff Hammerbacher 2009-12-23, 01:34
Hey Mark,

While you're grokking this aspect of MapReduce's configuration, you may want
to check out https://issues.apache.org/jira/browse/MAPREDUCE-64, which is on
its way into trunk right now. Chris Douglas from Yahoo! has posted a very
nice explanation of how buffers are managed during the shuffle and which
parameters affect the behavior.

Regards,
Jeff

On Tue, Dec 22, 2009 at 12:30 PM, Mark Vigeant <[EMAIL PROTECTED]> wrote:

> Thank you for the responses guys!
>
> First, to Patrick: I didn't set it in the code, but setting it there is a
> really good idea, so I'll play around with that.
>
> Long: I should have clarified that I am using 0.20.1, so this is a bit
> different. I set the parameter in mapred-site.xml, and for some reason it's
> just not taking effect. Thank you anyway, though!
>
> -Mark
>
> -----Original Message-----
> From: Long Van Nguyen Dinh [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, December 22, 2009 12:17 PM
> To: [EMAIL PROTECTED]
> Subject: Re: io.sort.mb configuration?
>
> Hadoop has a default file (hadoop-default.xml in version 0.19) for all
> configuration. Don't change the values in that file (the changes won't
> take effect). Instead, copy the parameter to hadoop-site.xml, the file
> where you set up the cluster, and set the value you want there.
>
> Long Van
>
> On Tue, Dec 22, 2009 at 11:40 AM, Patrick Angeles
> <[EMAIL PROTECTED]> wrote:
> > You can also set that param per-job. Maybe you called some code that did
> > that behind the scenes?
> >
> > On Tue, Dec 22, 2009 at 11:10 AM, Mark Vigeant <[EMAIL PROTECTED]> wrote:
> >
> >> Hey Everyone-
> >>
> >> I've been playing around with Hadoop and Hbase for a while and I noticed
> >> that when running a program to upload data into an HTable I saw the output:
> >>
> >> INFO mapred.MapTask: io.sort.mb = 100
> >>
> >> Which is the default value, but in the mapred configuration on all machines
> >> in my cluster I set this value to 250. Could it be that my program is not
> >> accessing the configuration properly? Is that too large a value? Or is it
> >> most likely just a foolish syntax error on my part?
> >>
> >> Thank you very much, all input is appreciated.
> >>
> >> Mark Vigeant
> >> RiskMetrics Group, Inc.
> >>
> >>
> >> This email message and any attachments are for the sole use of the intended
> >> recipients and may contain proprietary and/or confidential information which
> >> may be privileged or otherwise protected from disclosure. Any unauthorized
> >> review, use, disclosure or distribution is prohibited. If you are not an
> >> intended recipient, please contact the sender by reply email and destroy the
> >> original message and any copies of the message as well as any attachments to
> >> the original message.
> >>
> >
>
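
The thread describes three places a value such as io.sort.mb can come from: the shipped defaults file, the site file (hadoop-site.xml or mapred-site.xml), and per-job settings. The sketch below is a simplified model of that layering, written in Python purely for illustration. It is not Hadoop's own Configuration class, it ignores details such as properties marked final, and the names load_properties, effective_value, and job_overrides are hypothetical. It may still help when checking whether a site file is well-formed and actually defines the property you expect.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Parse a Hadoop-style <configuration> document into a name -> value dict.

    Raises xml.etree.ElementTree.ParseError if the XML is malformed.
    """
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        # Skip entries that are missing either a <name> or a <value>.
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def effective_value(name, *layers):
    """Return the value from the last layer that defines name, else None.

    Simplified assumption: later layers always override earlier ones.
    """
    result = None
    for layer in layers:
        if name in layer:
            result = layer[name]
    return result


# Illustrative file contents; real files contain many more properties.
DEFAULTS_XML = """<configuration>
  <property><name>io.sort.mb</name><value>100</value></property>
</configuration>"""

SITE_XML = """<configuration>
  <property><name>io.sort.mb</name><value>250</value></property>
</configuration>"""

defaults = load_properties(DEFAULTS_XML)
site = load_properties(SITE_XML)
job_overrides = {}  # hypothetical per-job settings made in code

print(effective_value("io.sort.mb", defaults, site, job_overrides))
```

Under this model, a job that sets io.sort.mb itself, or a site file whose property name is misspelled, would both leave a value other than the site file's 250 in effect, which matches the two explanations suggested in the thread.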
Mark Vigeant 2009-12-23, 17:22
Todd Lipcon 2009-12-23, 17:31