Hadoop >> mail # user >> io.sort.mb configuration?


Re: io.sort.mb configuration?
Hey Mark,

While you're grokking this aspect of MapReduce's configuration, you may want
to check out https://issues.apache.org/jira/browse/MAPREDUCE-64, which is on
its way into trunk right now. Chris Douglas from Yahoo! has posted a very
nice explanation of how buffers are managed during the shuffle and which
parameters affect the behavior.

Regards,
Jeff

On Tue, Dec 22, 2009 at 12:30 PM, Mark Vigeant <[EMAIL PROTECTED]> wrote:

> Thank you for the responses, guys!
>
> First, to Patrick: I didn't set it in the code, but that's a really good
> idea, so I'll try setting it there and play around with that.
>
> Long: I should have clarified that I am using 0.20.1, so this is a bit
> different. I set the parameter in mapred-site.xml, and for some reason it's
> just not taking effect. Thank you anyway, though!
>
> -Mark
>
> -----Original Message-----
> From: Long Van Nguyen Dinh [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, December 22, 2009 12:17 PM
> To: [EMAIL PROTECTED]
> Subject: Re: io.sort.mb configuration?
>
> Hadoop has a default file for all configuration (hadoop-default.xml in
> version 0.19). Don't change the values in that file (the changes won't
> take effect). Instead, copy the parameter to the hadoop-site.xml file where
> you set up the cluster and set the value you want there.
>
> Long Van
>
> On Tue, Dec 22, 2009 at 11:40 AM, Patrick Angeles
> <[EMAIL PROTECTED]> wrote:
> > You can also set that param per-job. Maybe you called some code that did
> > that behind the scenes?
> >
> > On Tue, Dec 22, 2009 at 11:10 AM, Mark Vigeant <[EMAIL PROTECTED]> wrote:
> >
> >> Hey Everyone-
> >>
> >> I've been playing around with Hadoop and HBase for a while, and I noticed
> >> that a program I run to upload data into an HTable prints the output:
> >>
> >> INFO mapred.MapTask: io.sort.mb = 100
> >>
> >> That is the default value, but in the mapred configuration on all machines
> >> in my cluster I set this value to 250. Could it be that my program is not
> >> accessing the configuration properly? Is that too large a value? Or is it
> >> most likely just a foolish syntax error on my part?
> >>
> >> Thank you very much; all input is appreciated.
> >>
> >> Mark Vigeant
> >> RiskMetrics Group, Inc.
> >>
> >>
> >> This email message and any attachments are for the sole use of the
> >> intended recipients and may contain proprietary and/or confidential
> >> information which may be privileged or otherwise protected from
> >> disclosure. Any unauthorized review, use, disclosure or distribution is
> >> prohibited. If you are not an intended recipient, please contact the
> >> sender by reply email and destroy the original message and any copies of
> >> the message as well as any attachments to the original message.
> >>
> >
>
>
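
To make the layering described in the replies above concrete (a built-in
default, a site file that overrides it, and an optional per-job setting on
top), here is a minimal sketch in plain Java. It does not use Hadoop's own
Configuration class. The names SortMbCheck, parseSite and resolve are
hypothetical, the sketch assumes the usual site-file layout of a
<configuration> root holding <property> elements with <name> and <value>
children, and it ignores details such as properties marked final. It is only
meant as a quick way to check that a site file really defines io.sort.mb and
to see which layer would win.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Hadoop: models default < site < job.
class SortMbCheck {

    // Parse site-file XML into name/value pairs.
    // Returns an empty map if the root element is not <configuration>.
    static Map<String, String> parseSite(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Map<String, String> props = new LinkedHashMap<>();
        if (!"configuration".equals(doc.getDocumentElement().getTagName())) {
            return props;
        }
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element p = (Element) nodes.item(i);
            NodeList names = p.getElementsByTagName("name");
            NodeList values = p.getElementsByTagName("value");
            if (names.getLength() == 0 || values.getLength() == 0) {
                continue; // skip a property missing its name or value
            }
            props.put(names.item(0).getTextContent().trim(),
                      values.item(0).getTextContent().trim());
        }
        return props;
    }

    // Pick the effective value: per-job setting first, then the site file,
    // then the built-in default.
    static String resolve(String name, String builtInDefault,
                          Map<String, String> site, Map<String, String> job) {
        if (job.containsKey(name)) {
            return job.get(name);
        }
        if (site.containsKey(name)) {
            return site.get(name);
        }
        return builtInDefault;
    }

    public static void main(String[] args) throws Exception {
        String siteXml =
              "<configuration>"
            + "<property><name>io.sort.mb</name><value>250</value></property>"
            + "</configuration>";
        Map<String, String> site = parseSite(siteXml);
        Map<String, String> job = new LinkedHashMap<>();

        // Site file overrides the default of 100: prints 250
        System.out.println(resolve("io.sort.mb", "100", site, job));

        // A per-job setting overrides the site file: prints 200
        job.put("io.sort.mb", "200");
        System.out.println(resolve("io.sort.mb", "100", site, job));
    }
}
```

If parseSite returns no entry for io.sort.mb when given the contents of the
real site file, a syntax problem in that file is a likely suspect. If it does
return 250, the next thing to check is whether the program that submits the
job is actually reading that file.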