Hadoop, mail # user - Maps split size


Mark Olimpiati 2012-10-26, 20:47
Bertrand Dechoux 2012-10-26, 21:23
Mark Olimpiati 2012-10-29, 04:25
Re: Maps split size
Bertrand Dechoux 2012-10-29, 06:15
Okay, then it would be because you didn't really change the block size.
Of course, you can change the value of the property, but the block size is
actually part of the file's definition. The file was stored as
blocks of 64MB (the default), so you can only read it as blocks of 64MB.
If you want to change the block size of a file, you will have to recreate
it, e.g. by copying it.
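
To make the relation between the three properties concrete, here is a
minimal sketch of how the split size is usually derived. It assumes the
formulas used by the two FileInputFormat classes (old "mapred" API and new
"mapreduce" API); which one applies depends on the API and Hadoop version
your job uses, which the thread does not say. The function names, the 1 GB
file size and the split hint of 2 are hypothetical and only there for
illustration.

```python
MB = 1024 * 1024


def split_size_new_api(block_size, min_size, max_size):
    """Assumed formula of the new-API FileInputFormat (mapreduce package)."""
    return max(min_size, min(max_size, block_size))


def split_size_old_api(block_size, min_size, total_size, num_splits_hint):
    """Assumed formula of the old-API FileInputFormat (mapred package).

    The old API has no max split size; it uses a goal size derived from
    the total input size and the requested number of map tasks.
    """
    goal_size = total_size // max(num_splits_hint, 1)
    return max(min_size, min(goal_size, block_size))


# Block size the file was written with; changing dfs.block.size afterwards
# does not alter this value for an existing file.
stored_block = 64 * MB
wanted = 14 * MB      # the properties are given in bytes: 14680064
total = 1024 * MB     # hypothetical 1 GB input file

old = split_size_old_api(stored_block, wanted, total, num_splits_hint=2)
new = split_size_new_api(stored_block, wanted, wanted)
print(old // MB, new // MB)
```

Under these assumptions the old-API formula yields 64MB splits, because the
stored block size caps the goal size and the 14MB minimum is below it, while
the new-API formula would honour the 14MB maximum.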

Regards

Bertrand

On Mon, Oct 29, 2012 at 5:25 AM, Mark Olimpiati <[EMAIL PROTECTED]> wrote:

> Well, when I said I found a solution, this link was one of them :). Even
> though I set:
>
> dfs.block.size = mapred.min.split.size = mapred.max.split.size = 14MB, the
> job is still running maps with 64MB!
>
> I don't see what else I can change :(
>
> Thanks,
> Mark
>
> On Fri, Oct 26, 2012 at 2:23 PM, Bertrand Dechoux <[EMAIL PROTECTED]> wrote:
>
> > Hi Mark,
> >
> > I think http://wiki.apache.org/hadoop/HowManyMapsAndReduces might interest
> > you.
> > If you require more information, feel free to ask after reading it.
> >
> > Regards
> >
> > Bertrand
> >
> > On Fri, Oct 26, 2012 at 10:47 PM, Mark Olimpiati <[EMAIL PROTECTED]> wrote:
> >
> > > Hi,
> > >
> > >   I've found that the way to control the split size per mapper is to
> > > modify the following configurations:
> > >
> > > mapred.min.split.size and mapred.max.split.size, but when I set them both
> > > to 14MB with dfs.block.size = 64MB, the splits are still 64MB.
> > >
> > > So, is there a relation between them that I should consider?
> > >
> > > Thank you,
> > > Mark
> > >
> >
> >
> >
> > --
> > Bertrand Dechoux
> >
>

--
Bertrand Dechoux