Hadoop >> mail # user >> Can't achieve load distribution


Re: Can't achieve load distribution
And that is exactly what I found.

I have a "hack" for now - give all files on the command line - and I will
wait for the next release in some distribution.

Thank you,
Mark
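(For reference: once on a release that ships the new-API NLineInputFormat — 1.0.1+, per Harsh below — the driver wiring Mark needs would look roughly like this. A sketch only; the class name and paths are placeholders, and the Mapper/Reducer setup is elided.)

```java
// Sketch of a job driver using the new-API NLineInputFormat so that each
// map task receives exactly one line of the instruction file.
// Requires Hadoop 1.0.1+ (or 0.21+); class and path names are illustrative.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class OneLinePerMapperDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "one-line-per-mapper");
    job.setJarByClass(OneLinePerMapperDriver.class);

    // One input split -- and therefore one map task -- per line.
    job.setInputFormatClass(NLineInputFormat.class);
    NLineInputFormat.setNumLinesPerSplit(job, 1);

    NLineInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```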

On Thu, Feb 2, 2012 at 9:55 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> New API NLineInputFormat is only available from 1.0.1, and not in any
> of the earlier 1 (1.0.0) or 0.20 (0.20.x, 0.20.xxx) vanilla Apache
> releases.
>
> On Fri, Feb 3, 2012 at 7:08 AM, Praveen Sripati <[EMAIL PROTECTED]> wrote:
> > Mark,
> >
> > NLineInputFormat was not introduced in 0.21; I just sent the reference to
> > the 0.21 url FYI. It's in the 0.20.205, 1.0.0 and 0.23 releases as well.
> >
> > Praveen
> >
> > On Fri, Feb 3, 2012 at 1:25 AM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
> >
> >> Praveen,
> >>
> >> this seems just like the right thing, but it's API 0.21 (I googled about
> >> the problems with it), so I have to use either the next Cloudera release,
> >> or Hortonworks, or something, am I right?
> >>
> >> Mark
> >>
> >> On Thu, Feb 2, 2012 at 7:39 AM, Praveen Sripati <[EMAIL PROTECTED]> wrote:
> >>
> >> > > I have a simple MR job, and I want each Mapper to get one line from my
> >> > > input file (which contains further instructions for lengthy processing).
> >> >
> >> > Use the NLineInputFormat class.
> >> >
> >> >
> >> > http://hadoop.apache.org/mapreduce/docs/r0.21.0/api/org/apache/hadoop/mapreduce/lib/input/NLineInputFormat.html
> >> >
> >> > Praveen
> >> >
> >> > On Thu, Feb 2, 2012 at 9:43 AM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
> >> >
> >> > > Thanks!
> >> > > Mark
> >> > >
> >> > > On Wed, Feb 1, 2012 at 7:44 PM, Anil Gupta <[EMAIL PROTECTED]> wrote:
> >> > >
> >> > > > Yes, if ur block size is 64mb. Btw, block size is configurable in Hadoop.
> >> > > >
> >> > > > Best Regards,
> >> > > > Anil
> >> > > >
> >> > > > On Feb 1, 2012, at 5:06 PM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
> >> > > >
> >> > > > > Anil,
> >> > > > >
> >> > > > > do you mean one block of HDFS, like 64MB?
> >> > > > >
> >> > > > > Mark
> >> > > > >
> >> > > > > On Wed, Feb 1, 2012 at 7:03 PM, Anil Gupta <[EMAIL PROTECTED]> wrote:
> >> > > > >
> >> > > > >> Do u have enough data to start more than one mapper?
> >> > > > >> If entire data is less than a block size then only 1 mapper will run.
> >> > > > >>
> >> > > > >> Best Regards,
> >> > > > >> Anil
> >> > > > >>
> >> > > > >> On Feb 1, 2012, at 4:21 PM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
> >> > > > >>
> >> > > > >>> Hi,
> >> > > > >>>
> >> > > > >>> I have a simple MR job, and I want each Mapper to get one line
> >> > > > >>> from my input file (which contains further instructions for
> >> > > > >>> lengthy processing). Each line is 100 characters long, and I
> >> > > > >>> tell Hadoop to read only 100 bytes,
> >> > > > >>>
> >> > > > >>> job.getConfiguration().setInt("mapreduce.input.linerecordreader.line.maxlength", 100);
> >> > > > >>>
> >> > > > >>> I see that this part works - it reads only one line at a time,
> >> > > > >>> and if I change this parameter, it listens.
> >> > > > >>>
> >> > > > >>> However, on a cluster only one node receives all the map tasks.
> >> > > > >>> Only one map task is started. The others never get anything,
> >> > > > >>> they just wait. I've added a 100-second wait to the mapper - no
> >> > > > >>> change!
> >> > > > >>>
> >> > > > >>> Any advice?
> >> > > > >>>
> >> > > > >>> Thank you. Sincerely,
> >> > > > >>> Mark
> >> > > > >>
> >> > > >
> >> > >
> >> >
> >>
>
>
>
> --
> Harsh J
> Customer Ops. Engineer
> Cloudera | http://tiny.cloudera.com/about
>
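(The arithmetic the thread converges on can be sketched as a toy calculation. The 100-character lines and 64 MB default block size come from the thread itself; the class and helper names below are mine, and this is an approximation of split counting, not Hadoop's actual code.)

```java
public class SplitToy {
    // With a block-based input format, the number of splits is roughly
    // ceil(fileSize / blockSize), with a minimum of one split.
    static int blockSplits(long fileSize, long blockSize) {
        return (int) Math.max(1, (fileSize + blockSize - 1) / blockSize);
    }

    // With NLineInputFormat semantics, splits = ceil(lineCount / linesPerSplit),
    // so each map task gets its own N lines regardless of file size.
    static int nlineSplits(int lineCount, int linesPerSplit) {
        return (lineCount + linesPerSplit - 1) / linesPerSplit;
    }

    public static void main(String[] args) {
        // 50 instruction lines of 100 chars each = 5000 bytes,
        // far below one 64 MB block: a single map task.
        System.out.println(blockSplits(50 * 100, 64L << 20)); // 1
        // NLineInputFormat with one line per split: 50 map tasks.
        System.out.println(nlineSplits(50, 1));               // 50
    }
}
```

This is why Mark saw exactly one mapper: his whole instruction file fits in one block, so TextInputFormat produced one split, and the maxlength setting only caps how many bytes of each line the record reader returns — it does not change the split count.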