MapReduce, mail # user - Re: [Cosmos-dev] Out of memory in identity mapper?


Re: [Cosmos-dev] Out of memory in identity mapper?
Hemanth Yamijala 2012-09-07, 03:49
Harsh,

Could IsolationRunner be used here? I'd put up a patch for HADOOP-8765,
after applying which IsolationRunner works for me. Maybe we could use it to
re-run the failing map task and debug it.
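
The flow would roughly be the standard IsolationRunner one, sketched below
(classic MR1 property; the command and paths depend on mapred.local.dir on
the failing node, so treat them as illustrative):

  import org.apache.hadoop.mapred.JobConf;

  public class KeepFailedTaskFiles {
      public static void main(String[] args) {
          JobConf conf = new JobConf();
          // Keep the failed attempt's files around so the task can later be
          // re-run in isolation (equivalent to keep.failed.task.files=true).
          conf.setKeepFailedTaskFiles(true);
          // ... configure and submit the job as usual. Once a map attempt
          // fails, log into that tasktracker, cd into the attempt's work
          // directory under mapred.local.dir, and run:
          //   hadoop org.apache.hadoop.mapred.IsolationRunner ../job.xml
      }
  }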

Thanks
hemanth

On Thu, Sep 6, 2012 at 9:42 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Protobuf involvement makes me more suspicious that this is possibly a
> corruption or a serialization issue. Perhaps if you can share some stack
> traces, people can help better. If it is reliably reproducible, then I'd
> also check how many records are read before this occurs, and see if the
> stack traces are always the same.
>
> Serialization formats such as protobuf allocate objects based on sizes read
> from the stream (for example, a string's length is read first, and a buffer
> of that length is pre-allocated before the string's bytes are read into it),
> and in cases of corrupt data or bugs in the deserialization code it is quite
> easy for a badly read value to turn into a huge allocation request. It's one
> possibility.
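>
> For illustration only (plain Java rather than the real protobuf readers,
> and the names are made up), the usual length-prefixed read pattern looks
> roughly like this, and a corrupted prefix turns straight into a giant
> allocation:
>
>   import java.io.ByteArrayInputStream;
>   import java.io.DataInputStream;
>   import java.io.IOException;
>
>   public class LengthPrefixDemo {
>       static byte[] readRecord(DataInputStream in) throws IOException {
>           int length = in.readInt();     // a corrupted prefix is read here...
>           byte[] buf = new byte[length]; // ...and pre-allocated here
>           in.readFully(buf);
>           return buf;
>       }
>
>       public static void main(String[] args) throws IOException {
>           // This prefix decodes to ~2 billion instead of a small record
>           // size, so the allocation alone will likely throw OutOfMemoryError.
>           byte[] corrupt = {0x7F, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, 1, 2, 3};
>           readRecord(new DataInputStream(new ByteArrayInputStream(corrupt)));
>       }
>   }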
>
> Is the input compressed too, btw? Can you seek out the input file the
> specific map fails on, and try to read it in an isolated manner to
> validate it? Or do all maps seem to fail?
>
> On Thu, Sep 6, 2012 at 9:01 PM, SEBASTIAN ORTEGA TORRES <[EMAIL PROTECTED]>
> wrote:
> > Input files are small fixed-size protobuf records and yes, it is
> > reproducible (but it takes some time).
> > In this case I cannot use combiners, since I need to process all the
> > elements with the same key together.
> >
> > Thanks for the prompt response
> >
> > --
> > Sebastián Ortega Torres
> > Product Development & Innovation / Telefónica Digital
> > C/ Don Ramón de la Cruz 82-84
> > Madrid 28006
> >
> > On 06/09/2012, at 17:13, Harsh J wrote:
> >
> > I can imagine a huge record size possibly causing this. Is this
> > reliably reproducible? Do you also have combiners enabled, which may
> > run the reducer-logic on the map-side itself?
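> >
> > For illustration (with the stock IntSumReducer standing in for the actual
> > reducer), a combiner is wired in as in the sketch below; once it is set,
> > the reduce logic already runs inside the map tasks during spill/merge:
> >
> >   import org.apache.hadoop.conf.Configuration;
> >   import org.apache.hadoop.io.IntWritable;
> >   import org.apache.hadoop.io.Text;
> >   import org.apache.hadoop.mapreduce.Job;
> >   import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;
> >
> >   public class CombinerWiring {
> >       public static void main(String[] args) throws Exception {
> >           Job job = Job.getInstance(new Configuration(), "combiner example");
> >           job.setOutputKeyClass(Text.class);
> >           job.setOutputValueClass(IntWritable.class);
> >           job.setReducerClass(IntSumReducer.class);
> >           // The combiner applies the same reduce logic on the map side,
> >           // before anything reaches the reducers.
> >           job.setCombinerClass(IntSumReducer.class);
> >       }
> >   }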
> >
> > On Thu, Sep 6, 2012 at 8:20 PM, JOAQUIN GUANTER GONZALBEZ <[EMAIL PROTECTED]>
> > wrote:
> >
> > Hello hadoopers!
> >
> > In a reduce-only Hadoop job, input files are handled by the identity mapper
> > and sent to the reducers without modification. In one of my jobs I was
> > surprised to see the job failing in the map phase with "Out of memory
> > error" and "GC overhead limit exceeded".
> >
> > In my understanding, a memory leak in the identity mapper is out of the
> > question. What can be the cause of such an error?
> >
> > Thanks,
> > Ximo.
> >
> > P.S. The logs show no stack trace other than the messages I mentioned
> > before.
> >
> > --
> > Harsh J
> >
> > _______________________________________________
> > Cosmos-dev mailing list
> > [EMAIL PROTECTED]
> > https://listas.tid.es/mailman/listinfo/cosmos-dev
> >
>
>
>
> --
> Harsh J
>