SEBASTIAN ORTEGA TORRES 2012-09-06, 15:31
Harsh J 2012-09-06, 16:12
Re: [Cosmos-dev] Out of memory in identity mapper?
Hemanth Yamijala 2012-09-07, 03:49
Harsh,
Could IsolationRunner be used here? I've put up a patch for HADOOP-8765; after applying it, IsolationRunner works for me. Maybe we could use it to re-run the failing map task and debug it.

Thanks
hemanth

On Thu, Sep 6, 2012 at 9:42 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> Protobuf involvement makes me more suspicious that this is possibly a
> corruption or an issue with serialization as well. Perhaps if you can
> share some stack traces, people can help better. If it is reliably
> reproducible, then I'd also check the count of records read before
> this occurs, and see if the stack traces are always the same.
>
> Serialization formats such as protobufs allocate objects based on read
> sizes (for example, a string size may be read first before the
> string's bytes are read, and upon size read, such a length is
> pre-allocated for the bytes to be read into), and in cases of corrupt
> data or bugs in the deserialization code, it is quite easy for it to
> make a large alloc request due to a badly read value. It's one
> possibility.
>
> Is the input compressed too, btw? Can you seek out the input file the
> specific map fails on, and try to read it in an isolated manner to
> validate it? Or do all maps seem to fail?
>
> On Thu, Sep 6, 2012 at 9:01 PM, SEBASTIAN ORTEGA TORRES <[EMAIL PROTECTED]>
> wrote:
> > Input files are small fixed-size protobuf records and yes, it is
> > reproducible (but it takes some time).
> > In this case I cannot use combiners since I need to process all the
> > elements with the same key all together.
> >
> > Thanks for the prompt response
> >
> > --
> > Sebastián Ortega Torres
> > Product Development & Innovation / Telefónica Digital
> > C/ Don Ramón de la Cruz 82-84
> > Madrid 28006
> >
> > On 06/09/2012, at 17:13, Harsh J wrote:
> >
> > I can imagine a huge record size possibly causing this. Is this
> > reliably reproducible? Do you also have combiners enabled, which may
> > run the reducer-logic on the map-side itself?
> >
> > On Thu, Sep 6, 2012 at 8:20 PM, JOAQUIN GUANTER GONZALBEZ <[EMAIL PROTECTED]>
> > wrote:
> >
> > Hello hadoopers!
> >
> > In a reduce-only Hadoop job input files are handled by the identity
> > mapper and sent to the reducers without modification. In one of my
> > jobs I was surprised to see the job failing in the map phase with
> > "Out of memory error" and "GC overhead limit exceeded".
> >
> > In my understanding, a memory leak on the identity mapper is out of
> > the question.
> >
> > What can be the cause of such an error?
> >
> > Thanks,
> > Ximo.
> >
> > P.S. The logs show no stack trace other than the messages I mentioned
> > before.
> >
> > ________________________________
> >
> > This message is intended exclusively for its addressee. We only send and
> > receive email on the basis of the terms set out at:
> > http://www.tid.es/ES/PAGINAS/disclaimer.aspx
> >
> > --
> > Harsh J
> >
> > _______________________________________________
> > Cosmos-dev mailing list
> > [EMAIL PROTECTED]
> > https://listas.tid.es/mailman/listinfo/cosmos-dev
>
> --
> Harsh J
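
[Editor's illustration, not part of the thread.] The failure mode Harsh describes, where a badly read length prefix triggers a huge allocation, can be sketched as below, along with his suggestion to read the input in isolation and count records until the failure. This is a simplified stand-in under stated assumptions: it uses a 4-byte big-endian length prefix rather than protobuf's actual varint framing, it does not use the protobuf library, and the class name `RecordValidator` and the bound `MAX_RECORD_BYTES` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class RecordValidator {
    // Hypothetical sanity bound; the thread says the real records are small and fixed-size.
    static final int MAX_RECORD_BYTES = 1 << 20;

    /**
     * Reads length-prefixed records until end of input and returns how many were read.
     * Throws IOException naming the record index if a length prefix is implausible,
     * instead of blindly allocating a buffer of that size.
     */
    public static long countRecords(InputStream in, int maxRecordBytes) throws IOException {
        DataInputStream data = new DataInputStream(in);
        long count = 0;
        while (true) {
            int length;
            try {
                length = data.readInt(); // assumed framing: 4-byte big-endian length
            } catch (EOFException eof) {
                // End of input. Note: a truncated length prefix also ends up here.
                return count;
            }
            if (length < 0 || length > maxRecordBytes) {
                // Without this check, "new byte[length]" below is where a corrupt
                // prefix would turn into a very large allocation request.
                throw new IOException(
                    "Implausible record length " + length + " at record " + count);
            }
            byte[] payload = new byte[length];
            data.readFully(payload); // throws EOFException if the payload is truncated
            count++;
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(3);
        out.write(new byte[] {1, 2, 3});
        out.writeInt(2);
        out.write(new byte[] {4, 5});
        out.writeInt(0x7FFFFFF0); // simulated corruption: a nearly 2 GB "length"

        try {
            countRecords(new ByteArrayInputStream(bytes.toByteArray()), MAX_RECORD_BYTES);
        } catch (IOException e) {
            // Reports the index of the first bad record rather than running out of memory.
            System.out.println(e.getMessage());
        }
    }
}
```

Running a check of this kind against the one input file the failing map task reads would show whether the failure always happens at the same record, which is the count Harsh asks for.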
SEBASTIAN ORTEGA TORRES 2012-09-06, 16:22