HBase >> mail # user >> Explosion in datasize using HBase as a MR sink

Re: Explosion in datasize using HBase as a MR sink
On Tue, Jun 4, 2013 at 9:58 PM, Rob Verkuylen <[EMAIL PROTECTED]> wrote:

> Finally fixed this, my code was at fault.
> Protobufs require a builder object which was a (non static) protected
> object in an abstract class all parsers extend. The mapper calls a parser
> factory depending on the input record. Because we designed the parser
> instances as singletons, the builder object in the abstract class got
> reused and all data got appended to the same builder. Doh! This only shows
> up in a job, not in single tests. Ah well, I've learned a lot  :)
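
A minimal sketch of the failure mode described above, for anyone who hits the same thing. It does not use the real protobuf library: RecordBuilder is a hypothetical stand-in for a generated builder with one repeated field, and the parser class names are made up as well. What it shares with a real protobuf builder is that build() does not reset the builder, so anything added earlier is still there on the next call.

```java
import java.util.ArrayList;
import java.util.List;

public class BuilderReuseDemo {

    // Hypothetical stand-in for a generated protobuf builder with one repeated field.
    // As with a real protobuf builder, build() leaves the accumulated state in place.
    static final class RecordBuilder {
        private final List<String> values = new ArrayList<>();

        RecordBuilder addValue(String v) { values.add(v); return this; }

        RecordBuilder clear() { values.clear(); return this; }

        List<String> build() { return new ArrayList<>(values); }
    }

    // Buggy: the builder is an instance field and the parser is a long-lived singleton,
    // so every record is appended on top of all the previous ones.
    static final class SharedBuilderParser {
        protected final RecordBuilder builder = new RecordBuilder();

        List<String> parse(String input) {
            return builder.addValue(input).build();
        }
    }

    // Fixed: a fresh builder per record (calling builder.clear() first would also work).
    static final class FreshBuilderParser {
        List<String> parse(String input) {
            return new RecordBuilder().addValue(input).build();
        }
    }

    // Number of values in each record when one buggy parser instance handles n inputs.
    static List<Integer> sharedSizes(int n) {
        SharedBuilderParser parser = new SharedBuilderParser();
        List<Integer> sizes = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            sizes.add(parser.parse("row-" + i).size());
        }
        return sizes;
    }

    // Same measurement for the fixed parser.
    static List<Integer> freshSizes(int n) {
        FreshBuilderParser parser = new FreshBuilderParser();
        List<Integer> sizes = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            sizes.add(parser.parse("row-" + i).size());
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println("shared builder: " + sharedSizes(4)); // [1, 2, 3, 4]
        System.out.println("fresh builder:  " + freshSizes(4));  // [1, 1, 1, 1]
    }
}
```

With the shared builder, the n-th record written carries all n inputs seen so far, so total output grows roughly quadratically with the number of records a mapper handles. That would also fit the bug staying hidden in single-record tests, where the builder is only ever used once.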

Thanks for updating the list, Rob.

Yours is a classic, except it is the first time I've heard of someone
protobufing it. Usually it is a reuse of a Hadoop Writable instance