HBase >> mail # user >> Explosion in datasize using HBase as a MR sink


Re: Explosion in datasize using HBase as a MR sink
On Tue, Jun 4, 2013 at 9:58 PM, Rob Verkuylen <[EMAIL PROTECTED]> wrote:

> Finally fixed this, my code was at fault.
>
> Protobufs require a builder object which was a (non static) protected
> object in an abstract class all parsers extend. The mapper calls a parser
> factory depending on the input record. Because we designed the parser
> instances as singletons, the builder object in the abstract class got
> reused and all data got appended to the same builder. Doh! This only shows
> up in a job, not in single tests. Ah well, I've learned a lot  :)
>
>
Thanks for updating the list Rob.

Yours is a classic, except it is the first time I've heard of someone
protobufing it. Usually it is a reused Hadoop Writable instance
accumulating data.

St.Ack
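
For anyone finding this thread later, the failure mode Rob describes can be sketched in plain Java. This is only an illustration under assumptions, not Rob's code: RecordBuilder is a hypothetical stand-in for a generated protobuf builder with one repeated field, and LeakyParser and SafeParser are made-up names for a long-lived parser that keeps its builder as an instance field versus one that creates a builder per record.

```java
import java.util.ArrayList;
import java.util.List;

public class BuilderReuseDemo {

    /** Hypothetical stand-in for a protobuf builder with one repeated field. */
    static final class RecordBuilder {
        private final List<String> fields = new ArrayList<>();

        RecordBuilder addField(String value) {
            fields.add(value);
            return this;
        }

        /** Returns a snapshot; the builder itself keeps its contents. */
        List<String> build() {
            return new ArrayList<>(fields);
        }
    }

    /** Buggy: the builder is an instance field of a long-lived parser. */
    static final class LeakyParser {
        static final LeakyParser INSTANCE = new LeakyParser();

        protected final RecordBuilder builder = new RecordBuilder();

        List<String> parse(String line) {
            for (String token : line.split(",")) {
                builder.addField(token); // appends to whatever earlier records left behind
            }
            return builder.build();
        }
    }

    /** Fixed: each record gets its own builder. */
    static final class SafeParser {
        static final SafeParser INSTANCE = new SafeParser();

        List<String> parse(String line) {
            RecordBuilder builder = new RecordBuilder();
            for (String token : line.split(",")) {
                builder.addField(token);
            }
            return builder.build();
        }
    }

    public static void main(String[] args) {
        String[] records = {"a,b", "c,d", "e,f"};
        for (String record : records) {
            int leaky = LeakyParser.INSTANCE.parse(record).size();
            int safe = SafeParser.INSTANCE.parse(record).size();
            // Leaky output grows with every record; safe output stays at 2 fields.
            System.out.println(leaky + " vs " + safe);
        }
    }
}
```

A single-record unit test passes against either parser, which matches Rob's observation that the problem only showed up in a full job. With real protobuf builders, creating a new builder per record or calling the builder's clear() before each record should avoid the accumulation.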