Avro >> mail # user >> improve performance of avro map reduce jobs


RE: improve performance of avro map reduce jobs

Let me put the question another way. Companies like Twitter use Protocol Buffers as their serialization tool, and it seems to have better performance. Is there anything compelling that Avro can do and Protocol Buffers cannot? Thanks.
Ey-Chih

From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: improve performance of avro map reduce jobs
Date: Fri, 24 Jun 2011 16:55:58 -0700
Our Map/Reduce jobs are all based on Avro, and we would like to improve their performance. The objects collected in our mappers and reducers are mainly of type GenericData.Record. Currently, most of the jobs are CPU-bound rather than I/O-bound. Can anybody suggest ways to improve the performance of these jobs? Thanks a lot.
Ey-Chih Chow              