

Re: Mapreduce outputs directly to SortedKeyValueFile
Hi Jeremy,

There's no OutputFormat in the avro-mapred package yet that writes a
SortedKeyValueFile directly, but one can certainly be written by you
or added to Avro. I don't see why you can't write the sorted file
right from your job (from a reducer, I'm assuming, since that is where
the keys arrive sorted). You merely need to extend the OutputFormat
and use a SortedKeyValueFile writer instead of a simple DataFile
writer (which is what AvroOutputFormat's getRecordWriter uses). Please
do file an AVRO JIRA for this, as it's a hole in what Avro provides
that needs to be filled.
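
To show the shape of that change without pulling in Hadoop or Avro,
here is a minimal, dependency-free sketch. Every type in it
(KeyValueSink, SortedSink, DelegatingRecordWriter, SortedOutputSketch)
is a hypothetical stand-in, not part of either library: SortedSink
only imitates the role of a SortedKeyValueFile writer (records plus a
sparse key index, keys accepted in sorted order only), and
DelegatingRecordWriter imitates the RecordWriter that getRecordWriter
would return.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for whatever writer a RecordWriter delegates to. */
interface KeyValueSink<K, V> {
  void append(K key, V value);
  void close();
}

/**
 * Hypothetical stand-in for a sorted key/value file writer: "data" holds
 * every record, "index" holds every Nth key with its record position.
 */
final class SortedSink<K extends Comparable<K>, V> implements KeyValueSink<K, V> {
  final List<String> data = new ArrayList<>();
  final List<String> index = new ArrayList<>();
  private final int indexInterval;
  private K previousKey;
  private boolean closed;

  SortedSink(int indexInterval) {
    if (indexInterval < 1) {
      throw new IllegalArgumentException("indexInterval must be >= 1");
    }
    this.indexInterval = indexInterval;
  }

  @Override
  public void append(K key, V value) {
    if (closed) {
      throw new IllegalStateException("sink is closed");
    }
    // The index is only usable if keys arrive in order, which is why the
    // reducer (whose input is already sorted by key) is the natural caller.
    if (previousKey != null && key.compareTo(previousKey) < 0) {
      throw new IllegalArgumentException("key " + key + " arrived after " + previousKey);
    }
    if (data.size() % indexInterval == 0) {
      index.add(key + "@" + data.size());
    }
    data.add(key + "=" + value);
    previousKey = key;
  }

  @Override
  public void close() {
    closed = true;
  }
}

/** Stand-in for the RecordWriter role: it forwards each pair to the sink. */
final class DelegatingRecordWriter<K, V> {
  private final KeyValueSink<K, V> sink;

  DelegatingRecordWriter(KeyValueSink<K, V> sink) {
    this.sink = sink;
  }

  void write(K key, V value) {
    sink.append(key, value);
  }

  void close() {
    sink.close();
  }
}

final class SortedOutputSketch {
  public static void main(String[] args) {
    SortedSink<String, Integer> sink = new SortedSink<>(2);
    DelegatingRecordWriter<String, Integer> writer = new DelegatingRecordWriter<>(sink);
    writer.write("apple", 3);
    writer.write("banana", 1);
    writer.write("cherry", 7);
    writer.close();
    System.out.println(sink.data);   // [apple=3, banana=1, cherry=7]
    System.out.println(sink.index);  // [apple@0, cherry@2]
  }
}
```

In a real job the delegating class would extend Hadoop's RecordWriter
and the sink would be Avro's SortedKeyValueFile writer, constructed
inside your OutputFormat's getRecordWriter. The exact constructor
options and schema lookups are left out here on purpose, so please
check them against the Avro javadoc.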

On Sun, Jul 22, 2012 at 12:50 AM, Jeremy Lewi <[EMAIL PROTECTED]> wrote:
> Hi avro-users,
>
> Is it possible for my mapreduce job to write directly to a SortedKeyValue
> file? Or must I first output to a regular avro file and then build the
> index?
>
> Thanks
> J

--
Harsh J