Re: Mapreduce outputs directly to SortedKeyValueFile
Hi Jeremy,

There's no OutputFormat in the avro-mapred package yet that writes a
SortedKeyValueFile directly, but you can certainly write one, and it
could be added to Avro. I don't see why you couldn't write the sorted
file right from your job (from a reducer, I'm assuming, since reducer
input arrives sorted by key). You just need to extend OutputFormat and
have its getRecordWriter use a SortedKeyValueFile writer instead of the
plain DataFile writer that AvroOutputFormat's getRecordWriter uses.
Please do file an AVRO JIRA for this, as it's a gap in what Avro
provides that needs to be filled.
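
A rough sketch of what such an OutputFormat might look like, untested
against a real cluster. It assumes the newer org.apache.hadoop.mapreduce
API (AvroOutputFormat itself uses the older mapred API) and that the
driver has set the output schemas with AvroJob.setOutputKeySchema and
AvroJob.setOutputValueSchema. The class names
SortedKeyValueFileOutputFormat and SortedWriter are made up for
illustration; nothing by those names ships with Avro.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.hadoop.file.SortedKeyValueFile;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapred.AvroValue;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/** Hypothetical OutputFormat: each task writes one SortedKeyValueFile directory. */
public class SortedKeyValueFileOutputFormat<K, V>
    extends FileOutputFormat<AvroKey<K>, AvroValue<V>> {

  @Override
  public RecordWriter<AvroKey<K>, AvroValue<V>> getRecordWriter(TaskAttemptContext context)
      throws IOException {
    Configuration conf = context.getConfiguration();
    // A SortedKeyValueFile is a directory holding a data file and an index file,
    // so the task's work path (e.g. part-r-00000) is used as that directory.
    Path dir = getDefaultWorkFile(context, "");
    return new SortedWriter<K, V>(
        AvroJob.getOutputKeySchema(conf), AvroJob.getOutputValueSchema(conf), conf, dir);
  }

  /** Hypothetical RecordWriter that delegates to SortedKeyValueFile.Writer. */
  public static class SortedWriter<K, V> extends RecordWriter<AvroKey<K>, AvroValue<V>> {
    private final SortedKeyValueFile.Writer<K, V> writer;

    public SortedWriter(Schema keySchema, Schema valueSchema, Configuration conf, Path dir)
        throws IOException {
      SortedKeyValueFile.Writer.Options options = new SortedKeyValueFile.Writer.Options()
          .withKeySchema(keySchema)
          .withValueSchema(valueSchema)
          .withConfiguration(conf)
          .withPath(dir);
      writer = new SortedKeyValueFile.Writer<K, V>(options);
    }

    @Override
    public void write(AvroKey<K> key, AvroValue<V> value) throws IOException {
      // Keys must be appended in sorted order; the writer rejects out-of-order keys.
      writer.append(key.datum(), value.datum());
    }

    @Override
    public void close(TaskAttemptContext context) throws IOException {
      writer.close();
    }
  }
}
```

In the driver this would be wired in with
job.setOutputFormatClass(SortedKeyValueFileOutputFormat.class). Since
the writer expects keys in sorted order, it only makes sense on the
reduce side, and the job's sort order has to agree with the ordering
defined by the key schema.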

On Sun, Jul 22, 2012 at 12:50 AM, Jeremy Lewi <[EMAIL PROTECTED]> wrote:
> Hi avro-users,
> Is it possible for my MapReduce job to write directly to a
> SortedKeyValueFile? Or must I first output to a regular Avro file and
> then build the index?
> Thanks
> J

Harsh J