Avro, mail # user - Mapreduce outputs directly to SortedKeyValueFile


Re: Mapreduce outputs directly to SortedKeyValueFile
Harsh J 2012-07-21, 19:30
Hi Jeremy,

There's no OutputFormat in the avro-mapred package yet that writes a
SortedKeyValueFile directly, but you could certainly write one (and
contribute it to Avro). I don't see why you couldn't write the sorted
file straight from your job (from a reducer, I'm assuming, since its
input already arrives sorted by key). You only need to extend the
OutputFormat and have its getRecordWriter use a SortedKeyValueFile
writer instead of the plain DataFile writer that AvroOutputFormat's
getRecordWriter uses. Please do file an AVRO JIRA for this, as it's a
gap in what Avro provides that needs to be filled.
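
To make the idea concrete, here is a rough, self-contained sketch of
why a reducer can produce the data and its index in one pass, with no
separate "build the index" step. Everything in it is a hypothetical
stand-in: KeyValueWriter plays the role of a record writer,
IndexedSortedWriter plays the role of a SortedKeyValueFile writer, and
the in-memory list and map stand in for the data and index files. It
deliberately uses no Hadoop or Avro classes, so treat it as an
illustration of the shape rather than the real API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in: models a sorted key/value writer with a sparse index.
// It does NOT use Hadoop or Avro classes; all names here are illustrative.
class SortedKeyValueSketch {

    /** Minimal analogue of a record writer (hypothetical interface). */
    interface KeyValueWriter {
        void write(String key, String value);
        void close();
    }

    /** Appends records in key order and indexes every Nth record as it goes. */
    static class IndexedSortedWriter implements KeyValueWriter {
        private final int indexInterval;
        private final List<String[]> data = new ArrayList<>();         // stands in for the data file
        private final TreeMap<String, Integer> index = new TreeMap<>(); // key -> record position
        private String previousKey = null;
        private boolean closed = false;

        IndexedSortedWriter(int indexInterval) {
            if (indexInterval < 1) {
                throw new IllegalArgumentException("indexInterval must be >= 1");
            }
            this.indexInterval = indexInterval;
        }

        @Override
        public void write(String key, String value) {
            if (closed) {
                throw new IllegalStateException("writer is closed");
            }
            // A reducer receives keys in sorted order; anything else is rejected.
            if (previousKey != null && key.compareTo(previousKey) < 0) {
                throw new IllegalArgumentException(
                        "keys must arrive in sorted order: " + key + " after " + previousKey);
            }
            // Index every indexInterval-th record; keep the first position for a repeated key.
            if (data.size() % indexInterval == 0) {
                index.putIfAbsent(key, data.size());
            }
            data.add(new String[] {key, value});
            previousKey = key;
        }

        @Override
        public void close() {
            closed = true;
        }

        /** Looks a key up by jumping to the nearest indexed key, then scanning forward. */
        String get(String key) {
            Map.Entry<String, Integer> start = index.floorEntry(key);
            if (start == null) {
                return null; // key sorts before everything in the file
            }
            for (int i = start.getValue(); i < data.size(); i++) {
                int cmp = data.get(i)[0].compareTo(key);
                if (cmp == 0) return data.get(i)[1];
                if (cmp > 0) return null; // passed the spot where the key would be
            }
            return null;
        }

        Map<String, Integer> index() {
            return index;
        }
    }

    public static void main(String[] args) {
        IndexedSortedWriter writer = new IndexedSortedWriter(2);
        String[][] reducerOutput = {
            {"apple", "1"}, {"banana", "2"}, {"cherry", "3"}, {"date", "4"}, {"fig", "5"}
        };
        for (String[] kv : reducerOutput) {
            writer.write(kv[0], kv[1]);
        }
        writer.close();
        System.out.println(writer.index());
        System.out.println(writer.get("date"));
    }
}
```

In an actual job, the custom OutputFormat's getRecordWriter would
hand back a writer of this shape that delegates to a
SortedKeyValueFile writer. The sorted-order check is the reason this
belongs in a reducer rather than a mapper.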

On Sun, Jul 22, 2012 at 12:50 AM, Jeremy Lewi <[EMAIL PROTECTED]> wrote:
> Hi avro-users,
>
> Is it possible for my MapReduce job to write directly to a
> SortedKeyValueFile? Or must I first output to a regular Avro file and
> then build the index?
>
> Thanks
> J

--
Harsh J